11
votes

We have to save the day and the month of a yearly reoccurring event. (Think of it as the day of they year when the invoice has to be sent to the client.)

We will have to do some computation with this information. E.g if the customer is paying twice per year. Then we do not only have to check if this day is today but also if this day is in 6 month.

We have now several options: a) We save the information in a DATE field but ignore the year. b) We create two int fields in the database date_day and date_month. c) We create a varchar field and then do a regex and split e.g. 31.12 up every time we do some calculation.

We did several tests and found out that c) is definitely too slow. But we still have the option a) and b). First we wanted to go for b) but after some tests we tend more to a).

Is there a good technical reason that would really make one option a lot better than the other?

(We are using MySQL at the moment. If this is of importance.)

8

8 Answers

10
votes

I'd choose b), because it more accurately reflects the meaning of the data. Having a data structure where some parts are supposed to be "ignored" is problematic. What if people just do a simple date comparison, assuming the year is always the same, but someone used a different placeholder year, assuming the year doesn't matter anyway?

Always use data structures that reflect your intentions.

That c) is bad is, I believe, beyond reasonable discussion :-). And I'm not thinking of performance reasons...

2
votes

I would use the date field and even though not needed, would still save the year as well. Just strip it out when you have to print it / use it. There are a few reasons:

  1. You might find out that a later point the customer does want you to save the date. In that case you don't have to do any changes to your database structure.
  2. You can use the SQL date functions to compare dates, if needed. If you have day and month in separate fields, you need a lot more code to e.g. calculate the difference between two dates (leap years etc).

The reasons give for choosing b) can also be easily solved with those SQL date functions. You can easily pick events in a certain month, for instance, in a single query.

1
votes

I'd choose b), because it would make queries much easier: you'll be able to restore all events in a range (December, specific day, day range) in a very easy manner. If you choose a) - don't forget to set the year to a specific one for comparison and extraction reasons.

1
votes

I'd store the date of the first event, and then an interval for each subsequent event, kinda like most calendar apps. In this case, you'd structure it like this:

 first_event | interval | interval_unit
-------------+--------------------------
  2009-01-01 |        6 | 'month'
  2009-02-01 |        1 | 'year'

Unfortunately, MySQL doesn't have an INTERVAL datatype, so two columns and a bit of post-processing will be necessary, but I think it's the most flexible way to approach the problem.

1
votes

I also faced the same problem ,in my case i need to retrieve data based on a specific day in a month which is repeating yearly .I used 'DATE' and query like this

SELECT * FROM test_table WHERE MONTH(date) = 1 AND DAY(date) = 14

result like this

enter image description here

Advantage

  1. I can use the capabilities of MySQL.
  2. Decrease client side calculations.
  3. Can use date_field for other calculations

My suggestion is to use this way

this might be helpful to someone

0
votes

I'd also side with b), but use TINYINT for month (0 to 255) and SMALLINT (-32,768 to 32,767) for year to save a bit of space.

0
votes

Choose ONE int field like 1601 for January, 1st.

0
votes

I'd go for the day-of-year number (e.g. a single number from 0 to 365, and then add that to the 1st of january of the particular year you are interested in.

If you don't want to do the extra math of the above solution, then use two fields one for month and one for day (but make sure you update both when you need to!).

Remember, you have to deal with leap years, so using a date field is a bad idea since you'd have to store dates with two years - one a leap year and one not - very complicated!