Wednesday, July 7, 2010

The datediff function

Next week we shall roll-out an application which provided some very interesting insights for us; it will be the core of the renewed application portfolio for one of our customers, and after about one year of hard work we all think everyone is going to be very satisfied.

The problem is that the database we are going to migrate contains very dirty data, e.g. it contains data about events that last thousands of days instead of the typical maximum, which is assumed to be five, so we're driving a hard bargain to force the customer to fix all (at least, most of) the anomalies.

A practical way to point out the errors to the customer is the datediff function:

SELECT event_id, start_date, end_date
FROM events
WHERE datediff(d, start_date, end_date) + 1 > 5

The +1 is needed as passing the third parameter equal to the second one, as in events that only last one day, would yield a 0, and I think it is clearer to leave the lower limit (5, in this case) clearly visible. Of course also

WHERE datediff(d, start_date, end_date) > 4

or

WHERE datediff(d, start_date, end_date) >= 5

or even

WHERE datediff(d, start_date, end_date) > 5 - 1

would work, but IMHO they are not as clear as the first one.

No comments: