My train be be leaving at... interesting times

stewieatb

Puzzle Time!

Just got this email from thetrainline.com confirming my trip for the weekend. the actual departure time for the train is 16:38, but the challenge is, what exactly did their automated system to write these messages do to get such a number? 0.693055556? 16/38 = 0.42105, 22/38 = 0.5789. i am stumped.

brightemo

0.693055556 is the fraction of the day that 16:38 is at.

24 x 0.693055556 = 16.633333344

0.633333344 * 60 = 38.00000064

too_many_usernames

Maybe it's the fraction through a day, if a day was 1.00 units long?

Nahh.....

EDIT: Curses! That other post wasn't there when I started!

paulc

16:39 is (16*60 + 39) = 999 minutes after midnight. There are 24*60 = 1440 minutes in a day. 999/1440 =0.6937 which is pretty close. Probably the slight difference is due to rounding errors in someone's custom-written Enterprise Division Library.

stewieatb

Re: My train may be leaving at... interesting times

@paulc said:

16:39 is (1660 + 39) = 999 minutes after midnight. There are 2460 = 1440 minutes in a day. 999/1440 =0.6937 which is pretty close. Probably the slight difference is due to rounding errors in someone's custom-written Enterprise Division Library.

departure time is 16:3[b]8[/b], 998/1440 = 0.693055555 recurring. No enterpriseyness today. Thanks for the answer, im not rightly awake these days. Now i see how this happened; it seems they're using a backend time value to increase the sorting speed, and a frontend one to tell the user. Makes sense, will actually improve speed. Either somebody wrote an SQL query pulling the wrong time value into the email script, or maybe a stored procedure got changed when nobody was looking. Not clever, but barely a wtf in retrospect. Still, a little odd at first sight.

belgariontheking

@stewieatb said:

Now i see how this happened; it seems they're using a backend time value to increase the sorting speed, and a frontend one to tell the user. Makes sense, will actually improve speed.

This doesn't make sense to me. I've never designed an RDBMS, but I would put an integer (or long) value into a date column, then display it as a date/time when SELECTed. Thus, there would be no reason for a developer to do something insane like this.

morbiuswilters

@belgariontheking said:

@stewieatb said:
Now i see how this happened; it seems they're using a backend time value to increase the sorting speed, and a frontend one to tell the user. Makes sense, will actually improve speed.
This doesn't make sense to me. I've never designed an RDBMS, but I would put an integer (or long) value into a date column, then display it as a date/time when SELECTed. Thus, there would be no reason for a developer to do something insane like this.

It doesn't make any sense to me either. I usually store times as unix timestamps and I just don't see how it is possible to ever end up with a decimal representation of time like this.

stewieatb

@belgariontheking said:

@stewieatb said:
Now i see how this happened; it seems they're using a backend time value to increase the sorting speed, and a frontend one to tell the user. Makes sense, will actually improve speed.
This doesn't make sense to me. I've never designed an RDBMS, but I would put an integer (or long) value into a date column, then display it as a date/time when SELECTed. Thus, there would be no reason for a developer to do something insane like this.

Sorry, i wasn't very clear there, but i think you got the idea. What i meant was, have a DepTime_1 column of decimal/float type, then a DepTime_2 column of time in hh:mm. If the records need to be sorted so that the emails for the day are prioritised and grouped by train, then sort by the DepTime_1 column, but send the value in DepTime_2 to the user in the email. I think you mean to have every time as a four-digit integer, then insert a colon in the middle by splicing and concetenation in the email sending script, which is a much better idea than what i think was done here. I guessed that hh:mm time values might be slow to sort, but in retrospect, this is more true of dates in a non-YYYYMMDD type format. hh:mm should (I think) sort just as fast as any other 4-digit integer. Maybe this is an assumption the original designer made without thinking about it, or asking anyone, and no-one has ever bothered to fix it. Or maybe it's just too entrenched in the architecture to fix.

An alternative explanation is that this emailing system was added onto the original booking system and is pulling its times out of a table for something else, say, ticket printing, or billing, which due to some ungodly design pattern uses a fraction of the day instead of hh:mm time. I would love to see the code that turns decimals into hh:mm time. But this still leaves the question of why the original architects chose this decimal time for their system.

dtech

@stewieatb said:

@belgariontheking said:
@stewieatb said:
Now i see how this happened; it seems they're using a backend time value to increase the sorting speed, and a frontend one to tell the user. Makes sense, will actually improve speed.
This doesn't make sense to me. I've never designed an RDBMS, but I would put an integer (or long) value into a date column, then display it as a date/time when SELECTed. Thus, there would be no reason for a developer to do something insane like this.
Sorry, i wasn't very clear there, but i think you got the idea. ...snip... But this still leaves the question of why the original architects chose this decimal time for their system.

I cant make much sense out of it, but what's wrong with storing the info as a DATETIME and:

..DATE_FORMAT(departure,'%k:%i') AS depTime...

...ORDER BY departure DESC..

@stewieatb said:

hh:mm should (I think) sort just as fast as any other 4-digit integer.

It won't. The difference probably won't be noticable, but a varchar sorts slower than an int.

belgariontheking

@stewieatb said:

I think you mean to have every time as a four-digit integer, then insert a colon in the middle by splicing and concetenation in the email sending script

What I was going for was the UNIX timestamp that morb mentioned.

My bet is that the rdbms is implemented in this (a sane) way, but the developer didn't know that and developed the db and app with this insanity anyway. There's nothing quite like when people think they're smarter than the database.

Qwerty

Just a formatting error. It is a WTF that it got into a printout (don't they check these things?) but the mathematics isn't unusual.

I first noticed that Excel's internal representation of date-times uses 1.00 as a day. This isn't just a Microsoft thing though. Oracle does it too. If you cast a (date-time) 16:38 into a decimal representation then 0.693... is the result. It makes date arithmetic very easy, for example: how many days between now and when your train leaves? "1.5 days" might be a good format in some displays.

You can then create even more subtle errors - if you subtract two date-times and format it with the "obvious" "dd" format, then you might find 43 days is displayed as "12" (the 12th of February is the 43rd day of the year 1900.)

bjolling

@dtech said:

<snip>
I cant make much sense out of it, but what's wrong with storing the info as a DATETIME and:
..DATE_FORMAT(departure,'%k:%i') AS depTime...

...ORDER BY departure DESC..

I definitely agree to store the info as a DATETIME

If this query only exists to show the results on a web page I could agree with the "DATE_FORMAT( )" function call in your SELECT, but generally I advise to store and retrieve dates using parameter binding. After the select you will end up with the correct DateTime object without the need for the manual "date as a string" parsing that is a big source of WTF's if I can judge by the submissions in the Sidebar.

What I'm saying: SQL Statements are about the data. The UI is about formatting.

belgariontheking

@Qwerty said:

Just a formatting error. It is a WTF that it got into a printout (don't they check these things?) but the mathematics isn't unusual.

I first noticed that Excel's internal representation of date-times uses 1.00 as a day.

I'm resisting the urge to Godwin myself, so I'll just say

<sarcasm>"Excel did it, it must be okay!"</sarcasm>

PSWorx

@morbiuswilters said:

It doesn't make any sense to me either. I usually store times as unix timestamps and I just don't see how it is possible to ever end up with a decimal representation of time like this.

I think Unix timestamps or DATETIMEs are a a bit bloaty here, since you just have a generic time table. You're not interested in a concrete date, nor will you probably need a higher resolution than minutes. So, I'd guess, the most efficient way would be to store the minutes after midnight.

Of course that doesn't make their design any less of a WTF... but hey, if you ever need a train scheduled in microseconds, that wouldn't be much of a problem at least...

moogal

@PSWorx said:

I think Unix timestamps or DATETIMEs are a a bit bloaty here, since you just have a generic time table. You're not interested in a concrete date, nor will you probably need a higher resolution than minutes. So, I'd guess, the most efficient way would be to store the minutes after midnight.
Of course that doesn't make their design any less of a WTF... but hey, if you ever need a train scheduled in microseconds, that wouldn't be much of a problem at least...

Except that you are interested in a concrete date - it's a booking for the 16:38 train on the 16th of January, 2009, rather than just a list of what trains run at what time daily.

PSWorx

@moogal said:

@PSWorx said:
I think Unix timestamps or DATETIMEs are a a bit bloaty here, since you just have a generic time table. You're not interested in a concrete date, nor will you probably need a higher resolution than minutes. So, I'd guess, the most efficient way would be to store the minutes after midnight.
Of course that doesn't make their design any less of a WTF... but hey, if you ever need a train scheduled in microseconds, that wouldn't be much of a problem at least...
Except that you are interested in a concrete date - it's a booking for the 16:38 train on the 16th of January, 2009, rather than just a list of what trains run at what time daily.

But would you actually store the departure time seperately for each booking if you already know that there probably will be several hundred bookings but only a few dozen different departure times (one for each train)?

I'm not so insane that I'd make a seperate table for the departure times to celebrate premature optimisation. But I'd guess that in a sane system, you already had a table with records for each train of they day (containing line number, platform, number of cars, etc... and also the time of day the train would arrive). Then in the bookings table, you'd just need a reference to the train you've booked along with the day. So then there would be no need to store both together.

bjolling

@PSWorx said:

You're not interested in a concrete date, nor will you probably need a higher resolution than minutes. So, I'd guess, the most efficient way would be to store the minutes after midnight.

Disk space is cheap. In your solution you'd need 2 bytes but a smalldatetime uses 4 bytes and a regular datetime uses 8 bytes.

As you said, we don't need the resolution of a datetime, so I choose a smalldatetime. Your solution saves 2 bytes per arrival and departure time compared to the smalldatetime. Suppose there are 1 million arrival and departure times in the system, you save 2 megabytes.

It's not worth the hassle

dsckeld

@PSWorx said:

But would you actually store the departure time seperately for each booking if you already know that there probably will be several hundred bookings but only a few dozen different departure times (one for each train)?

I don't see anything in the OP that suggests anything out of the ordinary regarding the database design. There is nothing to reveal that they didn't join a train/departure table on some departure ID somewhere.

How is that proverb about assumptions again?

moogal

@PSWorx said:

I'm not so insane that I'd make a seperate table for the departure times to celebrate premature optimisation. But I'd guess that in a sane system, you already had a table with records for each train of they day (containing line number, platform, number of cars, etc... and also the time of day the train would arrive). Then in the bookings table, you'd just need a reference to the train you've booked along with the day. So then there would be no need to store both together.

Quite - although the Trainline isn't necessarily a "sane" system - this is the site where looking at trains on a Wednesday afternoon, clicking "earlier" occasionally jumps you to Monday morning... (despite there being a daily half-hourly service)

PSWorx

@dsckeld said:

I don't see anything in the OP that suggests anything out of the ordinary regarding the database design. There is nothing to reveal that they didn't join a train/departure table on some departure ID somewhere.

How is that proverb about assumptions again?

The system of the OP might actually even work that way - the time is stored in a twisted way, but it actually does encode a 24 hour range, not a unique timestamp. I was more referring to the suggestions of the other posters, not the original system.

@moogal said:

Quite - although the Trainline isn't necessarily a "sane" system - this is the site where looking at trains on a Wednesday afternoon, clicking "earlier" occasionally jumps you to Monday morning... (despite there being a daily half-hourly service)

Then again... yeah, disregard what I just said...

PSWorx

@bjolling said:

As you said, we don't need the resolution of a datetime, so I choose a smalldatetime. Your solution saves 2 bytes per arrival and departure time compared to the smalldatetime. Suppose there are 1 million arrival and departure times in the system, you save 2 megabytes.
It's not worth the hassle

Well, yeah, both approaches are probably equally good and equally bad. But I can't see where you'd have *more* hassle with an integer than with a smalldatetime (provided you just stay within the boundaries of a single day)

Any way, you're right, I'm probably counting beans.

LoztInSpace

Re: My stock may be quoted at... interesting times

Here's another interesting time I conincidentally just noticed and wondering whether to post. In the very likely event the picture does not come out, it is a market price chart as at "16:04 am" courtesy of buyandhold.com

The other WTF is the ridiculous month/day/year date format of course.

@LoztInSpace said:

The other WTF is the ridiculous month/day/year date format of course.

I don't know if your just trolling or not, but that is the standard date format in the United States (which is where the NYSE is).

[b]LoztInSpace[/b]: you linked to the live chart. It is currently showing 10:42 AM.

@tster said:

I don't know if your just trolling or not, but that is the standard date format in the United States (which is where the NYSE is).

It's still ridiculous. ;=]