Sorry for replying to an ancient thread, but I just stumbled into this while examining sidplayfp and libsidplayfp and its Songlengths.txt database support. There's a lot of misinformation in this thread, unfortunately.
Looking up song lengths couldn't be easier: Calculate MD5 sum for the file, find the line with the MD5 sum in question, and parse the song lengths following the MD5 sum. So let's see.... I take a MD5 sum of any of the files that comes with HVSC... and I can't find a match from file.
Wrong approach. You cannot take the MD5 fingerprint of the entire file, for several reasons. First, in their regular updates the HVSC maintainers have altered the metadata in the header of .sid files while keeping the actual machine code and data unchanged. Any such update would have invalidated a whole-file fingerprint even though the song lengths stayed the same. Conversely, if they altered the speed of a tune via a flag in the file header, that should invalidate the fingerprint. The same applies if they shuffled the order of multiple songs in a file, or if they altered the INIT and PLAY vectors of the machine code to fix a ripped sidtune so that it would initialise correctly and play differently -- yes, sometimes not just seconds but minutes have been missing due to wrong initialisation.
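To illustrate the idea (not the actual libsidplay calculation, which you should read from the source code), here is a minimal Python sketch. The `Tune` class, its field names, the byte layout and the NTSC marker byte are all made up for this example. The only point is that the fingerprint covers the parts that affect playback and skips the descriptive metadata.

```python
import hashlib
import struct
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Tune:
    # Hypothetical in-memory view of a parsed .sid file.
    title: str              # descriptive metadata, deliberately NOT hashed
    data: bytes             # machine code and data
    init_addr: int          # INIT vector
    play_addr: int          # PLAY vector
    song_speeds: List[int]  # one speed flag (0..255) per song, in song order
    ntsc: bool = False      # clock setting stored with the tune


def fingerprint(tune: Tune) -> str:
    """MD5 over playback-relevant fields only (illustrative layout)."""
    md5 = hashlib.md5()
    md5.update(tune.data)
    md5.update(struct.pack("<HH", tune.init_addr, tune.play_addr))
    md5.update(struct.pack("<H", len(tune.song_speeds)))
    md5.update(bytes(tune.song_speeds))
    if tune.ntsc:
        md5.update(b"\x02")  # arbitrary marker chosen for this sketch
    return md5.hexdigest()


original = Tune("Old title", b"\x01\x02\x03", 0x1000, 0x1003, [0, 0])
retitled = replace(original, title="Corrected title")
fixed_init = replace(original, init_addr=0x1006)
```

With this layout, `fingerprint(original)` equals `fingerprint(retitled)`, while `fingerprint(fixed_init)` differs, which is exactly the behaviour a whole-file MD5 cannot give you.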
Ah, but I forgot that curious comment from the documentation: "As of HVSC 5.0, a modified fingerprint calculation is required."
At that time the database was experimental. Highly experimental, even.
Modified fingerprint calculation? Um, that doesn't sound good - especially when the said "modified fingerprint calculation" isn't explained anywhere in this file.
But it is explained - in source code.
Great! That would have been a perfect opportunity to join in and contribute ideas or even actual improvements. It was an experimental project.
While I'm waiting for the headache to pass, let me babble and get this straight: I'm supposed to gulp up the file and parse the file header. Then I'm supposed to take a MD5 sum of the file data plus slightly differently formatted header. And apparently, if you choose to use NTSC timer instead of PAL timer - a player setting, mind you - out comes a completely different MD5 sum!
Of course! Would you have preferred if all player front-ends needed to compensate for changes in PAL vs. NTSC and VBI vs. CIA Timer IRQ? That would have been annoying, even for an experimental database.
The Songlengths.txt maintainers obviously realise that "ChunkOfWorthlessUndecipherableLineNoise = Random song lengths..." is not exactly the most readable format in the world and gives little information for a casual reader of the file. Therefore, in their infinite wisdom (and I'm not sarcastic here or anything, honest),
Wow! You're clueless and overly negative here at the same time. More below.
each song length entry is preceded by a completely innocuous comment text that tells the relative path to the SID file in question... just for the benefit of the reader, you know, if you need to look up the song lengths by hand...
It's great that the INI file is human-readable, too. The only alternative for a highly experimental project would have been to provide a full set of tools to read and edit the INI file for manual lookups. Being able to quickly search it manually and take a look at what song lengths (and other details) are stored for a sid file is a great feature. Also don't forget that the original Songlengths.txt file contains diagnostic output because the song-length detector was experimental.
The player software is obviously expected to ignore all such rubbishy comment lines. I mean, they're comment lines?
Lines starting with ';' in INI files certainly are comments and are to be ignored. A file may move to a different path any time, but its MD5 fingerprint could stay unchanged.
Surely calculating MD5 sums is faster, more efficient, and more reliable than relying on oft-changing relative file paths? (Okay, now I'm sarcastic.)
Clueless once again. SID music fans like to copy files to their private collection. Then all that's left is a sidtune's MD5 fingerprint, because not even the relative file path can be used to identify a file. Ignoring the MD5 fingerprints (the actual key/value pairs in the INI file) and relying on the paths in the comments is just a really bad idea and much too fragile.
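For what it's worth, a lookup keyed on the fingerprint is only a few lines. This is a rough Python sketch and not the sidplay code. It assumes `key=value` lines whose values hold whitespace-separated `m:ss` times, and it ignores anything trailing a time. The function name and the sample entry (path and MD5 value) are invented.

```python
import re
from typing import Dict, List

# One "minutes:seconds" token; anything following it is ignored.
_TIME = re.compile(r"(\d+):(\d{2})")


def parse_songlengths(text: str) -> Dict[str, List[int]]:
    """Map MD5 fingerprint -> list of song lengths in seconds."""
    db: Dict[str, List[int]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, ';' comments (the path hints) and [section] headers.
        if not line or line.startswith(";") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value pair
        db[key.strip().lower()] = [
            int(minutes) * 60 + int(seconds)
            for minutes, seconds in _TIME.findall(value)
        ]
    return db


sample = """[Database]
; /Hypothetical/Path/Tune.sid
0123456789abcdef0123456789abcdef=3:25 0:47
"""
lengths = parse_songlengths(sample)
```

The path comment never enters the dictionary, so the lookup keeps working after the file has been renamed or copied somewhere else.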
My logic goes like this: If such interesting information can be optionally stored in Songlengths.txt, why couldn't it be stored somewhere a bit closer, like in, ah, I don't know, .sid header? =)
Easy to answer. The PSID file format predates this era and does not have an extensible header (unlike the SIDPLAY INFOFILE format). Existing players would not have been able to understand a new file format, had one been published. The later PSID v2NG format could have taken the chance to add fields for song lengths. For quite some time, there has been activity related to developing a successor to PSID (even XML-based suggestions), but with no outcome. The PSID v2NG and RSID formats are just a compromise.
@asuffield said:
Incidentally, the reason why these files don't have stored lengths is that the question is not meaningful. A SID file is actually a program. It does not have to have a single length, or even terminate at all.
True. As such, there is no known "duration" to begin with. It's not defined anywhere. However, while you are correct that many sidtunes contain a program (the music player) that runs forever (or is based on an interrupt handler that is called again and again), its output often stops with silence or with a restart from the beginning (sometimes even by reinitialising the player). Silence and loops are what the song-length detector has tried to determine. Highly experimental, of course. ;)
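Just to make "silence and loops" concrete, here is a toy Python sketch. It is not the HVSC detector, and both function names and inputs are invented for illustration. It assumes you already have one output level per frame, plus a hashable snapshot of the complete player state per frame. A repeated snapshot implies a loop only if the snapshot really captures everything that determines future output.

```python
from typing import Dict, Hashable, Optional, Sequence, Tuple


def silence_start(levels: Sequence[int], threshold: int = 0) -> Optional[int]:
    """Index of the first frame after which output never exceeds threshold."""
    last_loud = -1
    for i, level in enumerate(levels):
        if abs(level) > threshold:
            last_loud = i
    start = last_loud + 1
    # None means the recording never ends in silence (or is empty).
    return start if start < len(levels) else None


def first_loop(states: Sequence[Hashable]) -> Optional[Tuple[int, int]]:
    """(first_seen, repeated_at) for the first state that recurs, else None."""
    seen: Dict[Hashable, int] = {}
    for i, state in enumerate(states):
        if state in seen:
            return seen[state], i
        seen[state] = i
    return None
```

For a tune that fades out, `silence_start` gives a candidate end frame. For one that restarts, the second element of the `first_loop` result marks where the repetition begins.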
@magetoo said:
(That said, clearly sometimes it's just wrong. Like Skate or Die. Rob Hubbard, living and working in the UK, obviously made the song for 50 Hz playback. Yet, last I checked (a long time ago), the default setting for that song in the HVSC was 60 Hz. Perhaps it was released like that in the US, and the PAL versions adjusted accordingly, but all the cracked versions I ever saw played it back on a vblank interrupt, at 50 Hz. Like it should be.) :-)
Questionable. The HVSC (with some core members being from the UK and even in good contact with Rob Hubbard) explicitly corrected the speed of Skate or Die in Update #16 (see the included Update files), pointing out that many crackers played it back at the wrong speed. I would trust the HVSC here.