I recently discovered the most WTF file format ever: Vocaloid's VSQ file format.
To give you a glimpse of how it works, here is a section of such a file, as
decoded by some Perl module (so the events you see map directly to MIDI
events):
MIDI::Opus->new({
'format' => 1,
'ticks' => 480,
'tracks' => [ # 2 tracks...
# Track #0 ...
MIDI::Track->new({
'type' => 'MTrk',
'events' => [ # 4 events.
['track_name', 0, 'Master Track'],
['set_tempo', 0, 500000],
['time_signature', 0, 4, 2, 24, 8],
['set_tempo', 7765, 674157],
]
}),
Okay, up to here everything is sane: the typical, mostly empty track 0 that
contains only a name and global metadata (tempo, etc.).
# Track #1 ...
MIDI::Track->new({
'type' => 'MTrk',
'events' => [ # 25902 events.
['track_name', 0, 'Voice1'],
['text_event', 0, "DM:0000:[Common]\x0aVersion=DSB301\x0aName=Voice1\x0aColor=181,162,123\x0aDynamicsMode=1\x0aPlayMode=1\x0a[Master]\x0aPreMeasure=4\x0a[Mixer]\x0aMasterFed"],
['text_event', 0, "DM:0001:er=0\x0aMasterPanpot=0\x0aMasterMute=0\x0aOutputMode=0\x0aTracks=1\x0aFeder0=0\x0aPanpot0=0\x0aMute0=0\x0aSolo0=0\x0a[EventList]\x0a0=ID#0000\x0a7680=ID"],
...
What is THIS? An encoded INI file? YES! And all the events that make up the INI
file have a delta time of zero and occur at the start of the track. The INI
file is split into text events of fixed length, each with a numbered "DM:nnnn:"
prefix, and not e.g. into lines. WTF?
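To make the chunking concrete, here is a minimal Python sketch that glues the chunks back together. It assumes (going only by the two events shown above) that every chunk starts with "DM:", a four-digit sequence number and a colon; `reassemble_ini` is a made-up helper name, not part of any Vocaloid tool or MIDI library.

```python
import re

# Payloads of the two text events shown above (the dump is truncated after
# these, so the reassembled text is only the start of the INI file).
text_events = [
    "DM:0000:[Common]\nVersion=DSB301\nName=Voice1\nColor=181,162,123\n"
    "DynamicsMode=1\nPlayMode=1\n[Master]\nPreMeasure=4\n[Mixer]\nMasterFed",
    "DM:0001:er=0\nMasterPanpot=0\nMasterMute=0\nOutputMode=0\nTracks=1\n"
    "Feder0=0\nPanpot0=0\nMute0=0\nSolo0=0\n[EventList]\n0=ID#0000\n7680=ID",
]

# Assumption: each chunk is prefixed with "DM:" + 4-digit sequence number + ":".
CHUNK_PREFIX = re.compile(r"^DM:(\d{4}):")


def reassemble_ini(chunks):
    """Strip the DM:nnnn: prefix from each chunk and join them in sequence order."""
    numbered = []
    for chunk in chunks:
        match = CHUNK_PREFIX.match(chunk)
        if match is None:
            continue  # not part of the embedded INI file
        numbered.append((int(match.group(1)), chunk[match.end():]))
    numbered.sort(key=lambda pair: pair[0])
    return "".join(body for _, body in numbered)


ini_text = reassemble_ini(text_events)
print(ini_text.splitlines()[9])  # the line that was split across two events
```

Note how "MasterFeder=0" only becomes a valid INI line after the two chunks are joined.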
['control_change', 0, 0, 99, 96],
['control_change', 0, 0, 98, 0],
['control_change', 0, 0, 6, 0],
['control_change', 0, 0, 38, 0],
['control_change', 0, 0, 98, 1],
['control_change', 0, 0, 6, 0],
['control_change', 0, 0, 38, 0],
['control_change', 0, 0, 98, 2],
['control_change', 0, 0, 6, 0],
['control_change', 0, 0, 99, 83],
['control_change', 0, 0, 98, 2],
['control_change', 0, 0, 6, 1],
['control_change', 5760, 0, 99, 96],
...
That's right. All there is is control changes. Everywhere. No notes, no lyrics,
no nothing. Now the format would be even more WTF if these events somehow
encoded the phonemes and the note pitches... but AT LEAST they don't do that.
But if you look closely, those control changes look a bit redundant... you see
multiple control changes for the same controllers at the same time! But this is
just a standard MIDI WTF (see Non-Registered Parameter Numbers, "NRPN"). In
Vocaloid, as in standard MIDI, 99 and 98 select a parameter (MSB and LSB; the
registered "RPN" variant uses 101 and 100 instead), and 6 and 38 write the MSB
and LSB of its value. However, these are just the parameters you can set in the
application, so this part can be considered sane. We do know, however, that the
notes and lyrics are NOT encoded with these!
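For the curious, the select-then-write dance can be untangled with a few lines of Python. Controllers 99/98 are the standard MIDI NRPN select controllers (MSB/LSB) and 6/38 the data entry MSB/LSB; everything else here is an assumption: the event tuples mirror the dump above (name, delta time, channel, controller, value), `decode_nrpn` is a hypothetical helper, and what each parameter means to Vocaloid is not something this sketch knows.

```python
# Standard MIDI controller numbers for the NRPN mechanism.
NRPN_MSB, NRPN_LSB = 99, 98
DATA_MSB, DATA_LSB = 6, 38

# The first events from the dump above: (name, delta, channel, controller, value).
events = [
    ("control_change", 0, 0, 99, 96),
    ("control_change", 0, 0, 98, 0),
    ("control_change", 0, 0, 6, 0),
    ("control_change", 0, 0, 38, 0),
    ("control_change", 0, 0, 98, 1),
    ("control_change", 0, 0, 6, 0),
    ("control_change", 0, 0, 38, 0),
]


def decode_nrpn(events):
    """Yield (absolute_tick, (param_msb, param_lsb), field, value) per data write."""
    tick = 0
    param_msb = param_lsb = 0
    for event in events:
        tick += event[1]  # delta times accumulate into an absolute position
        if event[0] != "control_change":
            continue
        controller, value = event[3], event[4]
        if controller == NRPN_MSB:
            param_msb = value
        elif controller == NRPN_LSB:
            param_lsb = value
        elif controller == DATA_MSB:
            yield tick, (param_msb, param_lsb), "msb", value
        elif controller == DATA_LSB:
            yield tick, (param_msb, param_lsb), "lsb", value


for write in decode_nrpn(events):
    print(write)
```

The selection is sticky: the second group only re-sends controller 98, so the MSB of 96 still applies. That is why the stream looks redundant at first glance.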
['text_event', 186720, ''],
]
}),
]
});
And this part is normal again.
Ok... and what's in the INI file? Why, the notes of course! Along with timing
info!
Let's see:
[Common]
Version=DSB301
Name=Voice1
Color=181,162,123
DynamicsMode=1
PlayMode=1
[Master]
PreMeasure=4
[Mixer]
MasterFeder=0
MasterPanpot=0
MasterMute=0
OutputMode=0
Tracks=1
Feder0=0
Panpot0=0
Mute0=0
Solo0=0
Okay, a "Feder" is probably a misspelled "Fader". Comma-separated values in an
INI file are a bit odd, but for a color it's nothing too weird.
[EventList]
0=ID#0000
7680=ID#0001
7920=ID#0002
8040=ID#0003
8160=ID#0004
...
Apparently, this is a mapping from timestamp to event. This BTW means there
can only be one event at a given timestamp if this is to be a standard INI
file with unique keys! Apparently, Vocaloid files always fulfill that, though.
What is with this ID# stuff?
[ID#0000]
Type=Singer
IconHandle=h#0000
...
Ah, so the ID# stuff is the section name to look in. Okay, this is an event,
somewhat like a MIDI patch change event. h#0000 is of course also the name of
an INI section, but we're going for the notes here.
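The indirection is easy to follow with Python's standard configparser. This is only a sketch on a fragment pieced together from the listings above (elided keys left out, sections that were not shown simply missing); `events_in_order` is a hypothetical helper, and whether Vocaloid itself reads the file this way is anyone's guess.

```python
import configparser

# Fragment assembled from the listings above; "..." parts are left out.
ini_text = """\
[EventList]
0=ID#0000
7680=ID#0001
7920=ID#0002
[ID#0000]
Type=Singer
IconHandle=h#0000
"""

# interpolation=None keeps "%" literal; "=" is the only key/value delimiter.
parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
parser.optionxform = str  # keep the case of keys such as "IconHandle"
parser.read_string(ini_text)


def events_in_order(parser):
    """Yield (tick, section_name, fields) sorted by timestamp.

    fields is None when the referenced section is not in the file.
    """
    pairs = sorted((int(tick), name) for tick, name in parser["EventList"].items())
    for tick, name in pairs:
        fields = dict(parser[name]) if parser.has_section(name) else None
        yield tick, name, fields


for event in events_in_order(parser):
    print(event)
```

By default configparser is strict and raises DuplicateOptionError on a repeated key, which is exactly the "one event per timestamp" limitation.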
[ID#0002]
Type=Anote
Length=120
Note#=64
Dynamics=64
PMBendDepth=8
PMBendLength=0
PMbPortamentoUse=0
DEMdecGainRate=50
DEMaccent=50
LyricHandle=h#0002
...
And this is how a note is stored. Both note-on and note-off in one, fine,
that's okay. It contains the note pitch, the length and all sorts of other
nice info - but not the lyrics. These - again - are in another castle, I
mean, INI section:
[h#0002]
L0="a","a",0.000000,0,0
...
[h#0005]
L0="chu","tS M",0.000000,64,0,0
WTF? WTF? WTF?
So the lyrics handle is just an indirection to a single INI value that is
itself a comma-separated list, complete with quotation marks. Why aren't these
five or six separate named values, you ask? No idea!
The first entry is the lyric, the second one the phonemes in a weird ASCII
encoding, and the rest are some parameters.
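Since the value is effectively one CSV row, Python's csv module can take it apart. A minimal sketch, assuming the quoting follows ordinary CSV rules and that phonemes are separated by spaces (a guess based on "tS M"); `parse_lyric_value` is a made-up name, and the trailing parameters stay as strings because their meaning is unknown here.

```python
import csv


def parse_lyric_value(value):
    """Split an L0 value into (lyric, phoneme list, remaining parameters)."""
    fields = next(csv.reader([value]))  # one value == one CSV row
    lyric, phonemes, *params = fields
    # Assumption: phonemes are space-separated within the second field.
    return lyric, phonemes.split(), params


print(parse_lyric_value('"a","a",0.000000,0,0'))
print(parse_lyric_value('"chu","tS M",0.000000,64,0,0'))
```

Keeping the parameters as an open-ended list also copes with the fact that the two examples above do not even have the same number of fields.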
So to conclude: a VSQ file is a MIDI file with just timed controller events,
along with an embedded INI file at the beginning of the data track. The actual
song (notes, lyrics) is encoded in this INI file, and NOT the MIDI data...
So the final question is: what were they smoking?
And as for the "competition"... the UTAU format (UST) is a lot less WTFy. It is
just an INI file, no MIDI involved there. To summarize the UTAU WTFs quickly
without showing a file: each event is a lyric event, and has a length. There
are no delta times - the events simply follow one another in a single chain
from start to end. Overlaps are right out. How to do a rest with this? Simple!
Just set the lyric text to a single uppercase "R". The other WTF is that when a
file contains multiple tracks, INI section names are repeated, but apparently
UTAU users don't put multiple tracks in one file anyway...
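The "no delta times" model boils down to a running sum. Here is a small Python sketch of that idea only; it does not parse real UST files, and the (lyric, length) tuples and the `to_timeline` helper are invented for illustration.

```python
def to_timeline(events):
    """Turn back-to-back (lyric, length) events into (start, length, lyric) notes."""
    tick = 0
    timeline = []
    for lyric, length in events:
        if lyric != "R":  # a lone uppercase "R" is a rest: time passes, no note
            timeline.append((tick, length, lyric))
        tick += length  # the next event starts where this one ends
    return timeline


# Hypothetical input: a note, a rest, then another note.
print(to_timeline([("a", 480), ("R", 240), ("chu", 480)]))
```

Because every start time is derived from the lengths before it, two notes can never overlap in this scheme.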