Effects of file conversion on steganography and trojans



  • I was wondering what effect converting a file (avi/mpg/mkv to ogm, mp3 to ogg, or jpeg/gif to png)* would have on trojans and steganography techniques. As I understand it, file conversion should kill any embedded or hidden files, but I am unsure. Also, could this be used as a fairly effective anti-virus scheme (just convert incoming files to some random other file type)?



    *My normal conversion flow, I just gave examples, these are of course non-exclusive.



  • It is fairly simple to design malware that gets past a scheme like that. It wouldn't be any more effective than signature-matching, and it would probably be more work.



  • Steganography can survive conversions. It depends on how the message is hidden and how lossy the conversion is.

    Trojans, well, I wonder how they would hide in a movie or image file anyway. If those files are somehow rigged to cause e.g. a buffer overflow in some player, so that malicious code gets executed, chances are that the conversion program will crash or even execute the malicious code itself. If the trojan is just some kind of passive payload, you might get rid of it by the conversion, but it would probably cause no harm anyway unless a third program extracts and executes the code.
     



  • @Lingerance said:

    I was wondering what effect converting a file (avi/mpg/mkv to ogm, mp3 to ogg, or jpeg/gif to png)* would have on trojans and steganography techniques. As I understand it, file conversion should kill any embedded or hidden files, but I am unsure. Also, could this be used as a fairly effective anti-virus scheme (just convert incoming files to some random other file type)?

    *My normal conversion flow, I just gave examples, these are of course non-exclusive.

    Steganography in images basically hides your message in the low-order bits of the pixels. People won't notice the difference between RGB(128, 128, 127) and RGB(128, 128, 128), but if it's the right pixel, that 1 bit difference is part of the message. A poor implementation of this will store 1 bit of the message in one pixel of the target image. Therefore any kind of resizing would kill that message outright. A more robust implementation would store that bit in multiple places, or in entire blocks of the image, which makes it likely that a simple resize would preserve at least part of the message.

    A straight JPG->PNG conversion (with the JPG having the hidden message) wouldn't lose any of the message, as PNG is a lossless format. A roundtrip of JPG->PNG->JPG, however, would at least damage the message, if not outright kill it.
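    The low-order-bit scheme and the lossless versus lossy distinction above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the "image" is a flat list of 0-255 channel values rather than a real file, all function names are hypothetical, and `lossy_convert` is a crude rounding step standing in for real JPEG re-encoding (which quantises DCT coefficients, not raw pixel values).

```python
# Sketch only: pixels are modelled as a flat list of 0-255 channel values,
# not as a real image file. All function names here are hypothetical.

def embed_lsb(pixels, message_bits):
    """Overwrite the lowest bit of the first len(message_bits) values."""
    if len(message_bits) > len(pixels):
        raise ValueError("message does not fit in the cover data")
    stego = list(pixels)
    for i, bit in enumerate(message_bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the low bit, then set it to the message bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read back the lowest bit of the first n_bits values."""
    return [value & 1 for value in pixels[:n_bits]]


def lossless_convert(pixels):
    """Stand-in for a lossless conversion: decoded values are stored unchanged."""
    return list(pixels)


def lossy_convert(pixels, step=8):
    """Crude stand-in for lossy re-encoding: round each value to a multiple of step."""
    return [min(255, round(value / step) * step) for value in pixels]


cover = [128, 128, 127, 64, 65, 200, 201, 17]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, bits)

survived_lossless = extract_lsb(lossless_convert(stego), len(bits)) == bits
survived_lossy = extract_lsb(lossy_convert(stego), len(bits)) == bits
```

    In this toy model the message comes back intact after the lossless step and is wiped out by the rounding step, because rounding to a multiple of 8 forces the low bit of every value here to zero. A scheme that spreads each bit over many values would need a different extractor and is not shown.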

    Of course, steganography is useless in most situations. You have to assume that whatever app you used to embed the message is also available to anyone who might be snooping your data traffic. Not only do you have to bury the message in the noise of an image, you have to step back another level and bury the image itself in the noise of a huge mass of other images.

    @ammoQ said:

    Trojans, well, I wonder how they would hide in a movie or image file anyway. 

    Most of the newer media formats have provisions for arbitrary data blocks. EXIF information in JPGs is one such example - JPGs support a "comment" field as well as application-specific data segments, and EXIF is just a specific encoding of data in one of those segments. AVIs have provisions for embedding auto-run URLs which can pop up a copy of IE the moment you play the video or hit a specific frame. Once IE's up and pointed at scumsite.com, it's game over.

    And of course, the inevitable buffer overflows are there too. I think Real Player's king of suffering from them, but m$ Media Player isn't far behind. Media Player and Winamp both suffered from holes in their skinning engines.



  • I'm unsure how an arbitrary-length payload, such as an mp3, could cause any sort of buffer overflow. If I play a virus, I just get a load of static and noise and bleeps. Doesn't matter how big it is, I'll just get more bleeps.



  • @dhromed said:

    I'm unsure how an arbitrary-length payload, such as an mp3, could cause any sort of buffer overflow. If I play a virus, I just get a load of static and noise and bleeps. Doesn't matter how big it is, I'll just get more bleeps.

    Overly complex file formats designed by people who weren't thinking properly. Not all bitstreams are valid mp3 audio, or jpeg images, or mpeg video, or whatever. Somewhere in the gap lie bugs.



  • @dhromed said:

    I'm unsure how an arbitrary-length payload, such as an mp3, could cause any sort of buffer overflow. If I play a virus, I just get a load of static and noise and bleeps. Doesn't matter how big it is, I'll just get more bleeps.

    The MP3 format's pretty nicely specified. The official ISO document is a few hundred pages, but there isn't really anything that can be specified in an arbitrary manner. Mostly it's just a matter of figuring out where the actual MP3 data starts in the file, from which you can start tracking through the frames. There's no index/catalog/table listing all the frames. You scan through for potential start-of-frame signatures, parse a couple of fixed-length headers, one of which specifies the bitrate of the frame, from which you can work out the most likely offset of the next frame (mp3 frames are a constant 0.026 seconds long at 44.1 kHz, or thereabouts), etc... This is how id3 is even possible. The MP3 player doesn't need to be aware of ID3 data, so it can be skipped; it just needs to be able to locate actual MP3 frames and play them. In theory you could randomly sprinkle mp3 frames within a gigabyte of random numbers and you'd still get music out of Winamp.
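    The scan-and-hop approach described above might look roughly like this. It is a sketch that assumes MPEG-1 Layer III headers only (other versions and layers use different tables), treats the "free format" bitrate index as unsupported, and uses hypothetical function names. A real player would usually also confirm that several consecutive headers line up before trusting a sync pattern.

```python
# Sketch for MPEG-1 Layer III frame headers only. Function names are hypothetical.

# Index 0 ("free format") and index 15 (invalid) are both rejected here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]  # index 3 is reserved


def parse_frame_header(data, offset):
    """Return (frame_length_bytes, duration_seconds), or None if no usable header is here."""
    if offset + 4 > len(data):
        return None
    b0, b1, b2 = data[offset], data[offset + 1], data[offset + 2]
    # 11 sync bits set, then version bits 11 (MPEG-1) and layer bits 01 (Layer III).
    # The lowest bit of b1 (CRC protection) is ignored.
    if b0 != 0xFF or (b1 & 0xFE) != 0xFA:
        return None
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        return None
    padding = (b2 >> 1) & 0x01
    length = 144 * bitrate * 1000 // sample_rate + padding
    return length, 1152 / sample_rate  # 1152 samples per MPEG-1 Layer III frame


def find_frames(data):
    """Scan for a header, then hop from frame to frame using the computed length."""
    offsets = []
    pos = 0
    while pos + 4 <= len(data):
        parsed = parse_frame_header(data, pos)
        if parsed is None:
            pos += 1          # not a frame header: keep scanning byte by byte
        else:
            offsets.append(pos)
            pos += parsed[0]  # jump to where the next header should be
    return offsets
```

    For a header of `FF FB 90 00` (128 kbps, 44.1 kHz, no padding) the computed frame length is 417 bytes, and leading non-audio bytes such as a tag are simply stepped over until the first header is found.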

    The exploitable bits come from other stuff that gloms onto mp3 files, like id3v2 tags. Version 1 was an absolutely fixed 128 bytes at the end of the file. But v2 is basically an arbitrary content database, some of which can be embedded data, like .jpg images and whatnot (think cover art). A badly written parser for id3v2 CAN suffer from buffer overflows if it naively trusts the incoming id3 data.
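    As a rough illustration of not trusting the incoming id3 data, here is a sketch that reads the size from an ID3v2 header and checks it against the bytes actually present. It relies on the 10-byte header layout ("ID3", two version bytes, one flags byte, then a 4-byte "syncsafe" size with 7 useful bits per byte). The function name is hypothetical, and the optional ID3v2.4 footer and individual frames are deliberately ignored.

```python
# Sketch only: validates the ID3v2 header size field, nothing more.
# The function name is hypothetical; footers and frames are not handled.

def read_id3v2_size(data):
    """Return the tag size in bytes (header included), or 0 if there is no ID3v2 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    size_bytes = data[6:10]
    # Syncsafe integer: the top bit of every size byte must be clear.
    if any(b & 0x80 for b in size_bytes):
        raise ValueError("malformed ID3v2 size field")
    body = 0
    for b in size_bytes:
        body = (body << 7) | b
    # The check a naive parser skips: does the file really contain that much data?
    if 10 + body > len(data):
        raise ValueError("ID3v2 tag claims more bytes than the file contains")
    return 10 + body
```

    A parser written in C would need the same comparison before copying anything into a fixed-size buffer; Python merely makes the failure an exception instead of memory corruption.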

    Can't speak for .avi and .wmv files, but you have to remember that .avi is just a container for arbitrary audio and video data. The real overflows come from the codecs used to handle that data, not generally the avi parsers themselves.


  • @MarcB said:

    @dhromed said:

    I'm unsure how an arbitrary-length payload, such as an mp3, could cause any sort of buffer overflow. If I play a virus, I just get a load of static and noise and bleeps. Doesn't matter how big it is, I'll just get more bleeps.

    The MP3 format's pretty nicely specified. The official ISO document is a few hundred pages, but there isn't really anything that can be specified in an arbitrary manner. Mostly it's just a matter of figuring out where the actual MP3 data starts in the file, from which you can start tracking through the frames. There's no index/catalog/table listing all the frames. You scan through for potential start-of-frame signatures, parse a couple of fixed-length headers, one of which specifies the bitrate of the frame, from which you can work out the most likely offset of the next frame (mp3 frames are a constant 0.026 seconds long at 44.1 kHz, or thereabouts), etc... This is how id3 is even possible. The MP3 player doesn't need to be aware of ID3 data, so it can be skipped; it just needs to be able to locate actual MP3 frames and play them. In theory you could randomly sprinkle mp3 frames within a gigabyte of random numbers and you'd still get music out of Winamp.

    The exploitable bits come from other stuff that gloms onto mp3 files, like id3v2 tags. Version 1 was an absolutely fixed 128 bytes at the end of the file. But v2 is basically an arbitrary content database, some of which can be embedded data, like .jpg images and whatnot (think cover art). A badly written parser for id3v2 CAN suffer from buffer overflows if it naively trusts the incoming id3 data.

    Can't speak for .avi and .wmv files, but you have to remember that .avi is just a container for arbitrary audio and video data. The real overflows come from the codecs used to handle that data, not generally the avi parsers themselves.

    Clearer now, thanks. 



  • @MarcB said:

    Steganography in images basically hides your message in the low-order bits of the pixels.

     One stenography technique is to do that....



  • @phx said:

    One stenography technique is to do that....
     

    Using shorthand to hide messages in low-order bits? 



  • My point was that there's more than one way to hide information in images/movies/sounds that will survive conversion. Hell, my old Neuros mp3 player could pick an audio fingerprint out of a crappy mic recording - that's around 128 bits of data from audio that's been absolutely fucked over.



  • @dhromed said:

    I'm unsure how an arbitrary-length payload, such as an mp3, could cause any sort of buffer overflow. If I play a virus, I just get a load of static and noise and bleeps. Doesn't matter how big it is, I'll just get more bleeps.

    Very simple example: Let's say the file format specifies that there's a field X of variable length, and that length is stored in an int. However, for realistic cases the length is always relatively small (e.g. in an image format, the width in pixels never exceeds 100,000). Now a sloppy programmer of a parser might statically allocate a buffer that's "plenty big enough" and then copy from the file in a loop counting up to the specified length. Et voilà - buffer overrun.
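    The missing checks in that scenario can be sketched as follows. The file layout (a 4-byte big-endian length followed by the field data), the 4096-byte limit, and the function name are all assumptions made up for illustration. Python is memory-safe, so this cannot reproduce the overrun itself; it only shows the two comparisons the sloppy C-style parser leaves out.

```python
import struct

# Hypothetical "plenty big enough" limit for the parser's fixed buffer.
MAX_FIELD_LEN = 4096


def read_field(data, offset=0):
    """Read a field stored as a 4-byte big-endian length followed by that many bytes."""
    if offset + 4 > len(data):
        raise ValueError("truncated length prefix")
    (declared,) = struct.unpack_from(">I", data, offset)
    # Check 1: the declared length must fit the buffer the parser set aside.
    if declared > MAX_FIELD_LEN:
        raise ValueError("declared length exceeds buffer size")
    # Check 2: the declared length must not run past the end of the input.
    start = offset + 4
    if start + declared > len(data):
        raise ValueError("declared length exceeds available data")
    return data[start:start + declared]
```

    A well-formed input such as a length of 5 followed by `hello` returns the five bytes, while an input that declares 100,000 bytes is rejected before anything is copied.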

