FileNotFoundStorage.com Design Document



  • RE: Some thread somewhere that I can't be bothered to go find thanks to Discourse's sucky search: I bought FileNotFoundStorage.com as a joke ($1.99 for a year, how could I not?)

    So I figured I'd get input on how to make a seemingly non-wtf storage platform, but as you peel back the layers, it becomes a glorious WTF.

    Note: The goal is to have a WTF design that is technically perfect (i.e. no SQL injection, XSS, malware distribution, etc.). This will be a fully functional (to the design specs) joke site, and a bit of a science project if anybody ever uses the service (the target market would be the same type of people who rave about aliens, only about NSA monitoring).

    So here's what I've come up with so far:

    There must be a beautiful welcome splash screen that explains how to upload a file to the server

    • Select 'Browse'
    • Select a file
    • Click 'Upload'
    • Make sure the welcome screen includes clear directions, and that there's no actual way to retrieve the file once uploaded (though post the no-retrieval message after the upload button; the upload button will send data async, so the user has the potential to read the message)
    • Provide both a short URL and an obfuscated long URL to the original file name.
    • Create a queue if the server is overloaded, and advise the user of their place in the queue

    At this point the server will assign an encryption key for the upload and

    • Rename the file (on the server side)
    • Encrypt the file using the new generated key
    • Not provide the file name back to the user (for security)
    • Or the encryption key (for security)
    • Or a link to the file (for security)
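A rough sketch of the upload steps above, in Python rather than whatever the site ends up using. Everything named here (`accept_upload`, `keystream_xor`, the response text) is made up for illustration, and the "cipher" is only a toy SHA-256 keystream XOR standing in for a real one from a vetted library. What happens to the key afterwards is left open in the thread; this sketch simply lets it go out of scope.

```python
import hashlib
import secrets
from pathlib import Path


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def accept_upload(original_name: str, data: bytes, storage_dir: Path) -> dict:
    """Store an upload and tell the user nothing useful about it."""
    key = secrets.token_bytes(32)        # generated per upload, never returned
    stored_name = secrets.token_hex(16)  # server-side rename, never returned
    (storage_dir / stored_name).write_bytes(keystream_xor(data, key))
    # original_name is deliberately not recorded anywhere (for security)
    return {"status": "ok", "message": "Upload complete. Retrieval is not a feature."}
```

The response carries no file name, no key and no link, which is the whole point of the three "for security" bullets.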

    Provide a file search feature

    • Make it look like a search feature
    • (debate) Maybe make it look like a valid file?
    • Allow for download of dummy file that contains no data/corrupted data

    Out of scope / not a feature of the service:

    • No file retrieval in any way. It's a one way dump
    • No way to verify the file actually exists on the server (for security)

    Feel free to get creative. It can be as silly as you want so long as it does not cause security issues (because the service is for security!...). Think along these lines: legitimate use should be high performance, but naughty things (like trying to retrieve a file) should have consequences, such as the site taking O(n^2) seconds to load on each failed attempt.
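One way the penalty could look, assuming n means the number of failed retrieval attempts so far (the post doesn't pin that down). The function names are hypothetical, and `sleep` is injectable so the thing can be tried without actually waiting.

```python
import time


def penalty_seconds(failed_attempts: int) -> int:
    """Quadratic penalty: the n-th failed attempt costs n^2 seconds."""
    return failed_attempts ** 2


def punish(failed_attempts: int, sleep=time.sleep) -> int:
    """Stall the response for the penalty duration and report how long it was."""
    delay = penalty_seconds(failed_attempts)
    sleep(delay)
    return delay
```

So the first naughty request costs 1 second, the tenth costs 100, and legitimate uploads never call `punish` at all.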

    Preferably features should be segmented in such a way that they can be implemented over a weekend



  • Semi-serious question: are we actually talking about storing the files? Or merely the pretence of storing files?

    If file retrieval isn't a feature, is actually storing the file in the first place a feature? Or just providing all the trappings (like assigning an id as if the upload had taken place)?



  • I will actually store the files, which will in fact be encrypted using the key generated. It will be explicitly clear that it's a one way street for uploads.



  • Example of a feature that should be included (for security!)

    Periodically re-encrypt all files (or otherwise modify them) so the last modified date / create date cannot hint at which user might have uploaded the file



  • If you're doing write-only, why bother actually doing the write?



  • Just doesn't feel authentic if I don't.

    There will be a file search feature, it just won't be able to find any of the files. I haven't yet decided if I want to keep a count of how many files were uploaded for the user.


  • Discourse touched me in a no-no place

    Search only on the encrypted text and then only return the encrypted filename. You don't keep what they decrypt to because that would “compromise security”. :squirrel:

    I've heard of encryption algorithms that still allow searching on the content (using an appropriate encryption of the search term). Do not use them.



  • @Matches said:

    FileNotFoundStorage.com

    I think the domain name should be "WriteOnlyMemory.com"!



  • Feel free to pick up the domain and point the name servers at my hosting.

    And I like it, dkf.



  • Maybe have files wandering in and out, and between users?



  • @Matches said:

    Allow for download of dummy file that contains no data/corrupted data

    Delivering "downloads" as a valid file header followed by an unending stream of random data with exponential pauses between data packets is always a good one.
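A minimal sketch of that kind of "download", written as a Python generator. The PDF magic bytes, the chunk size and the doubling factor are all assumptions picked for illustration, and `sleep` is injectable so it can be exercised without waiting.

```python
import os
import time

PDF_HEADER = b"%PDF-1.4\n"  # assumed header; any plausible magic bytes would do


def endless_download(header=PDF_HEADER, chunk_size=64, base_pause=1.0, sleep=time.sleep):
    """Yield a valid-looking header, then random chunks with doubling pauses, forever."""
    yield header
    pause = base_pause
    while True:
        sleep(pause)                 # 1s, 2s, 4s, 8s, ...
        yield os.urandom(chunk_size)
        pause *= 2
```

A web framework that accepts an iterator as a streaming response body could be fed this directly; the client sees a file that starts out looking legitimate and never finishes.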

    I'll see if I can dig out my "security" front end. It was inspired by something my bank uses, just made more infuriating. Nearly as infuriating as Discourse, even.

    Ah, there ya go. I'd shoved 'em up onto github. https://github.com/tufty/scambaiting



  • @dkf said:

    Search only on the encrypted text and then only return the encrypted filename. You don't keep what they decrypt to because that would “compromise security”.

    I was thinking along the same track - implement a full text search for the files and give the user the filename and last change date, but not the file.

    Since the files are regularly re-encrypted, this is a great feature - it works per spec ("User can search in files", "filenames of files with matching text are displayed to user"), while at the same time being completely useless since searching for "ca2cd0f00123b7d986ec8e0726585fb9" might yield a file or two on 2014-07-14, but not after the next re-encryption.

    For added bite make the search results bookmarkable (with a table <search_text> | <file_name> | <modification_date> displaying the bookmarks) so that you have a "history" that is also completely useless after the next re-encryption.



  • @DaveK said:

    I think the domain name should be "WriteOnlyMemory.com"!

    It's not as if write only memory would be a new idea.



  • I'm on to you, @Matches:

    Props for choosing PHP though, for maximum WTFery.


    Filed under: It's definitely you, don't try to deny it.



  • Oh goody, PHP? That means I can "help" with proceedings. This is perhaps not a good idea.


  • BINNED



  • @Arantor said:

    Oh goody, PHP? That means I can "help" with proceedings. This is perhaps not a good idea.

    Why not? I thought the explicit goal was to create wtfs, not to avoid them?


  • Discourse touched me in a no-no place

    It needs to break the scrollbar, all errors need to be handled by showing a page which simply says "You're doing it wrong" and any features it does have need to be mostly undiscoverable.



  • @loopback0 said:

    any features it does have need to be mostly undiscoverable

    Ooh yes, we need this. There need to be important pages that you can only get to by modifying the URL.



  • That's why "help" was in quotes. I can make WTFs with the best of them. As I keep reminding people, I am TRWTF, and as such being a PHP master I can make things that people are awed and inspired by - and still be WTFs on so many levels.



  • @Keith said:

    Ooh yes, we need this. There need to be important pages that you can only get to by modifying the URL.

    Hmmm... since the filenames are encrypted and are changed randomly every now and then... why not expose the files themselves? Like "http://filenotfoundstorage.com/4400da57a7672bd70dc840db8b862339.pdf"?



  • Bonus points if the file name encryption is poorly implemented and causes frequent collisions, causing all but one of the matching files to be eaten.



  • MD5 FTW!



  • @Arantor said:

    MD5 FTW!

    Hmm, it depends whether the encryption of the name needs to be reversible, which I assume it does or we'd be talking about hashing. Also, MD5 won't offer a high enough chance of collision with the 12 or so people that might end up using this application... I like your thinking though.


  • BINNED

    Sounds like it needs a non-random seed.



  • @Keith said:

    Hmm, it depends whether the encryption of the name needs to be reversible,

    Reading the spec again, it doesn't look like being able to decrypt a filename is a requirement.

    @Keith said:

    Also, MD5 won't offer a high enough chance of collision with the 12 or so people that might end up using this application... I like your thinking though.

    Agreed. Good idea, but MD5 is far too secure.

    We should roll our own encryption here anyway, I'd say. I propose the following:

    • base64 for starters
    • then converting the result into morse
    • interpret the resulting bit-stream as 8-bit ascii
    • rot-13 that ascii
    • and print it out in hex.
    • put print-out on a wooden table
    • photograph wooden table with printout with analog camera
    • scan negative
    • store scanned picture in NoSQL db.
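For the curious, the first five steps of the scheme above can be sketched in Python. The details are guesses where the post is silent: dots become 0 and dashes 1, the bit-stream is zero-padded to a whole byte, lowercase base64 letters are folded to uppercase (Morse has no case), and `wooden_table_prep` is a made-up name. The table, camera, scanner and NoSQL steps are left to the reader.

```python
import base64
import codecs

MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..", "0": "-----", "1": ".----", "2": "..---",
    "3": "...--", "4": "....-", "5": ".....", "6": "-....", "7": "--...",
    "8": "---..", "9": "----.", "+": ".-.-.", "/": "-..-.", "=": "-...-",
}


def wooden_table_prep(name: str) -> str:
    """base64 -> Morse -> bits -> 8-bit bytes -> rot-13 -> hex, ready for printing."""
    b64 = base64.b64encode(name.encode("utf-8")).decode("ascii")
    morse = "".join(MORSE[ch.upper()] for ch in b64)       # case folding loses data
    bits = morse.replace(".", "0").replace("-", "1")       # no separators: ambiguous
    bits += "0" * (-len(bits) % 8)                         # pad to whole bytes
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    rotated = codecs.encode(raw.decode("latin-1"), "rot_13")  # only A-Z/a-z move
    return rotated.encode("latin-1").hex()
```

Between the case folding and the unseparated Morse, the result cannot be reversed, which fits the observation that decrypting a filename isn't a requirement.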


  • @Luhmann said:

    non-random seed


  • BINNED

    @faoileag said:

    We should roll our own encryption here anyway, I'd say. I propose the following:

    That scheme seems like something that could handle Unicode.

    We can't have that. Maybe an extra step of conversion to pure 7-bit ASCII first?



  • Good idea! 7-bit ASCII will make it platform agnostic!



  • Unless you're using EBCDIC.



  • I've had a flash of blinding inspiration, which could help you to monetise this product.

    When users upload a file, you don't encrypt the file contents, you hash them and store the hashed value in the database. Your storage requirements immediately reduce from possibly megabytes per file to, say, 40 characters per file plus the name.

    When a user requests a specific file, the system adds it to a queue of pending requests and lets the user know which position their request is in.

    A background process grabs one request at a time from the queue and:

    • Starts to build a file from scratch.
    • Hashes the file and compares the hash to the one stored in the database.
    • Uses some comparison function to generate a similarity value between the created and original hashes.
    • Repeats until the similarity index is above a cut off value.

    If the similarity index is higher than a specific value, the system outputs the file to the user, with whichever file extension it originally had.
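The background process could be sketched like this. It assumes SHA-1 (the "40 characters" hints at a 40-digit hex digest), measures similarity as the fraction of matching bits between the two hashes, and builds candidates completely at random as the free tier would. The names `similarity` and `rebuild_file` and the attempt cap are inventions for the sketch.

```python
import hashlib
import random


def similarity(hash_a: bytes, hash_b: bytes) -> float:
    """Fraction of bit positions on which two equal-length hashes agree."""
    same = sum(8 - bin(a ^ b).count("1") for a, b in zip(hash_a, hash_b))
    return same / (8 * len(hash_a))


def rebuild_file(stored_hash: bytes, cutoff: float, size: int = 64,
                 rng=None, max_attempts: int = 100_000):
    """Build random files until one hashes 'close enough' to the stored hash."""
    rng = rng or random.Random()
    for attempt in range(1, max_attempts + 1):
        candidate = bytes(rng.getrandbits(8) for _ in range(size))
        score = similarity(hashlib.sha1(candidate).digest(), stored_hash)
        if score >= cutoff:
            return candidate, score, attempt
    raise TimeoutError("request stays in the queue")
```

With this bit-level measure two unrelated hashes agree on about half their bits, so a 50% cut-off is met within a handful of attempts while the returned bytes have nothing to do with the upload. Raising the cut-off mostly just makes the wait longer.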

    WTF advantages:

    • The retrieval will take ages.
    • The returned file will be nothing like the original file.
    • Contents will be effectively irretrievable (security!)
    • Storage requirements are tiny.

    Monetisation opportunities:

    You can provide several levels of account, where the basic, free account will get no queue prioritisation, will receive a file with a 50% similarity index or better and will build files completely randomly.

    More advanced accounts will provide:

    • Priority in the pending request queue.
    • A higher required similarity index cut off.
    • 'Advanced' file building functions that use 'special' algorithms instead of the random building provided by the free account.

    Mix all of the above in various ways to make an almost incomprehensible membership structure.

    The only feature which would actually provide an advantage is the queue priority, which will enable you to get your incorrect file more quickly.


  • BINNED

    @Keith said:

    which will enable you to get your incorrect file more quickly.

    +1 because it is not possible to indicate how much you like a certain sentence above liking the reply


  • Discourse touched me in a no-no place

    Brillant. This is starting to sound like an entry for an OMGWTF competition.



  • @Keith said:

    The only feature which would actually provide an advantage is the queue priority

    What about adding file sharing as a second fully working feature? Idea: users on the premium plan can distribute files by providing short links to the content: www.f.com/251a1c/.

    And it would be totally legit and secure - not even the RIAA could object to it!



  • @faoileag said:

    Idea: users on the premium plan can distribute files by providing short links to the content: www.f.com/251a1c/.

    Yes, file sharing is definitely something it should do. If the short links are the first few characters of the hash, we can take advantage of the potential for file name collision.
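A small sketch of how that could play out, assuming SHA-1 and a dict as the store (`ShortLinkStore` is a made-up name). The default of six characters matches the 251a1c example; shrinking `prefix_len` makes the collisions show up sooner.

```python
import hashlib


class ShortLinkStore:
    """Files keyed by the first few hex characters of their hash."""

    def __init__(self, prefix_len: int = 6):
        self.prefix_len = prefix_len
        self.files = {}

    def share(self, content: bytes) -> str:
        """Store content under its truncated hash and return the short link."""
        link = hashlib.sha1(content).hexdigest()[: self.prefix_len]
        self.files[link] = content  # a colliding upload silently eats the old file
        return link
```

With `prefix_len=1` there are only 16 possible links, so the 17th shared file is guaranteed to overwrite somebody else's.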


  • Discourse touched me in a no-no place

    @faoileag said:

    Good idea! 7-bit ASCII will make it platform agnostic!

    So I'd be forced to download ??????.pdf instead of ??????.pdf? How's that going to help security?



  • @dkf said:

    So I'd be forced to download ??????.pdf instead of ??????.pdf? How's that going to help security?

    "Security through obscurity". A design pattern that helps you write programs that are hard to maintain and easy to hack.



  • Sounds ideal.

    Only question, is @Nagesh helping to build it?



  • @Nagesh can write an Enterprise Java Bean to communicate with filenotfoundstorage.com using the API.

    API... There's another area ripe for abuse...



  • @Keith said:

    @Nagesh can write an Enterprise Java Bean to communicate with filenotfoundstorage.com using the API.

    Ah, the "Braillant @Nagesh Bean"!


  • Discourse touched me in a no-no place

    @Keith said:

    API... There's another area ripe for abuse...

    Do all communication by sending PDFs back and forth via RESTful calls. If that doesn't scare hackers away, switch to some other grossly unsuitable fun format for that sort of thing



  • I vote SOAP instead of RESTful. And naturally, XML.


  • Discourse touched me in a no-no place

    By the time you're using PDF instead of XML, moving it via SOAP instead of REST won't make for a hill of beans of difference. It's forcing the clients to generate PDF to send requests to the server that elevates the wooden table count…



  • This is enterprise class, yes? Then it needs XML.

    ...in the PDF.



  • @Arantor said:

    This is enterprise class, yes? Then it needs XML.

    I agree with @Arantor here - we need XML for data transport. I mean, without XML how could we validate incoming data with regexes?


  • Discourse touched me in a no-no place

    PDFs allow attachments. Hardly any PDF tools handle them other than the awful Acrobat things from Adobe.



  • No, no, don't do that. Just accept them as they are and let the underlying libxml do that for you.

    (Especially as, incidentally, that opens you up to a vulnerability before you start.)


  • BINNED

    @faoileag said:

    A design pattern that helps you write programs that are hard to maintain and easy to hack.

    So, use GoTos a lot so we can just add a noodle and jam it?


    Filed under: I know that's not what you meant, Still a valid suggestion


  • Discourse touched me in a no-no place

    @Arantor said:

    No, no, don't do that. Just accept them as they are and let the underlying libxml do that for you.

    You could always move the data via references done with XML general entities. If that doesn't scare you, you're blissfully unaware. You lucky bastard.



  • That's precisely what I mean about introducing a vulnerability, since DTDs and thus entity references will be automatically loaded for you by libxml, even if they're DTDs on a third party site that are thus currently unknown.
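If the "technically perfect" rule from the first post is meant to hold here too, one conservative option is to refuse any document that declares a DOCTYPE at all. The sketch below uses Python's standard expat binding as a stand-in; it is not libxml and says nothing about how PHP's XML functions behave. `element_names_without_dtd` and `DoctypeForbidden` are hypothetical names.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a payload tries to bring its own DTD."""


def element_names_without_dtd(payload: bytes) -> list:
    """Parse XML and return element names, rejecting any DOCTYPE declaration."""
    names = []

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(payload, True)  # True marks this as the final chunk
    return names
```

No DOCTYPE means no DTD and no entity definitions, so there is nothing for the parser to go and fetch from a third-party site.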

