Is there a standardized format for forum exports?



  • Possibly with copious amounts of XML and 15 subtly incompatible implementations?



  • Nope. Besides, I don't think there's a markup format that would be able to handle half of the broken HTML and shit that this community dumped into CS without dying in the ass.



  • Like across different forum software? Hell no. But the general structure of forum databases is similar enough that you can map some of the data pretty easily.


  • Banned

    No, but every copy of Discourse includes native backup/restore.

    We haven't had too much trouble mapping other forum software to Discourse in imports. We'll be open sourcing that work soon, we just need some more imports under our belts.



  • Not that I'm an expert, but I have seen a number of warez (WordPress and Joomla spring to mind) that support phpBB imports, so I guess that's as close to a "standard" as you're ever going to get.
    I suspect, though, that it will still be a lowest-common-denominator situation.


  • :belt_onion:

    There are import utilities for phpBB in the phpBB repository as well, I seem to recall, that work pretty well for most platforms.

    I had a migration project from XMB 1.9 to phpBB3 (about 140k posts, 3k members) a couple years ago. Had to write my own utility as the existing import utility was outdated. Thought about the project for a month or so, wrote the utility over a couple days, tested with the forum administrators for a couple weeks, made sure everything looked roughly the same theme-wise on the new forum before migrating. Even made sure everyone who was logged in to the old forum would be logged in to the new forum automagically. Having worked on application integrations for a while, I believe change should be as transparent as possible.

    I think the only negative, or even neutral, feedback on the new platform was that XMB's "Today's Posts" feature was missing. I opined that it wasn't that necessary since a list of unread topics wouldn't repeat topics just because they'd been posted within the last twenty-four hours, but after a bit of discussion, I installed a mod to do it anyway and everyone was happy. I didn't want to impress my ideology on the members, and I find myself using it constantly still. It's all about muscle memory and bookmarked URLs. Otherwise the feedback was overwhelmingly positive.

    I'm not sure if this transition to Discourse was considered with the same amount of care. </understatement> In any case, my point is that a standardized format for forum exports would actually be really cool, but there's always going to be manual work to do depending on how long a forum's been in operation. I found myself whipping up scanning utilities and doing semi-automatic corrections a couple weeks later for encoding issues caused by XMB storing posts in ISO-8859-1 and phpBB3 storing posts in UTF-8. There are always edge cases and special functionality that one platform or the other may not support and it's madness to try to support them all. Just look at the Open XML formats.
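
For the curious, the ISO-8859-1 versus UTF-8 cleanup mentioned above can be sketched roughly like this. It is only an illustration, not the utility that was actually written for that migration: the function names `decode_legacy_post` and `needs_review` are made up here, and it assumes the old posts are available as raw bytes. The idea is to try strict UTF-8 first (some rows may already hold UTF-8 bytes), fall back to ISO-8859-1, and flag anything suspicious for a human to look at.

```python
# Hypothetical helpers; names and heuristics are assumptions for illustration.

def decode_legacy_post(raw: bytes) -> tuple[str, str]:
    """Decode one stored post and report which encoding was assumed.

    Returns (text, encoding) where encoding is "ascii", "utf-8",
    or "iso-8859-1".
    """
    try:
        # Pure ASCII is identical in both encodings, so nothing to decide.
        return raw.decode("ascii"), "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict UTF-8 rejects most genuine ISO-8859-1 text, so success
        # here strongly suggests the row already held UTF-8 bytes.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never fails.
        return raw.decode("iso-8859-1"), "iso-8859-1"


def needs_review(text: str) -> bool:
    """Flag text containing C1 control characters (U+0080 to U+009F).

    These usually mean the bytes were really Windows-1252 (curly quotes
    and the like) and should be checked by hand.
    """
    return any("\u0080" <= ch <= "\u009f" for ch in text)


if __name__ == "__main__":
    samples = [b"plain text", "caf\u00e9".encode("utf-8"), b"caf\xe9", b"\x93quoted\x94"]
    for raw in samples:
        text, encoding = decode_legacy_post(raw)
        print(repr(text), encoding, "REVIEW" if needs_review(text) else "ok")
```

The flagged rows are where the "semi-automatic" part comes in: the script narrows the pile down, and someone still has to eyeball what is left.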


  • Discourse touched me in a no-no place

    @heterodox said:

    a standardized format for forum exports would actually be really cool

    But any format that would actually be capable of doing it (probably a horrendous XML or JSON variant) would be fucked up by the implementers, with the fuck-up usually centred around encoding problems. Because the people smart enough to understand just how incredibly careful you have to be with encodings usually tend to be also too smart to get suckered into writing forum software. (Encodings: look easy, aren't easy at all.)

    Filed under: YABITC (Yet Another Bad Ideas Thread Contribution)…



  • @dkf said:

    (probably a horrendous XML or JSON variant)

    How about some combination of XML, JSON, base64, and "the data will be encoded in the same way as Word 95."


  • Discourse touched me in a no-no place

    @ben_lubar said:

    How about some combination of XML, JSON, base64, and "the data will be encoded in the same way as Word 95."

    And then emailed round as a PDF attachment.



  • @dkf said:

    And then emailed round as a PDF attachment.

    Wait a minute; where's the wooden table? The wooden table is missing.


  • BINNED

    @HardwareGeek said:

    Wait a minute; where's the wooden table? The wooden table is missing.

    Open the page on a tablet (preferably using a shitty app that's just a mutilated browser hardcoded to open a single URL), take a picture of it on a wooden table, print the picture, and then scan it into a Word 95 document using whatever crappy OCR software came with your cheap-o scanner.

    That should cover it all I think.



  • Then run it through an ASCII art converter and have someone telegraph it too.


  • Discourse touched me in a no-no place

    Needs more avian data carriers. Or maybe lots of string so they can carry the wooden table.

