Discourse to NNTP gateway



  • I've started working on a Discourse to NNTP gateway.

    I'm writing this thing in Perl, because that's what I know best and since it's the glue language of the Internet, it must be good enough to glue Discourse to a news client 😄

    I'll upload the code to GitHub later; right now it's functional to the point that I can connect to Discourse.
    The NNTP frontend is not even there yet.

    For extra WTF value, it doesn't implement its own way of communicating with Discourse, but rather uses one of the APIs created by other people on this forum to talk to Discourse.
    All these APIs are written in other languages, which just adds to the fun.

    These are the APIs that I'd like to support as backends; more might be added:

    The end goal is to have an NNTP-server running locally that presents a Discourse instance as a news server with proper groups, threading, ...
    Most forum features can be mapped to equivalent functionality for NNTP (sometimes after applying a bit of "creativity").

    I'll be posting my progress here.
     


    Posted in meta, because I can't post to Programmer's Testing yet. Will move this topic once I've got the required group membership


  • @OffByOne said:

    Posted in meta, because I can't post to Programmer's Testing yet. Will move this topic once I've got the required group membership

    Paging @PJH.

    An NNTP gateway written in Perl and using C#/Node/Python for a back-end API. OMGWTF3 special award for losing the calendar, apparently.

    That being said, I was thinking of making an IRC server on top of Discourse...


  • Discourse touched me in a no-no place

    I read this as "NTP gateway" and thought it was a bad idea relying on Discourse for anything time related.

    Then I re-read it, and it made more sense. 😃


  • FoxDev

    i need to finish and lock down Sockbot core... i have the new version about 60% complete so should be released sometime this week.

    after that... no more rewrites! i swear!


  • FoxDev

    i read it that way at first too... i almost wish it were that. it would be way WTFier.



  • @Maciejasjmj said:

    Paging @PJH.

    I already PMmed him about it earlier today.

    @Maciejasjmj said:

    An NNTP gateway written in Perl and using C#/Node/Python for a back-end API. OMGWTF3 special award for losing the calendar, apparently.

    I want to do this first of all to see if I can. On top of that, I think that using a newsclient to access Discourse could actually be nice... Especially if you have multiple Discourse forums to follow.

    @Maciejasjmj said:

    That being said, I was thinking of making an IRC server on top of Discourse...

    Discourse IRC is that topic (although it has slowed down somewhat lately).



  • @accalia said:

    i need to finish and lock down Sockbot core... i have the new version about 60% complete so should be released sometime this week.

    Cool, I'll be following up on that 😄



  • @loopback0 said:

    I read this as "NTP gateway" and thought it was a bad idea relying on Discourse for anything time related.

    Sync your clocks to DiscoTime(tm)!

    I wonder what other techs could we abuse with Discourse.



  • @Maciejasjmj said:

    I wonder what other techs could we abuse with Discourse.

    IMAP! 😃


  • Discourse touched me in a no-no place

    @Maciejasjmj said:

    I wonder what other techs could we abuse with Discourse.

    FTP



  • @OffByOne said:

    IMAP!

    Too newfangled. Good old SMTP/POP3 server with Discourse as back-end storage is my bet.


    Filed under: IPoverDiscourse


  • FoxDev

    i'm basically rewriting it based on discoursistency and botcore to get extra functionality it has been missing and remove some of the more glaring and unmaintainable WTFs, but in order to do that the API needed to change a bit. after this change i'll document and lock the API.



  • @accalia said:

    i'm basically rewriting it based on discoursistency and botcore to get extra functionality it has been missing and remove some of the more glaring and unmaintainable WTFs, but in order to do that the API needed to change a bit. after this change i'll document and lock the API.

    Gorgeous! Do you plan on providing a way to access the API programmatically? For my little project, I need a way to communicate with Discourse (is that Disco-mmutinate?) in a completely non-interactive way.

    Separate question: does anyone know how to obtain a list of categories (and their resp. subcategories) from Discourse?


  • FoxDev

    well it is just a node.js module. there's nothing saying you have to run the full bot via sockbot.js

    :-D


  • Discourse touched me in a no-no place

    @OffByOne said:

    Separate question: does anyone know how to obtain a list of categories (and their resp. subcategories) from Discourse?

    This might have it: http://what.thedailywtf.com/categories.json



  • Sure :) I mean: are the "talking-to-Discourse" parts and the I-am-a-bot-and-must-respond-to-summons-BEEP-BEEP parts separated in the code?

    I'm not interested in the "intelligence" of your bots, I just need a conduit to Discourse.

    I've read over your code a few times, but my mental model of it is not yet complete :) I've spent a bit too much time in Perl, C#, Python and trying to call one from the other today ;)



  • @loopback0 said:

    This might have it: http://what.thedailywtf.com/categories.json

    Looks usable, thanks!

    Is the "featured users" thing a recent addition? I don't recall seeing specific users being put in the spotlights.


  • FoxDev

    abso-flipping-lutely!

    the summon code is in sock_modules/summon.js
    the dice rolling code is in sock_modules/dice.js
    the anonymizer code is in sock_modules/anonymize.js

    the parts you will be interested in are configuration.js, discourse.js, and message_bus.js (replaces current notifications.js)

    and you might not even be too interested in the message bus.

    now the core files do assume they are dealing with sock_modules for handling the stuff that happens on the message bus but the module interface is super simple to implement or you can skip message_bus altogether and just use configuration and discourse (TODO: figure out how to decouple configuration and discourse before next release)


  • Discourse touched me in a no-no place

    No idea. It tends to pass around a lot more information than the pages actually use, so it's probably been there a while.



  • @accalia said:

    the parts you will be interested in are configuration.js, discourse.js, and message_bus.js (replaces current notifications.js)

    and you might not even be too interested in the message bus.

    What does the message bus do exactly? I've seen all you bot developers talking about it, but I don't really grasp its function.

    @accalia said:

    now the core files do assume they are dealing with sock_modules for handling the stuff that happens on the message bus but the module interface is super simple to implement or you can skip message_bus altogether and just use configuration and discourse

    I can mock a DiscoNews sock_module in Perl and use that to interface with your core files, so that's not really a problem :)
    If that proves to be too difficult, I can implement a DiscoNews sock_module in node.js and use that as the connection point for my Perl code.
    That's how I handle the BotCore backend: I derive a DiscoNews class in Python, interface with that class in Perl and instantiate an object and call from Perl to Python to get Disco-events.



  • @loopback0 said:

    No idea. It tends to pass around a lot more information than the pages actually use, so it's probably been there a while.

    Found it. It's a list of all users that are featured (most recent posters) in the topics.

    ETA: it passes around a lot more information than the pages actually use, but also a lot less information than desirable. For categories with subcategories, it tells you the IDs of the subcategories, but it doesn't list them. Argh.
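
    A rough Python sketch of working out which subcategory IDs still need a separate lookup. The key names `id` and `subcategory_ids`, and the idea that the category entries arrive as a plain list of dicts, are assumptions for illustration and are not confirmed anywhere in this thread; the function name is made up.

```python
def missing_subcategory_ids(categories):
    """Return subcategory IDs that are referenced but not listed themselves."""
    # Assumption: each category is a dict with an "id" key and an optional
    # "subcategory_ids" list, as parsed from the categories JSON.
    listed = {category["id"] for category in categories}
    referenced = set()
    for category in categories:
        referenced.update(category.get("subcategory_ids", []))
    # Whatever is referenced but never listed needs its own request.
    return sorted(referenced - listed)
```

    Each ID it returns would then have to be fetched individually, if there turns out to be a way to look up a category by ID.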


  • Discourse touched me in a no-no place

    @OffByOne said:

    ETA: it passes around a lot more information than the pages actually use, but also a lot less information than desirable. For categories with subcategories, it tells you the IDs of the subcategories, but it doesn't list them. Argh.

    And there's no obvious way of looking up a category based on ID I've been able to find.


  • FoxDev

    message_bus does the hey you have new stuff happening stuff that discourse does. it's what we bots consume to stay so up to date on /t/1000 for example.



  • @loopback0 said:

    And there's no obvious way of looking up a category based on ID I've been able to find.

    I agree, not obvious at all. But:

    // models/category.js
    // class method
    
      reloadById: function(id) {
        return Discourse.ajax("/c/" + id + "/show.json").then(function (result) {
          return Discourse.Category.create(result.category);
        });
      }
    

    (seriously how would anyone guess it is /c/_id_/show.json)

    EDIT: HEY EVERYBODY IT SHOWS THE ACLs AT THAT URL
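
    For anyone not going through the Ember client, the same lookup can be sketched in Python. The URL pattern and the `category` key come from the snippet above; the function names and the use of an unauthenticated request are assumptions, and private categories will presumably need a logged-in session.

```python
import json
from urllib.request import urlopen


def category_show_url(base_url: str, category_id: int) -> str:
    """Build the /c/<id>/show.json URL for a numeric category ID."""
    return f"{base_url.rstrip('/')}/c/{int(category_id)}/show.json"


def fetch_category(base_url: str, category_id: int) -> dict:
    """Fetch one category; 'category' mirrors result.category in the JS above."""
    with urlopen(category_show_url(base_url, category_id)) as response:
        return json.load(response)["category"]
```

    Something like `fetch_category("http://what.thedailywtf.com", 28)` would then be the call for a single category.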



  • wtf

      def available_groups
        Group.order(:name).pluck(:name) - group_permissions.map{|g| g[:group_name]}
      end
    

    this is literally useless except to admins and admins can calculate that.

    wtf.

    what's the point of a SPA if you keep sending extra data


  • Discourse touched me in a no-no place

    @riking said:

    I agree, not obvious at all. But:

    Aha - I'd tried /c/id.json and /category/id.json but not that.

    @riking said:

    (seriously how would anyone guess it is /c/id/show.json)

    Quite. Discoursistency.


  • Discourse touched me in a no-no place

    @Maciejasjmj said:

    I wonder what other techs could we abuse with Discourse.

    SNMP.

    @loopback0 said:

    And there's no obvious way of looking up a category based on ID I've been able to find.

    Comprehensive list:

    discourse=# select id, name from categories order by id asc; 
     id |          name          
    ----+------------------------
      1 | uncategorized
      3 | Meta
      4 | Staff
      7 | Rubbish
      8 | Article
     10 | Side Bar WTF
     13 | General
     14 | Coding Help
     16 | Bug
     17 | The I-Hate-Oracle Club
     18 | CodeSOD
     19 | FAQs
     20 | Funny stuff
     21 | Coder Challenge
     23 | The Lounge
     24 | Turn Left
     25 | Error'd
     26 | One Post
     28 | Programmers' Testing
     29 | TBD
     30 | Bot Testing
    (21 rows)
    
    Time: 0.855 ms
    

    @riking said:

    EDIT: HEY EVERYBODY IT SHOWS THE ACLs AT THAT URL

    [pjh@sofa discourse]$ grep permission_type ./app/models/category_group.rb -B10 -A10
    class CategoryGroup < ActiveRecord::Base
      belongs_to :category
      belongs_to :group
    
      def self.permission_types
        @permission_types ||= Enum.new(:full, :create_post, :readonly)
      end
    [snip]
    

    Maps to:



  • @riking said:

    EDIT: HEY EVERYBODY IT SHOWS THE ACLs AT THAT URL

    It can also be used to retrieve information about private categories, which may not be intended behaviour...

    Filed under: Why does Private topics related to us hosting ubuntu even means?


  • Discourse touched me in a no-no place

    @PJH said:

    SNMP.

    XMPP and LDAP would be interesting candidates too. Because /t/1000 needs them!



  • Also check /site.json for site info. If you want to get a site's name and logo URLs, the problem is that, while every page has an embedded script tag where site settings get stored into the preloadstore, the actual site settings endpoint is admin-only.



  • @Buddy said:

    while every page has an embedded script tag where site settings get stored into the preloadstore, the actual site settings endpoint is admin-only.

    Mhm. I think I tried to fix that once...

    I suppose you could just parse the HTML :3



  • @riking said:

    I suppose you could just parse the HTML :3

    The bad ideas thread is over ▼ ◄ ▲ ► ▼ ◄ ▲ ► ▼ ◄ ▲ ► ▼ ◄ ▲ ► ▼ ◄ ▲ ▼ ◄ ▲ ► ▼ ◄ ▲ ► ▼Bag of Doritos™ brand chips ▲ ► ► ▼ ◄▼ ◄ ▲ ► ▼ ◄ ▲▼ there.



  • Well, if you wanted to make a client that could work with any version of discourse, you'd have to...



  • is there a nacho cheese/cool ranch poll?



  • protip for implementers: You want to listen on the /asset-change channel. If a message comes through that is NOT a SHA-1 hash (e.g. the tuple (global, clobber)), then a site setting changed and you should re-load them.

    If a message comes through that is a SHA-1 hash - e.g. (global, be439abef26d53d5e68474aecdc0d916) - then the site updated.
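
    A small Python sketch of that check, operating on the second element of the tuple. The example digest above is 32 hex digits long, so this treats any all-hex string of 32 or 40 characters as a hash; that length rule, the function name, and the return labels are assumptions.

```python
import re

# Assumption: a digest is an all-hex string of 32 or 40 characters.
_HEX_DIGEST = re.compile(r"[0-9a-f]{32}|[0-9a-f]{40}", re.IGNORECASE)


def classify_asset_change(payload: str) -> str:
    """Return 'site_updated' for a digest, 'settings_changed' otherwise."""
    if _HEX_DIGEST.fullmatch(payload):
        return "site_updated"
    # Anything else (e.g. "clobber") means the site settings should be reloaded.
    return "settings_changed"
```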



  • Success!!

    I've got a primitive NNTP interface to Dicsource working. Stuff that doesn't work yet:

    • Posting
    • The message bodies look like crap (I need to give them a proper Content-Type, they are rendered as text right now)
    • The topic→newsgroup name mapping could use some more thinking
    • The thing isn't Unicode-tested at all
    • Rate limiting: I didn't get errors for retrieving too fast yet, but I should code some limiting
    • Retrieving the posts in big topics gives HTTPError: 414 Client Error: Request-URI Too Large. I'll need to spread the queries over multiple smaller requests
    • Need to reformat Dicsource provided dates to dates that conform to the format expected by newsclients
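
    For the date item, a minimal Python sketch of the conversion. It assumes the timestamps arrive as ISO 8601 in UTC with fractional seconds (e.g. `2014-11-03T12:34:56.789Z`), which is not confirmed in this thread; the function name is made up.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_news_date(iso_timestamp: str) -> str:
    """Convert an ISO 8601 UTC timestamp into a mail/news style Date value."""
    # Assumption: input looks like "2014-11-03T12:34:56.789Z".
    parsed = datetime.strptime(iso_timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    # Attach UTC explicitly so the formatted value carries a "+0000" zone.
    return format_datetime(parsed.replace(tzinfo=timezone.utc))
```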

    Currently I'm only using @mott555's BotCore as a backend. I'd like to add @accalia's sockbot and @Maciejasjmj's Discoursistency too, as well as write a pure Perl interface to Dicsource.

    Have a screenshot of Pan showing the Likes thread:


  • FoxDev

    i'm almost ready to post improved code... needs more testing.



  • @OffByOne said:

    - The thing isn't Unicode-tested at all

    Oh boy...

    @OffByOne said:

    Retrieving the posts in big topics gives HTTPError: 414 Client Error: Request-URI Too Large. I'll need to spread the queries over multiple smaller requests

    Yeah, that's a lot of fun. For reference, post_ids[]=nnnnnn is 17 characters long, so 100 posts per request puts you on the rather safe side (I don't remember all the configs, but I think 2000 characters is the lowest limit I've encountered).
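
    The chunking could look roughly like this in Python. Only the `post_ids[]=` parameter and the 100-per-request figure come from this thread; the function name is hypothetical, and the endpoint the query string gets appended to is deliberately left out.

```python
def post_id_queries(post_ids, chunk_size=100):
    """Yield one query string per chunk of at most chunk_size post IDs."""
    for start in range(0, len(post_ids), chunk_size):
        chunk = post_ids[start:start + chunk_size]
        # Brackets are kept literal so each entry stays at 17 characters
        # for six-digit IDs, plus one "&" between entries.
        yield "&".join(f"post_ids[]={post_id}" for post_id in chunk)
```

    With six-digit IDs a full chunk of 100 comes to 1799 characters of query string, which stays under the 2000 mentioned above.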

    @OffByOne said:

    - Rate limiting: I didn't get errors for retrieving too fast yet, but I should code some limiting

    You can check mine somewhere in the HTTP project (it's a burst limiter, but it shouldn't be too hard to adapt it to once-per-n-ms limiting).



  • @Maciejasjmj said:

    @OffByOne said:
    The thing isn't Unicode-tested at all

    Oh boy...

    It's not as bad as it sounds. You can't see it in my screenshot, but there are some posts by users who have a Unicode RTL character in their long names. It gets handled well by the client, actually.

    I need to do some conversions to be RFC compliant though.

    @Maciejasjmj said:

    @OffByOne said:
    HTTPError: 414 Client Error: Request-URI Too Large.

    100 posts per request puts you on the rather safe side.

    Thanks. I was going to figure it out by trial and error, but you telling me is a lot easier 😄

    @Maciejasjmj said:

    @OffByOne said:
    - Rate limiting: I didn't get errors for retrieving too fast yet, but I should code some limiting

    You can check mine somewhere in the HTTP project (it's a burst limiter, but it shouldn't be too hard to adapt it to once-per-n-ms limiting).

    I use 1 function to perform the actual GET requests. Keeping a timestamp of when the last request has happened and making sure I wait at least X ms before making the next request is rather trivial.
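
    In Python, the timestamp approach might look like the sketch below. The class name and the injectable `clock` and `sleep` parameters are made up for illustration; they only exist so the limiter can be exercised without really sleeping.

```python
import time


class MinIntervalLimiter:
    """Block until at least min_interval_s has passed since the last call."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

    The single GET function would call `wait()` right before every request.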

    There are some other things I want to implement too. Proper threading, generating a text/plain version to stick in the post body (so people who use slrn have a readable version), auto-fetching images in posts and wrangling the post body so they show up correctly, add styles to the HTML, so posts don't look like crap, ...

    When I'm satisfied with my NNTP gateway, I wouldn't mind implementing an IMAP/SMTP frontend too ;)
    That would make Dicsource on mobile at least fun to use.


  • FoxDev

    @Maciejasjmj said:

    Yeah, that's a lot of fun. For reference, post_ids[]=nnnnnn is 17 characters long, so 100 posts per request puts you on the rather safe side (I don't remember all the configs, but I think 2000 characters is the lowest limit I've encountered).

    Sockbot uses 200 post chunks. haven't run across issues yet.



  • @accalia said:

    Sockbot uses 200 post chunks. haven't run across issues yet.

    Thanks. I'm testing with 100 post chunks now. It works, but is rather slow ;)


  • BINNED

    @OffByOne said:

    There are some other things I want to implement too. Proper threading...

    Doing It Very Wrong!

    If they weren't as efficient at deleting our posts over on meta.d I'd suggest spamming it with screenshots of a proper tree structure once it's working. The way our discussions here work, Jeff's hair would probably stand on its end if he saw it.

    Actually, just realized something... any ideas on how you'll handle thread splitting and/or crosslinks? I didn't use NNTP much so I don't know if there's anything fancy that can be done...



  • @Onyx said:

    Doing It Very Wrong!

    If they weren't as efficient at deleting our posts over on meta.d I'd suggest spamming it with screenshots of a proper tree structure once it's working. The way our discussions here work, Jeff's hair would probably stand on its end if he saw it.

    Frankly my dear, I don't give a damn...

    I started the NNTP gateway just to see if I could do it (and a little bit to spite Jeff). Now that I've been playing with it, I find it a way of interfacing with Discourse that's actually quite nice.

    @Onyx said:

    Actually, just realized something... any ideas on how you'll handle thread splitting and/or crosslinks? I didn't use NNTP much so I don't know if there's anything fancy that can be done...

    Cross-posting (posting the same message to multiple newsgroups) is very well handled by NNTP 😄

    There are some things that don't map as naturally, for example private messages or starting a new topic. I'll think of a way as I go, I guess.
    I'm also open to suggestions.

    I'll throw the thing on github when I've done some more testing and have been able to clean up the code a bit.



  • @Onyx said:

    If they weren't as efficient at deleting our posts over on meta.d I'd suggest spamming it with screenshots of a proper tree structure once it's working.

    I have it somewhat working. There are still some issues with correct References header generation (those are what mail and usenet clients use to determine how the message tree is connected), but the basics are there.

    Screenshot of Pan, your newsclient may look nicer of course ;)


  • Discourse touched me in a no-no place

    @Onyx said:

    The way our discussions here work, Jeff's hair would probably stand on its end if he saw it.

    Oh, please do that! That dipshit is the one who said normal people don't NNTP.



  • +1 for showing the effects caused by RLO


  • BINNED

    I just now noticed that you're not showing only the long name, but the title badge too in the sender info.

    +<insert large number here> for that one



  • @Onyx said:

    I just now noticed that you're not showing only the long name, but the title badge too in the sender info.

    You triggered that: at first I only showed the long name, but names like yours only make sense together with the title badge 😄

    I'm thinking whether to wrap the badge name in [] to make it stand out that it's not really part of the name, but I'm not sure whether it would be an improvement.



  • I've fixed threading 😃
    Hate @ Dicsource for allowing a post to be in reply to itself (causing an infinite loop when walking the "parent" posts).
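
    A Python sketch of walking the parents with a guard against that loop. `parent_of` is a hypothetical dict mapping a post number to the post number it replies to (or `None`); the actual gateway is in Perl and may well do this differently.

```python
def ancestor_chain(post_number, parent_of):
    """Return the ancestors of a post, oldest first, stopping on any cycle."""
    chain = []
    seen = {post_number}  # a post replying to itself is caught right away
    current = parent_of.get(post_number)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    chain.reverse()
    return chain
```

    The resulting list, oldest ancestor first, is the order in which message IDs go into a References header.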

    Another screenshot (from Thunderbird this time, because it shows threading lines instead of just indentation):


  • BINNED

    I am now seriously considering using this, if nothing else on mobile. Is there a decent reader on Android?

    Well, at least until I get my "secret project" rolling... Muahahahaha!

