Foreign data wrappers and caching (PostgreSQL <=> Oracle)


  • BINNED

    I'm in the middle of planning for a project. One of the requirements is some minor data manipulation in their Oracle database. The database server is not in the same location as the server we're installing, so there's a chance that the connection between them might get lost due to raisins.

    They want some kind of backup solution if the link is temporarily down. We use Postgres for our application, so the most logical course of action is caching the data in a Postgres database if shit hits the fan.

    So, I'm looking into foreign data wrappers. I have used them once successfully (granted, between two Postgres databases, not Postgres and something else) so that seems to work fine. However, there are two questions I can't find a reliable answer for:

    • Is there an easy way to cache remote data locally if the link goes down? Specifically, the first part of the process is checking some user credentials. Caching the credentials even for extended periods of time is acceptable. If possible, I'd like Postgres to cache stuff and keep the application layer oblivious to any problems and just keep it chugging.
    • If there's some caching available, will the data sync up automatically once the link fixes itself or do I have to mess with it manually?

    Anyone have any experiences?
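
    For reference, something like this is the kind of setup I mean, assuming the oracle_fdw extension; all the names below (server, user mapping, schema, table, columns) are made up for illustration, not the real project objects:

    ```sql
    -- Hypothetical oracle_fdw setup; host, service name, credentials and
    -- object names are placeholders.
    CREATE EXTENSION oracle_fdw;

    CREATE SERVER their_oracle
        FOREIGN DATA WRAPPER oracle_fdw
        OPTIONS (dbserver '//oracle-host:1521/THEIRDB');

    CREATE USER MAPPING FOR app_user
        SERVER their_oracle
        OPTIONS (user 'APP', password 'secret');

    -- Their credentials table, exposed locally as a foreign table.
    CREATE FOREIGN TABLE remote_credentials (
        username text NOT NULL,
        pwd_hash text NOT NULL
    )
    SERVER their_oracle
    OPTIONS (schema 'APP', table 'CREDENTIALS');
    ```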



  • Use Oracle RAC? 🚎

    I don't think FDWs are anything more than a Postgres view into whatever remote database. EnterpriseDB might have something in Postgres Advanced Server that helps, but I can't recall anything like that off the top of my head.


  • BINNED

    Yeah, I figured. I asked anyway since I'm a forced "I get shit working eventually" database person. I really don't know all the intricacies; I learn as I go and, most likely, fuck up a lot as I do.

    Well, some combination of cron scripts + triggers should do the job; I was just trying not to reinvent the wheel if possible.
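
    Something like this, probably: a plain local table rebuilt by a script that cron runs nightly, reusing the made-up foreign table from my first post (credentials_cache here is a hypothetical local table with the same columns):

    ```sql
    -- Hypothetical nightly refresh, run from cron via something like
    --   psql -d appdb -f refresh_credentials.sql
    -- If the Oracle link is down, the INSERT fails, the transaction rolls
    -- back, and yesterday's cache stays intact.
    BEGIN;
    TRUNCATE credentials_cache;
    INSERT INTO credentials_cache (username, pwd_hash)
    SELECT username, pwd_hash FROM remote_credentials;
    COMMIT;
    ```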

    BTW, about the 🚎 part: I don't intend to touch anything Oracle on my side, and I don't intend to ask them to install anything extra. Even if it were serious, I'd reject that suggestion; it's just not worth the pain.



  • @Onyx said:

    Is there an easy way to cache remote data locally if the link goes down? Specifically, the first part of the process is checking some user credentials. Caching the credentials even for extended periods of time is acceptable. If possible, I'd like Postgres to cache stuff and keep the application layer oblivious to any problems and just keep it chugging.

    Use a materialized view or a local table + triggers

    I'd say that "transparent" caching by the FDW itself would be a bit evil if it existed: it hides the implementation details and caveats from you, so when it fails, you suddenly realize you're not knee-deep but shoulder-deep in shit, because all of a sudden you need to figure out how exactly it works.
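
    The materialized view version is about this much work (same placeholder names as in the sketches above, as an alternative to the plain table):

    ```sql
    -- Hypothetical: the local cache as a materialized view over the
    -- foreign table instead of a plain table.
    CREATE MATERIALIZED VIEW credentials_cache AS
        SELECT username, pwd_hash FROM remote_credentials;

    -- A unique index lets you refresh without locking out readers.
    CREATE UNIQUE INDEX ON credentials_cache (username);

    -- Run periodically (cron or whatever); if the Oracle link is down, the
    -- refresh just errors out and the old contents stay readable.
    REFRESH MATERIALIZED VIEW CONCURRENTLY credentials_cache;
    ```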


  • BINNED

    @wft said:

    Use a materialized view or a local table + triggers

    Yeah, that was the backup plan. I was just trying to be lazier 😛

    @wft said:

    It hides the implementation details and caveats from you, so when it fails, you suddenly realize you're not knee-deep but shoulder-deep in shit, because all of a sudden you need to figure out how exactly it works.

    Eh, I'd assume it would be better written and less prone to failure than something I rolled myself. And as long as I have the data somewhere, it wouldn't be that big of a deal in this specific case: if something did fail, having it lag a bit until I fix it wouldn't be the end of the world. If it had to be real-time I'd be more concerned about it.



    Doesn't matter, as long as you know more or less completely how it works. It's just that I lost faith in good documentation a while ago.



    I used to solve this by having the application write the data out to a quick-and-dirty text file if the DB insert failed; a scheduled task (well, actually a C# Timer) then attempted every hour to insert the text-file records (if any) into the DB. Quick and dirty and WTF-y, but it worked pretty well.



  • My instinct would be to just have two distinct operations:

    1. Cache data in your local PostgreSQL (maybe a queue-like structure)
    2. Execute writes in the remote Oracle DB, emptying your local "queue" in a transaction (rough sketch at the end of this post)

    Advantages:

    1. Simple, understandable code
    2. Reliable in case of remote failures
    3. No fancy native modules or other dubious solutions; the database is for data, not a service layer

    Disadvantages:

    1. Need more coding to get it done
    2. If you need both reads and writes in the same operation, complexity goes through the roof.
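
    Rough sketch of what I mean, reusing the foreign-server setup from the first post; the outbox table and the writable foreign table (remote_call_log) are made-up names:

    ```sql
    -- Hypothetical local queue ("outbox") of writes destined for Oracle.
    CREATE TABLE oracle_outbox (
        id        bigserial PRIMARY KEY,
        username  text NOT NULL,
        payload   jsonb NOT NULL,
        queued_at timestamptz NOT NULL DEFAULT now()
    );

    -- Flush: move queued rows into the foreign table in one statement.
    -- If the Oracle link is down, the statement fails as a whole and the
    -- queue is left untouched for the next attempt.
    WITH pending AS (
        DELETE FROM oracle_outbox RETURNING username, payload
    )
    INSERT INTO remote_call_log (username, payload)
    SELECT username, payload::text FROM pending;
    ```

    One caveat: the local delete and the remote insert only commit together to the extent the FDW's transaction handling guarantees it; it's not true two-phase commit.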

  • BINNED

    @cartman82 said:

    My instinct would be to just have two distinct operations:

    1. Cache data in your local PostgreSQL (maybe a queue-like structure)
    2. Execute writes in the remote Oracle DB, emptying your local "queue" in a transaction

    Current plan, depending on the answer we get on the whole cache thing (it's not confirmed yet whether they just want an "unavailable" message instead, but I wanted to have a plan either way):

    1. Have a local cache of the required auth data, updated nightly by a batch job just after their nightly batch jobs finish on the Oracle instance.
    2. Handle all the auth stuff and writes locally in Postgres.
    3. When the call is done (yes, it's a phone thing) and there's no risk of something going wrong and crashing the active channel, kick off a script that checks for any un-synced data: either all rows in a temp table, or rows carrying a flag (in case we decide to keep the local cache longer).
    4. The script then tries to sync that data with Oracle; failing that, it leaves the data be and lets the next sync attempt pick it up (rough sketch after this list).
    5. Just in case, have a cron job attempt a final sync before their nightly kicks off, in case there's something still left in our DB alone.
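
    Roughly, for the flag variant (table and column names made up; remote_call_log is the hypothetical writable foreign table from cartman82's sketch above):

    ```sql
    -- Hypothetical local table of call results, with a sync flag.
    CREATE TABLE call_results (
        id         bigserial PRIMARY KEY,
        username   text NOT NULL,
        result     jsonb NOT NULL,
        synced     boolean NOT NULL DEFAULT false,
        created_at timestamptz NOT NULL DEFAULT now()
    );

    -- Sync attempt (per call, and again from the nightly cron job): flag the
    -- un-synced rows and push exactly those to Oracle. If the link is down,
    -- the whole statement rolls back and the rows stay un-synced.
    WITH flagged AS (
        UPDATE call_results
           SET synced = true
         WHERE NOT synced
        RETURNING username, result
    )
    INSERT INTO remote_call_log (username, payload)
    SELECT username, result::text FROM flagged;
    ```

    The step-5 cron job can just run the same statement via psql right before their nightly window.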

    @cartman82 said:

    If you need both reads and writes in the same operation, complexity goes through the roof.

    Yup, trying to avoid that, hence the above. The main reason I'd keep new entries in our DB longer is that if they get any bright ideas about reading the old state back to users at some point, I won't have to worry about that shit at all.

