The DB can't keep up



  • We have a back-end routine that runs numerous queries in parallel to speed up accessing all the data required for a given transaction.

    Each spawned query looks like this:

       void doQuery() { // pseudo code, try/catch/finally stripped out
    String sql = "select ... ";
    PreparedStatement ps = con.prepareStatement(sql);
    ResultSet rs = ps.executeQuery();
    while (rs.next()) {
    // grab and store data
    }
    // close rs, ps, return con to the pool
    try { Thread.sleep(1125); } catch (InterruptedException ie) { }
    }

    I then discovered that the number of threads in the ExecutorService is way less (5) than the number of queries (about 30) that need to be run in parallel. Why? Because the DB can't keep up with all those queries! Also, we need to sleep for a bit after the query to give the db a chance to free resources.

    Wha...?

    It turns out that there are four instances of the application that query this service, and the DB simply can't handle 120 parallel queries. Mind you this is a fairly huge Oracle RAC system with many GB of ram and 128 cores. However, the DBAs only have 512MB of scratch space allocated. For all users. For the entire DB. When you're querying tables with 3+B rows, and need subqueries, etc., that scratch space isn't going to get you very far.

    When I suggested that they increase the amount of allocated scratch space, the DBA responded that 512MB was enough for the development environment so it should be enough for the production environment, and that we should change our application to accommodate the limited scratch space - which is what the original developers did. (A comment in the code might have been helpful).

    My boss can't do anything about it because the DBAs live in their own world and aren't accountable to anyone.




  • AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGH!

    Just what the heck is scratch space anyway? I've heard of undo logs, table space, SGA, archive logging space and probably a few more, but I've never heard of scratch space.



  •  scratch space == temp space

     



  • @snoofle said:

    the DBA responded that 512MB was enough for the development environment so it should be enough for the production environment
    So, after you explained to him that it's obviously not the case and that he needs to add more, he still refused?  Did you ask him why they limit it to 512MB, other than the broken logic of "if it works fine where there are six concurrent users and 200k rows of data, it should be fine for 300 concurrent users with billions of records"?  And I mean a real reason, not because he got too fat this week to be able to reach his keyboard...

    You mentioned that they're not accountable to anybody?  Who are the stakeholders, clients, etc.?  What I'm asking is, it eventually comes down to money or someones budget.  Get those people involved and say that the bottle-neck isn't caused by application code.  Instead you have the BOFH reborn as a DBA giving you a hard time.



  • @C-Octothorpe: We did talk to the stake holders. I even drew them pretty pictures so they could physically see what I was talking about (basically what you described only visually). 

    Unfortunately, they are the ones who hired the DBAs as their technical experts, and their technical experts would never mislead them; it must be us developers who don't know how to write code properly.

    I learned long ago not to argue with idiots. If they want to spend a $LOT on new servers rather than just fix configurations, who am I to argue.

    I make recommendations once. If they listen, I'll work with them. If not, it's their nickel.



  • @erikal said:

    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGH!

    Just what the heck is scratch space anyway? I've heard of undo logs, table space, SGA, archive logging space and probably a few more, but I've never heard of scratch space.

    it's that big thing you sit on that just seems to itch at the wrong times so you just have to scratch it and hope no one notices.  That my friend is your scratch space.


  • BINNED

    @snoofle said:

    @C-Octothorpe: We did talk to the stake holders. I even drew them pretty pictures so they could physically see what I was talking about (basically what you described only visually). 

    Unfortunately, they are the ones who hired the DBAs as their technical experts, and their technical experts would never mislead them; it must be us developers who don't know how to write code properly.

    I learned long ago not to argue with idiots. If they want to spend a $LOT on new servers rather than just fix configurations, who am I to argue.

    I make recommendations once. If they listen, I'll work with them. If not, it's their nickel.

    The only problem with that is they'll need someone to blame when the application is still slow even after spending megabucks on new servers, since a sufficiently incompetent DBA can cause infinite resource usage. Since they're in denial about their DBAs being criminally incompetent, who does that leave?



  • So how many awards have these DBAs received for just being awesome? After all they are the experts in their field and know everything there is to know about databases and scalability and resource allocations, and production vs. development environments, etc.

    What kind of bozos are these that fail at basic understanding of the difference in load between development and production? While there are many services where the load in development will be very similar to the load in production, there are also many services (i.e. anything which is externally-facing, or even internally-facing if you have a large organization and the service is to be used by a large portion of the organization as part of their regular duties) where the load in production is many times (orders of magnitude?) bigger than in development.
    Maybe you need to simulate more users in development (automated user processes?) to emulate the production load and start complaining about needing more resources for both (and maybe even win extra resources in the process because of the extra resources required to run the automated users)?



  • @PedanticCurmudgeon: agreed. However, since I've already managed to get a 90+% improvement in performance (this was over and above that), I doubt anyone's going to point the finger at me.  And if they do, I can prove it's the db. And if they still won't listen, I get forced to go out and get a better job, and they get to live with their DBAs.

    I'm not going to lose any sleep over it.

     

     


  • ♿ (Parody)

    @snoofle said:

    However, since I've already managed to get a 90+% improvement in performance (this was over and above that), I doubt anyone's going to point the finger at me.

    It's amazing what you can find when you start looking at what's going on vs what needs to be going on. Though an alternative approach would be to figure out a way to make the 512MB scratch space is insufficient in your development environment, too.



  • @boomzilla said:

    @snoofle said:
    However, since I've already managed to get a 90+% improvement in performance (this was over and above that), I doubt anyone's going to point the finger at me.
    It's amazing what you can find when you start looking at what's going on vs what needs to be going on. Though an alternative approach would be to figure out a way to make the 512MB scratch space is insufficient in your development environment, too.
    True enough.  I'm assuming you would have run some load/stress tests in dev before pushing to production, so wouldn't the original devs have hit the 512MB ceiling fairly early on?



  • @C-Octothorpe said:

    @boomzilla said:

    @snoofle said:
    However, since I've already managed to get a 90+% improvement in performance (this was over and above that), I doubt anyone's going to point the finger at me.
    It's amazing what you can find when you start looking at what's going on vs what needs to be going on. Though an alternative approach would be to figure out a way to make the 512MB scratch space is insufficient in your development environment, too.
    True enough.  I'm assuming you would have run some load/stress tests in dev before pushing to production, so wouldn't the original devs have hit the 512MB ceiling fairly early on?

    I did run some stress tests of my own, and blew the temp space limits. The DBAs argued that it's always been this amount, since the department was formed. I countered that there were only 3 developers back then and there are over 50 now, and it's likely that with 17 times the folks doing 17 times the work we were likely to need significantly more resources. No, make do with what is there!

    So instead of:

    update theTable set colName = value where ...

    I have a procedure that updates 10K rows at a time, committing after each iteration. It takes forever, but nobody cares.

    If I care, I get upset. If I don't care, I'm free. Free. Free!

    Guess what I chose.

     



  • @snoofle said:

    @C-Octothorpe: We did talk to the stake holders. I even drew them pretty pictures so they could physically see what I was talking about (basically what you described only visually). 

    Unfortunately, they are the ones who hired the DBAs as their technical experts, and their technical experts would never mislead them; it must be us developers who don't know how to write code properly.

    How about calling a Capacity Planning meeting?

    Get stakeholders and a few DBA "experts" and - given a number of concurrent connections and estimated queries per user/customer - you request the DBAs calculate capacity requirements for the temp tablespace as well as limitations placed upon AUTOEXTEND?

    Simply throw the problem at them and either watch them squirm when they can't produce figures (for what is a simple DBA task) or come up with some figure then explain to the other stakeholders the effects of exceeding that DBA-imposed limitation, and how no amount of application-level programming will get around the issue of capacity-related incidents arising due to insufficient allocation at the DBA end.

    For bonus points, you could also advise the stakeholders to get a "quicker browser" since the bottleneck does not appear to be at the DB end...



  • @Cassidy said:

    @snoofle said:

    @C-Octothorpe: We did talk to the stake holders. I even drew them pretty pictures so they could physically see what I was talking about (basically what you described only visually). 

    Unfortunately, they are the ones who hired the DBAs as their technical experts, and their technical experts would never mislead them; it must be us developers who don't know how to write code properly.

    How about calling a Capacity Planning meeting?

    Get stakeholders and a few DBA "experts" and - given a number of concurrent connections and estimated queries per user/customer - you request the DBAs calculate capacity requirements for the temp tablespace as well as limitations placed upon AUTOEXTEND?

    Simply throw the problem at them and either watch them squirm when they can't produce figures (for what is a simple DBA task) or come up with some figure then explain to the other stakeholders the effects of exceeding that DBA-imposed limitation, and how no amount of application-level programming will get around the issue of capacity-related incidents arising due to insufficient allocation at the DBA end.

    For bonus points, you could also advise the stakeholders to get a "quicker browser" since the bottleneck does not appear to be at the DB end...

    I completely agree with you, however it seems more and more obvious to me that he is faced with a stone-cold wall of ignorance.  They don't care, they don't want to care, and no amount of reasoning will make them care.  It's really up to the people who hired them to rein them in, but they sound like a bunch of spineless fools who never question the "experts"...  I am gobsmacked at the apathy and feebleness his management continually displays and hostility from other teams, who incidentally all work AT THE SAME FUCKING COMPANY.



  • @C-Octothorpe said:

    I completely agree with you, however it seems more and more obvious to me that he is faced with a stone-cold wall of ignorance.  They don't care, they don't want to care, and no amount of reasoning will make them care.  It's really up to the people who hired them to rein them in, but they sound like a bunch of spineless fools who never question the "experts"...  I am gobsmacked at the apathy and feebleness his management continually displays and hostility from other teams, who incidentally all work AT THE SAME FUCKING COMPANY.

    Welcome to Enterprise, we've got fun and games..



  • @morbiuswilters said:

    @C-Octothorpe said:
    I completely agree with you, however it seems more and more obvious to me that he is faced with a stone-cold wall of ignorance.  They don't care, they don't want to care, and no amount of reasoning will make them care.  It's really up to the people who hired them to rein them in, but they sound like a bunch of spineless fools who never question the "experts"...  I am gobsmacked at the apathy and feebleness his management continually displays and hostility from other teams, who incidentally all work AT THE SAME FUCKING COMPANY.
    Welcome to Enterprise, we've got fun and games..
    Oh yeah, I'm actually leaving a very large, multinational bank soon.  Not because of the crippling red tape, layers upon layers of management and in-fighting though...

    I certainly won't miss those things either...



  • @morbiuswilters said:

    Welcome to Enterprise, we've got fun and games..

    Yay! I love Star Trek!



  • @snoofle said:

    @PedanticCurmudgeon: agreed. However, since I've already managed to get a 90+% improvement in performance (this was over and above that), I doubt anyone's going to point the finger at me.  And if they do, I can prove it's the db. And if they still won't listen, I get forced to go out and get a better job, and they get to live with their DBAs.

    I'm not going to lose any sleep over it.

     

    You're a better man than I am, snoofle din.

     



  • @KattMan said:

    @erikal said:

    Just what the heck is scratch space anyway? I've heard of undo logs, table space, SGA, archive logging space and probably a few more, but I've never heard of scratch space.

    it's that big thing you sit on that just seems to itch at the wrong times so you just have to scratch it and hope no one notices.  That my friend is your scratch space.

     

    Ah, yes I feel stupid for not realizing that earlier. *scratch scratch*.

     



  • Is there any way to create a task for the DBA team that would force them to use the 512 MB of scratch space?  If they are forced to suffer from their own incompetence they might be inclined to change it.  The only other idea I have is go the path of "Sometimes it is easier to ask for forgiveness rather than permission" and change the scratch space yourself (assuming of course you just happen know to the system login for the database).



  • @zelmak said:

    @morbiuswilters said:

    Welcome to Enterprise, we've got fun and games..

    Yay! I love Star Trek!

    Fuck Star Trek, I keep it jungle.


  • I can't help but think that a web service (or some other single point of entry) and caching could go far in helping this situation.



  • @Anketam said:

    change the scratch space yourself (assuming of course you just happen know to the system login for the database).
     

    Given the incompetence of these DBAs already demonstrated, I'd give even odds on "Scott / tiger" doing the trick.



  • @KattMan said:

    @erikal said:
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGH!

    Just what the heck is scratch space anyway? I've heard of undo logs, table space, SGA, archive logging space and probably a few more, but I've never heard of scratch space.

    it's that big thing you sit on that just seems to itch at the wrong times so you just have to scratch it and hope no one notices.  That my friend is your scratch space.

    I thought it was that spot between your shoulders that itches and you can never *quite* reach it right...

     



  • The only other idea I had for fixing it, which I will admit is bad, is to allow a huge issue to get into production that scares all the executives to death and they start pushing we need this fixed ASAP.  You then tell the leadership that you can fix the production issue, but you will need a blank check level permission to do whatever is needed to fix the issue no questions asked.  At that point you now have the power to screw the DBAs and fix not only the production issue, but this issue while trying to fix the bigger issue.  Once you change it you then claim it is too risky to change it back and attribute it to that the huge issue might come back.  At which point executive paranoia and fear will do the rest.

    "Fear will keep the local systems in line. Fear of this battle station." ―Grand Moff



  • @Anketam said:

    Is there any way to create a task for the DBA team that would force them to use the 512 MB of temporary tablespace?

    Going pedantic dickweed on your arse for a moment, but if this is what snoofle meant by "scratch space" then Oracle permits several temporary tablespaces, so different users can use different ones (i.e. someone that runs big queries for large reports can use a separate temp TS so they don't impact on normal BAU users.

    In snoofs' case, I would create another temp TS and assign this to the application account, comparing transaction speeds before and after, reporting this impact in graphical format to the stakeholders (ooh! pretty pictures!). However, I know that snoofle doesn't have SYSDBA-level access to the testbed, let alone production, so he's SOL.



  • @C-Octothorpe said:

    @Cassidy said:

    @snoofle said:

    @C-Octothorpe: We did talk to the stake holders...

    How about calling a Capacity Planning meeting?

    ... he is faced with a stone-cold wall of ignorance.  They don't care, they don't want to care, and no amount of reasoning will make them care.  It's really up to the people who hired them to rein them in, but they are a bunch of spineless fools who never question the "experts"... 

    The users don't do capacity planning here. It's usually: Let's just start off with what we have, and if it grows, we'll buy hardware later. No amount of logic will work against this wall of stupidity.

    Instead, I am befriending some of the auditors of our new enterprise overlords. I figure that if I let it slip that there is this gaping hole between reality and what is needed, and no plans in place, that THEY will force the issue; they have the authority and power to mandate it to the C**'s who can make change.

     

     



  • @snoofle said:

    @C-Octothorpe said:

    @Cassidy said:

    @snoofle said:

    @C-Octothorpe: We did talk to the stake holders...

    How about calling a Capacity Planning meeting?

    ... he is faced with a stone-cold wall of ignorance.  They don't care, they don't want to care, and no amount of reasoning will make them care.  It's really up to the people who hired them to rein them in, but they are a bunch of spineless fools who never question the "experts"... 

    The users don't do capacity planning here. It's usually: Let's just start off with what we have, and if it grows, we'll buy hardware later. No amount of logic will work against this wall of stupidity.

    Instead, I am befriending some of the auditors of our new enterprise overlords. I figure that if I let it slip that there is this gaping hole between reality and what is needed, and no plans in place, that THEY will force the issue; they have the authority and power to mandate it to the C**'s who can make change.

    I would do such a thing only to maintain my sanity, a steady income and make day-to-day activities easier for me and much less (read none) for the good of the company...  Does that make me selfish or just a realist/pragmatist?



  • @Mason Wheeler said:

    @KattMan said:

    @erikal said:
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGH!

    Just what the heck is scratch space anyway? I've heard of undo logs, table space, SGA, archive logging space and probably a few more, but I've never heard of scratch space.

    it's that big thing you sit on that just seems to itch at the wrong times so you just have to scratch it and hope no one notices.  That my friend is your scratch space.

    I thought it was that spot between your shoulders that itches and you can never *quite* reach it right...

     

     

    No, KattMan is right.  Go back and look at the OP, who starts right off saying this is for a "back-end routine".

     



  • @snoofle said:

    The users don't do capacity planning here.

    yeah... that much is obvious from some of your previous posts. Doesn't stop you from introducing the concept as a concern. You'll probably get the "Stop this FUD, you scaremongering git!" response form many, but if that meeting is logged and audited, you'll go from pessamistic emo to informed soothsayer.

    @snoofle said:

    It's usually: Let's just start off with what we have, and if it grows, we'll buy hardware later. No amount of logic will work against this wall of stupidity.

    I suppose an alternative approach is to passive-aggressively follow their daft policy: fill in purchase orders for more H/W rather than optimise the bits n bytes running on the existing metal, then question why things aren't much faster.

    Either way... keep us posted!



  • @Cassidy said:

    @snoofle said:

    The users don't do capacity planning here.

    yeah... that much is obvious from some of your previous posts. Doesn't stop you from introducing the concept as a concern. You'll probably get the "Stop this FUD, you scaremongering git!" response form many, but if that meeting is logged and audited, you'll go from pessamistic emo to saboteur.

    FTFY.  At least, based on the mentality it sounds like snoofle has to deal with.



  • @Anketam said:

    "Fear will keep the local systems in line. Fear of this battle station." ―Grand Moff

    The force is wrong with this one...


Log in to reply