It's Time To Get Over That Stored Procedure Aversion You Have



  • @Jaime said:

    If you use a sproc to keep the data in sync, then you have to concern yourself with validating every code path that interacts with the hourly data.

    It's easier than you think to police calling sprocs.

    Well, one of the reasons is that we have a few tasks that occur on every update of a table. So every table has a write/update/view set.



  • @xaade said:

    It's easier than you think to police calling sprocs.

    Is it easier than the zero effort it takes to police trigger usage? If not, then your statement is a form of "It's good because it doesn't hurt that much."



  • @RaceProUK said:

    Sometimes you need to, to get the performance. It's shit, I know, but sometimes you have no other choice.

    I've never seen anybody who made that claim actually prove it via data.

    I'm not saying it's wrong. But it's definitely in my, "hmmmmmmmmm" file.


  • FoxDev

    Where I used to work (a vehicle telematics company), they would roll-up data in some form of cache, simply because there was so much plot data. Whether that's genuine denormalisation or not… I dunno. But what I do know is that, in order to make the reports/screens not take minutes to run, some data had to be rolled up.



  • @RaceProUK said:

    Where I used to work (a vehicle telematics company), they would roll-up data in some form of cache,

    I don't even know what that means. How do you "roll-up" data? What form of cache?

    https://www.youtube.com/watch?v=_P_hKcZgQhY


  • FoxDev

    Roll-up just means summarise really. And I'm using cache in a fairly loose sense.



  • @RaceProUK said:

    Roll-up just means summarise really. And I'm using cache in a fairly loose sense.

    Are you talking about OLAP? Because that's completely different from (and has absolutely nothing to do with) denormalization.


  • Java Dev

    Similarly here, though we use an OLAP approach. Not any of the Oracle OLAP features, though.

    Database, pump in half a million rows a minute. That's the base cube. Derive from that aggregated data (subset of dimensions, truncated time). That's the data you end up viewing in the UI.

    This is all via backend job scheduling; there is no 'update aggregated data when the user updates the source' because the configuration only affects how new input data is processed.
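
The pipeline described above (a base cube of raw rows, plus aggregates derived over a subset of dimensions and truncated time) can be sketched in miniature. The following is illustrative only, with invented table and column names, using SQLite from Python:

```python
import sqlite3

# Toy sketch of the roll-up idea: raw rows are summarised by truncated
# time into a much smaller table that reports would query instead.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_points (ts TEXT, vehicle_id INTEGER, speed REAL);
    CREATE TABLE hourly_points (hour TEXT, vehicle_id INTEGER,
                                samples INTEGER, avg_speed REAL);
""")
conn.executemany(
    "INSERT INTO raw_points VALUES (?, ?, ?)",
    [("2015-06-01 10:05:00", 1, 50.0),
     ("2015-06-01 10:40:00", 1, 70.0),
     ("2015-06-01 11:10:00", 1, 30.0)],
)
# The periodic roll-up job: truncate the timestamp to the hour and summarise.
conn.execute("""
    INSERT INTO hourly_points
    SELECT strftime('%Y-%m-%d %H:00', ts), vehicle_id, COUNT(*), AVG(speed)
    FROM raw_points
    GROUP BY 1, 2
""")
print(conn.execute("SELECT * FROM hourly_points ORDER BY hour").fetchall())
# → [('2015-06-01 10:00', 1, 2, 60.0), ('2015-06-01 11:00', 1, 1, 30.0)]
```

The real system does this as a scheduled backend job over half a million rows a minute; the GROUP BY over the truncated timestamp is the whole trick.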


  • FoxDev

    @blakeyrat said:

    Are you talking about OLAP?

    Maybe? I dunno. I was on the web side of things; someone else handled all the SQL magic.



  • Ok how about you figure out what the fuck you're talking about, then come back, ok?


  • Fake News

    @blakeyrat said:

    Ok how about you figure out what the fuck you're talking about, then come back, ok?

    💗


  • FoxDev

    @blakeyrat said:

    Ok how about you figure out what the fuck you're talking about, then come back, ok?

    Eh, it doesn't matter really; just being a silly girl ;)


  • I survived the hour long Uno hand

    If you're going to be like that you can shoo right back onto the other team >.> ;)


  • FoxDev

    …I'll be good 😄

    In all fairness, @blakeyrat's probably right; I probably was talking about something other than OLAP.



  • Should I?

    I feel dirty


  • FoxDev

    I've liked the occasional @blakeyrat post; for all his acidity, he does sometimes have a valid point 😆

    …and Discourse is telling me off for posting so much… but then, as of this post, I do have 21% of the posts in this thread, so it's kinda got a point 😄


  • FoxDev

    @RaceProUK said:

    That should be painted in dayglo pink on a black background in letters 18" high, and erected in every development office worldwide.

    We're gonna need a lot of paint…

    s/sproc/the database/g and we have a winner.


  • accalia

    has anyone heard of SQL Queues? i just learned about them because apparently core business logic relies on them and they stopped working all of a sudden.

    4 days of trying to figure out what the damn thing was even trying to do and it still wasn't fixed. so i said belgium it and nuked the belgium-er, then implemented the behavior in the service layer where it damn well belonged.

    ten minutes later all was fixed and all was happy, except for the original dev of the code who still insisted that the queue needed to be fixed.


  • Java Dev

    I near-daily work with a software platform that is actually 99% implemented in SQL. As in, there's a Java shim because you cannot call stored procedures directly from the webserver.

    I reimplemented the overview page in PHP because that performs better and gives more information.


  • FoxDev

    @accalia said:

    except for the original dev of the code who still insisted that the queue needed to be fixed.

    I'd tell him to go Belgium himself ;)
    @PleegWat said:
    I reimplemented the overview page in PHP because that performs better

    😶
    Wow, the original code must have been a veritable trainwreck…


  • Java Dev

    I can't imagine building HTML pages in PL/SQL is fast. PHP at least was built for that job.


  • FoxDev

    @PleegWat said:

    PHP at least was built for that job.

    Well… OK, I'll concede that one ;)


  • kills Dumbledore

    @PleegWat said:

    building HTML pages in PL/SQL

    I've built very limited HTML snippets in T-SQL



  • @Jaloopa said:

    HTML snippets in T-SQL

    you should burn in hell. or at least spend a couple of years in purgatory.


  • BINNED

    👋
    Join the club! One can do crazy stuff in SQL when the platform you're calling from is weaker than VB6. It was a custom visual flow designer. Breaking out with ODBC to MS SQL stored procs or plain queries was often done for 'trivial' stuff like date manipulation, because that was hardly supported: unless you wanted to cut up date strings, manipulate them tediously, and collate them again, you could just flunk in an ODBC connection to the existing application DB and fire off a query. And yes, I did create HTML snippets, mostly text for e-mails, that way.



  • @Onyx said:

    I think the main complaint most people will sling at stored procedures is "Potrability!"

    But, honestly, how fucking often do you change database engines, and how sure are you that your ORM can deal with it?

    Seriously. DB portability is a problem that should be solved when it becomes a problem, because it's invariably going to be a massive undertaking for any non-trivial product.

    @Onyx said:

    The prime example is getting the ID of last inserted record. Every DB engine has (or doesn't have) it's own way of dealing with this. Does your ORM deal with them all? If not, how much pain will that cause? More or less than changing bits of specific syntax in SQL.

    I always use ~~@@identity~~ ~~SCOPE_IDENTITY()~~ OUTPUT clauses!
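
For reference, the OUTPUT-clause approach looks like this in T-SQL; the table and column names are invented for illustration:

```sql
-- OUTPUT returns the generated key directly from the INSERT statement,
-- avoiding the scoping surprises of @@IDENTITY (which can pick up a
-- trigger's insert) and the single-row limit of SCOPE_IDENTITY().
INSERT INTO dbo.Orders (CustomerId, Total)
OUTPUT inserted.OrderId
VALUES (42, 19.99);
```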

    @Onyx said:

    Not to mention query planners and their quirks. With stored procs or without them, are you sure you won't have to rewrite a single query you / your ORM generates?

    Of course not. And tuning is going to be highly specific to the load, the database engine, etc.



  • @Maciejasjmj said:

    Also, version controlling.

    Yeah, there's no getting around having to script out changes. If each proc is scripted into its own file, it's reasonably easy to manage, though.

    @Maciejasjmj said:

    Also also, debugging is a pain.

    How so? I've found it to be fairly simple. Do something like this:

    begin transaction
    
    select <validate precondition...>
    
    exec procname
    
    select <validate postcondition...>
    
    rollback
    

    And since you can compare entire datasets fairly easily, if you dump the output of Big Huge Critical Report A into tables, you can perform regression testing of old outputs vs. new by doing something like this:

    select * from #olddata
    except
    select * from #newdata
    
    select * from #newdata
    except
    select * from #olddata
    

    (If both of those return no results, the old matches the new).



  • @Eldelshell said:

    Why you can't ask your developer to be an SQL guru? Because it's not something they are 100% of their time doing. In general, SQL stuff is something you learn enough to accomplish a certain task and then forget about it. It's also why you don't want your DBA's to be doing any frontend development. They'll probably hack something in jQuery that'll look like an SQL EXECUTE and end up here.

    At the last two places I've worked, developers are expected to be competent both at front-end stuff and at SQL.



  • @boomzilla said:

    I use ORM extensively. Also SQL. I have some stored procedures and functions and even some backed by Java code.

    This sounds like a prudent mix of strategies.

    @boomzilla said:

    The main thing that stored procedures do for me is that they make my life more difficult from a configuration control / administrivia / procedure perspective. So there better be some major gain to adding them.

    How so? I've seen nothing but benefits from using them (and views/functions), most notably:

    • All paths to the database are known and documented, so risk assessment is much easier
    • Having an interface/abstraction layer allows changes to the underlying data model without changing the application
    • Compiled query plans
    • Improved security (provided the app isn't logged in as sysadmin)

    @boomzilla said:

    Exactly. And it's a lot easier to drop to SQL when you really need it and let the ORM handle the drudgery the rest of the time.

    I can agree with this, but I typically find myself using the ORM on views/functions/stored procedures just because having that layer of abstraction is helpful. If the ORM, for example, coughs on a proc that returns multiple result sets, then I too drop down to SQL.



  • @Jaime said:

    Trigger.

    This should always be at the bottom of the list, when there are no other alternatives.

    Triggers are magical things. They're so magical that people forget that they exist, and that's where the problems start.

    @Jaime said:

    If you use a sproc to keep the data in sync, then you have to concern yourself with validating every code path that interacts with the hourly data.

    @Jaime said:

    Is it easier than the zero effort it takes to police trigger usage?

    If you use a trigger to keep the data in sync, then you have to concern yourself with all the subtle ways that trigger (and ostensibly a few others) can fire. It also leads to some fun debugging sessions.



  • @Jaloopa said:

    I've built very limited HTML snippets in T-SQL

    I've seen code that composes HTML emails to be sent via sp_send_dbmail*. That's where I draw the line of insanity.

    *This snippet was inside an UPDATE trigger to email a big boss whenever a record in table XYZ changed.


  • accalia

    @Groaner said:

    This snippet was inside an UPDATE trigger to email a big boss whenever a record in table XYZ changed.

    WTF?

    what level of insane paranoia and mistrust pervaded if the big boss was getting emails from the middle of a production database?



  • @accalia said:

    WTF?

    what level of insane paranoia and mistrust pervaded if the big boss was getting emails from the middle of a production database?

    A big boss of one of our clients. The table in question contained product information for products they sold, and thus the information therein was updated fairly infrequently, but when it was updated, it was important enough that they wanted to know about it.


  • Banned

    The point is usually stuff like this

    users = User.where('age > 2')
    
    if descending 
       users = users.order('age desc')
    else
       users = users.order('age')
    end
    
    if admins_only
      users = users.where('admin')
    end
    

    You do tend to carry along MAJOR amounts of code to support this stuff, 10s of thousands of lines. Personally, I almost always prefer my SQL Builder pattern, which fits into 100 lines of dependency.

    query = 'SELECT * FROM Users /**where**/ /**order**/'
    
    builder = SqlBuilder.new(query)
    
    if descending 
       builder.order('age desc')
    else
       builder.order('age')
    end
    
    if admins_only
      builder.where('admin')
    end
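
The marker-replacement idea above really does fit in very little code. Here is a minimal Python sketch of the same pattern (the original is Ruby, and a real version would also handle parameter binding):

```python
class SqlBuilder:
    """Tiny sketch of the marker-replacement SQL builder described above."""

    def __init__(self, template):
        self.template = template
        self.wheres = []   # clauses ANDed together
        self.orders = []

    def where(self, clause):
        self.wheres.append(clause)
        return self

    def order(self, clause):
        self.orders.append(clause)
        return self

    def to_sql(self):
        # Replace the /**where**/ and /**order**/ markers with the
        # accumulated clauses (or nothing, if none were added).
        where = "WHERE " + " AND ".join(self.wheres) if self.wheres else ""
        order = "ORDER BY " + ", ".join(self.orders) if self.orders else ""
        return (self.template
                .replace("/**where**/", where)
                .replace("/**order**/", order))


builder = SqlBuilder("SELECT * FROM Users /**where**/ /**order**/")
builder.where("admin").order("age desc")
print(builder.to_sql())
# → SELECT * FROM Users WHERE admin ORDER BY age desc
```

The template stays one readable query, and the conditional logic only touches the builder, not string concatenation scattered through the code.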
    
    

  • :belt_onion:

    Not quite that level of insanity, but at one place I worked we had a system where multiple stored procedures were called to build up a multi-tab report in HTML/XML that could be either displayed in the browser or wrapped up in a zip file and sent to the client as an Excel file.

    The zipping and downloading of the "Excel" file was handled by the application though, hence the lesser WTF.
  • svieira

    And yes, I think the real WTF is that Excel supports that


  • And yet, I can't help but be the teensiest bit impressed



  • @svieira said:

    And yes, I think the real WTF is that Excel supports that

    Excel is remarkably tolerant of all the different formats you can throw at it. I've seen a lot of export utilities that spit out HTML, and Excel imports it flawlessly.


  • ♿ (Parody)

    @Groaner said:

    @Onyx said:
    I think the main complaint most people will sling at stored procedures is "Potrability!"
    But, honestly, how fucking often do you change database engines, and how sure are you that your ORM can deal with it?

    Seriously. DB portability is a problem that should be solved when it becomes a problem, because it's invariably going to be a massive undertaking for any non-trivial product.

    So, interestingly enough, we're actually right in the middle of this with ProGet. It currently runs on Windows/SQL Server, and we're working on getting it running on Linux/Postgres, too. Everything's done with sprocs.

    You know what would have been even more of a pain? Not using sprocs. SQL Server and Postgres are pretty close, but far enough apart that using sprocs as a database "API" keeps the database-specific stuff in the database.



  • You build applications to generate data. That data is your value, that value brings in the money. There is nothing else and if you think safeguards for this data belong anywhere else other than cuddled up next to your data with a machine gun and flamethrower you’re out of your fucking mind.
    Points for passion and style.


  • @Groaner said:

    At the last two places I've worked, developers are expected to be competent both at front-end stuff and at SQL.

    It depends on what you are up to. At my current job I am doing, on and off, a lot of SQL. No stored procedures though, because sqlite does not have them. It's an application that bakes some data, and SQL is the right tool for a big part of the processing that needs to be done to it. And I mean queries that join seven tables, joins that join the same table three times with itself plus some other table, nested queries, and such. It wouldn't really be possible to have a separate person writing the SQL, because it is very much part of the logic.

    On the other hand, if there is a bunch of disparate applications written by different people at different times that connect to the same database, there have to be separate database developers ("database admins" does not really sound appropriate any more) who do the SQL and provide a simple-ish interface with sprocs, views, and triggers, so the application developers don't need to understand most of it.



  • @Groaner said:

    ORM

    Years ago I was on loan to a team doing some serious database stuff for a government agency. They had just taken over maintenance of a key system from another company. The system was generated in some database design tool (not really an ORM, or not just an ORM; some huge enterprisey thing I don't really know anything about).

    The result was that when I needed to upload something to the database, I had to use an XML-RPC service where, for each record, I first asked for a bloated chunk of XML, edited the XML, and sent it back, and then I had to handle retries when it had changed in the meantime, because obviously I couldn't have normal transactions that way.

    And then the database guy showed me the database backend it generated. Everything was wrapped in layers and layers of cryptically named views (something in the style of __psqw0021, __psqw0058, etc.) nested at least 10 or 20 deep. Compare that to the databases made by our team, which were a simple translation of the required model to a database schema, where uploading was usually done by inserting somewhere and calling a sproc to update related entries correspondingly.


  • FoxDev

    That sounds like a system that needs to die in a lake of fire...



  • I believe that was the intention, but it's pretty difficult when the system keeps rather important records about 10 million people and a bunch of other systems rely on that data. For me it was only a temporary job though, so I don't know how it developed further.

    On a side note, my task there was uploading data to various systems that came, usually offline, from various other agencies. That was done using a sparrows' nest of perl scripts. Since the formats were various text files with fixed-width fields, various half-baked forms of XML, and such, Perl was actually a reasonable tool for the job. (The WTFs were that they weren't using any version control, that when I installed Subversion for them they didn't understand it anyway, and the ad-hoc setup on the servers, where each script was executed differently and used a different method for reporting errors.)


  • Discourse touched me in a no-no place

    @Bulb said:

    No stored procedures though, because sqlite does not have them.

    You can use a custom function instead, but SQLite really relies on the fact that it's running embedded…
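
For reference, a custom function in SQLite is registered by the host application, which is exactly why it relies on running embedded. A minimal Python sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The "stored procedure" lives in the application, not the database:
# SQLite calls back into this Python function during query evaluation.
def reverse_string(s):
    return s[::-1]

conn.create_function("reverse", 1, reverse_string)
result = conn.execute("SELECT reverse('sproc')").fetchone()[0]
print(result)
# → corps
```

This is the inverse of the stored-procedure model: instead of pushing logic into the database, the embedded database reaches back into the application's code.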


  • Discourse touched me in a no-no place

    @Bulb said:

    a sparrows' nest of perl scripts

    That sounds like the correct collective noun to me. :D



  • The problem was that the resident Perl Popes did not really know it. The script I was to make had to poll an input directory for new files and, if one was found, process it and move it to an archive. So I wrote a module for checking for new files and moving them, put the actual actions in another module, tied it together, and documented it so the file-handling part could be reused. The two devs responsible for most of the perl scripts were not able to, because they simply had no clue how to write a "class" in perl at all.


  • ♿ (Parody)

    @Groaner said:

    How so? I've seen nothing but benefits from using them (and views/functions), most notably:

    I have an extra layer of review and bureaucracy and deployment. Now configuration control is more fragmented. If a query is inside my code, it's easy to see what's deployed with that code. Not nearly so easy to know what code is living inside a schema.

    Now I'm dealing with PL/SQL (though all such languages from all DBs look about equally awful to me).

    @Groaner said:

    All paths to the database are known and documented, so risk assessment is much easier

    But anything inside the DB is more of a PITA to analyze, not to mention debug. Pass.

    @Groaner said:

    Having an interface/abstraction layer allows changes to the underlying data model without changing the application

    I can see why you might want this, and in some cases, I have this with my judicious use of them.

    @Groaner said:

    Compiled query plans

    This seems like the least of my issues, though I have used them in some circumstances when certain queries would go bonkers based on the parameters used over time. So, again, something to use for a particular purpose, not to drive me crazy for the occasional benefit.

    @Groaner said:

    Improved security (provided the app isn't logged in as sysadmin)

    If this is a benefit, then you have bigger problems. 😛



  • @Groaner said:

    All paths to the database are known and documented, so risk assessment is much easier

    No, all paths within the database are known. You still have the same challenge figuring out which applications call which procedure. Also, sp_depends is notoriously inaccurate since deferred resolution was introduced.
    @Groaner said:
    Having an interface/abstraction layer allows changes to the underlying data model without changing the application

    Yes. And this is just as true if that interface/abstraction layer is written in Java or C# as it would be if it is written in T-SQL. What is your argument for the T-SQL implementation being superior?
    @Groaner said:
    Compiled query plans

    Meh. Ad hoc plan caching has been around for a very long time, so this advantage only holds if you are using a 15-year-old version of your chosen database platform. Ironically, I sometimes implement things as stored procedures specifically so I can mark those procedures as "don't cache". You can't do that with ad hoc batches.
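
The "don't cache" marking mentioned above is, in SQL Server, the WITH RECOMPILE option; a sketch with an invented procedure name:

```sql
-- WITH RECOMPILE prevents SQL Server from caching a plan for this
-- procedure; a fresh plan is compiled on every call, which helps when
-- wildly different parameter values would otherwise share one bad plan.
CREATE PROCEDURE dbo.SearchOrders
    @CustomerId int
WITH RECOMPILE
AS
    SELECT OrderId, Total
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;
```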
    @Groaner said:
    Improved security (provided the app isn't logged in as sysadmin)

    The security issues don't change, they just move. Security should be at the business layer, not the data layer. That way, when security gets more sophisticated than "these people have read access to this entity", it doesn't turn all hairy. If you insist on implementing security at the data layer, you get the following effects:

    • Business logic starts to creep into the database since some security rules can only be expressed in terms of business processes.
    • Identity has to be pushed to the db layer. This is really bad for web apps and three-tier apps. It disables connection pooling. It makes password management harder. It forces SQL Server to use the less secure SQL Authentication.
    • It doesn't fix a damn thing. Somebody still needs to implement security properly. The entire list of what the app people could screw up is the same list that the DB people could screw up.
    The first two points lead to scalability problems.

  • Discourse touched me in a no-no place

    @Jaime said:

    It doesn't fix a damn thing.

    QFT (and what were we talking about anyway?)



  • @Groaner said:

    This should always be at the bottom of the list, when there are no other alternatives.

    Triggers are magical things. They're so magical that people forget that they exist, and that's where the problems start.


    This only matters if you put business logic in a trigger. You are supposed to forget that triggers exist. A good analogous technology is access logging for a web server. It happens for every call to the web server, but you almost never have to think about it.
    @Groaner said:
    If you use a trigger to keep the data in sync, then you have to concern yourself with all the subtle ways that trigger (and ostensibly a few others) can fire. It also leads to some fun debugging sessions.

    If you use triggers for the tasks they were designed for (denormalization and auditing), it is really simple to thoroughly test them. Then they are like any other background task - you ignore them when debugging other stuff.

    BTW, there are no subtle ways to fire a trigger. A trigger has a FOR clause: it fires then and at no other time.
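
As a toy illustration of the denormalization use case (SQLite via Python, with invented table names): the trigger keeps a running total in sync, so every code path that inserts a line item gets the maintenance for free, with nothing to forget.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE line_items (order_id INTEGER, amount REAL);
    CREATE TABLE order_totals (order_id INTEGER PRIMARY KEY, total REAL);

    -- The trigger fires on INSERT and at no other time; no caller can
    -- forget to maintain the denormalized total.
    CREATE TRIGGER sync_total AFTER INSERT ON line_items
    BEGIN
        INSERT OR IGNORE INTO order_totals VALUES (NEW.order_id, 0);
        UPDATE order_totals SET total = total + NEW.amount
         WHERE order_id = NEW.order_id;
    END;
""")
conn.executemany("INSERT INTO line_items VALUES (?, ?)",
                 [(1, 10.0), (1, 5.0), (2, 7.5)])
print(conn.execute("SELECT * FROM order_totals ORDER BY order_id").fetchall())
# → [(1, 15.0), (2, 7.5)]
```

Testing it thoroughly is just a matter of inserting rows and comparing the totals; after that, it behaves like any other background task.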



  • @Jaime said:

    The reason a trigger is a better answer than a sproc is that a trigger makes it so you can rely on that denormalized data.

    I feel the need to point out that triggers are just a subclass of sprocs. At least, that's how SQL Server handles them.



  • @abarker said:

    I feel the need to point out that triggers are just a subclass of sprocs. At least, that's how SQL Server handles them.

    This thread is about how the next layer up interacts with the database. How SQL Server internally handles triggers is irrelevant in this context.

