How many WTFs per hour is that?



  • I'm maintaining a piece of software that has somehow survived for 15 years and was even sold; more than that, it has production deployments that are still working today (to clarify, I didn't write it, I was hired to maintain, bugfix, and extend it). To protect the guilty, I won't go into detail about what the software actually does. Suffice it to say that it's the enterprisey type: planning some work, tracking the work being done, and tracking the money to be paid for it.

    Of course, such software is always a steady source of fun for everyone trying to debug it, but this software beats them all, hands down.

    First, it is written in Perl. Not a WTF per se, of course, but the folks apparently were banned from reading Perl Best Practices and similar literature.

    Second, also not a WTF in itself, but in 2013 it still uses plain old-school CGI. Never heard of Plack, nor even of FastCGI.

    Third, it uses a MySQL database. Not only that, but MyISAM all over. Not only that, but natural keys all over the place, so you have 111-ish tables, each using a varchar(40) primary key in the best case, compounded with a few more varchar and int columns in the average case. EXPLAIN shows key lengths of 192 on a normal day.

    Not only that, but the Perl code doesn't trust the database to do, well, the database work. It does all the aggregation by itself, producing wonderful inconsistencies that customers are tired of complaining about. That's the fourth one.

    Fifth: a corollary to the previous one. MySQL has a DECIMAL type, used for all kinds of arithmetic where some degree of precision must be guaranteed, like anything that touches monetary data. Perl's scalars have no such feature. Thus, with all the aggregation done on the Perl side, storing the results as DECIMAL kind of loses its whole point.
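
    A tiny illustration of what "no such feature" means in practice (this snippet is mine, not from the codebase): Perl scalars are IEEE 754 doubles, so most decimal fractions are only approximated.

    printf "%.17f\n", 0.1 + 0.2;                          # prints 0.30000000000000004
    print 0.1 + 0.2 == 0.3 ? "equal\n" : "not equal\n";   # prints "not equal"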

    Sixth: not using surrogate keys shifts the task of creating unique keys to the Perl side too. The application, in turn, does nothing more than a select max(myidxfield) and an increment. Race condition? Never heard of that.
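
    A sketch of the pattern (the table name and connection details are placeholders, not the real schema): nothing stops two workers from reading the same MAX() and both trying to insert max+1.

    use DBI;

    my $dbh = DBI->connect('dbi:mysql:dbname=app', 'user', 'password',
                           { RaiseError => 1 });
    my ($max) = $dbh->selectrow_array('SELECT MAX(myidxfield) FROM work_items');
    my $new_id = ($max // 0) + 1;    # a second process can compute the same id right here
    $dbh->do('INSERT INTO work_items (myidxfield) VALUES (?)', undef, $new_id);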

    Seventh: it uses a god object, the dreaded project structure, for everything. To do that, the structure is read from the database every bloody time. Like this:

    select * from projects where project_id = xxxxxxx;
    select * from project_sections where project_id = xxxxxxx;
    -- for each section:
    select * from project_items where project_id = xxxxxxx and section_id = yyyyy;
    -- and recursively:
    select * from project_sections where project_id = xxxxxxx and parent_section_id = yyyyy;
    -- (of course, no one gives a crap that all the sections are already there
    --  from the first query)
    

    Repeat for all 30+ dependent tables. Each time. Even if all you need is to change an item's description.
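
    To make the "all the sections are already there" point concrete, here is a rough sketch (hypothetical variable names, same table names as the queries above) of doing the recursion in memory from that single query instead of hitting the database once per parent:

    # assumes an existing DBI handle in $dbh and the project id in $project_id
    my $sections = $dbh->selectall_arrayref(
        'SELECT * FROM project_sections WHERE project_id = ?',
        { Slice => {} },    # return each row as a hashref
        $project_id,
    );
    my %children;
    push @{ $children{ $_->{parent_section_id} // '' } }, $_ for @$sections;
    # $children{''} holds the top-level sections, $children{$id} their subsections;
    # walk that hash recursively instead of issuing another SELECT per section.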

    Eighth: the software has a crappy database abstraction layer that everything else uses, and which hides SQL from plain sight (it constructs the SQL itself), but of course it cannot do joins, or aggregates, or anything else you might think of.

    Ninth: despite being written for Apache, Perl, and MySQL, the software was written on Windows and the initial deployments were done there. Since it uses XSL-FO to generate some PDFs, it has some really screwed-up code to call Java, which had to be fixed to be of any use on Linux. The XSLT was written so badly that only a 2009 version of fop manages to process it; newer ones go belly up. That copy of fop is bundled with the software, just in case.

    Tenth: and of course, the payment data is calculated differently depending on whether you want the sum of all work done or the sum of work performed in a given month. While each result is mathematically correct, rounding has to happen somewhere (no one wants numbers like 459872.7898 dollars on their invoice), so the final result is wonderfully inconsistent with the partial results. That is, money(spring) is off by a few cents from money(March) + money(April) + money(May). And the most wonderful thing about this is that those numbers are presented on a single page. The customer sees something wrong, reaches for his copy of Excel, checks, and is like, WTF are you guys doing?
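
    A toy illustration of how the two calculations drift apart (the price and quantities here are made up; the real ones are obviously messier):

    my $price          = 0.333;        # price per unit, in dollars
    my @qty_per_month  = (1, 1, 1);    # March, April, May
    my $sum_of_rounded = 0;
    for my $qty (@qty_per_month) {
        $sum_of_rounded += sprintf '%.2f', $price * $qty;     # each month rounded: 0.33
    }
    my $rounded_total = sprintf '%.2f', $price * 3;            # the whole quarter at once
    printf "%.2f vs %.2f\n", $sum_of_rounded, $rounded_total;  # prints "0.99 vs 1.00"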

    This is the sort of bug I'm tasked with fixing, or else. Wish me luck.



  •  Good luck with that steaming pile of swinery.

     But "or else" sounds ominous. Did they threaten you?



  • @shimon said:

    This is the sort of bug I'm tasked with fixing, or else...

    ...the company loses its main source of income and goes bankrupt, I assume?



  • Exactly. The most ironic bit is that keeping the company afloat is where my own best interests currently lie, so it's no use for me to just go and tell them to let the whole thing sink.



  • Why are so many good WTFs posted on Sundays?



  • Just found another gem, right in the routine where you create a payment for the work performed.

    The invoice, as many of you know, has items on it; each item has its own unit of measure, quantity, and price per unit.

    So there's a routine that computes the totals. For some as-yet-unknown reason, it also sums up all the quantities (tons, cubic meters, man-months, liters, and pieces all thrown together) and all the unit prices.





  • Purify it with fire.



  • @shimon said:

    First, it is written in Perl. Not a WTF per se, of course, but the folks apparently were banned from reading Perl Best Practices and similar literature.

    Reminds me of a project I briefly worked on years ago. (Wow, that's some 7 years now; I remember reading about perl 6 development around that time, and there is still mostly nothing of it.) There was a bunch of perl scripts for feeding various data into a database, and when I used a few simple objects in some of them, the full-time perl developers there said it was too hard for them to understand. (Fortunately the database was designed reasonably and didn't trust anybody, so the scripts usually just inserted the data into an interface table and called a stored procedure. And they ran with permissions such that they couldn't do anything else.)
    @shimon said:
    Fifth: a corollary to the previous one. MySQL has a DECIMAL type, used for all kinds of arithmetic where some degree of precision must be guaranteed, like anything that touches monetary data. Perl's scalars have no such feature.

    It requires some serious perl-fu, but it should be possible to write a perl class that would do the calculations in whole cents internally. Something like it probably already exists on CPAN, but if not, it shouldn't be that difficult (a lot can be copy-pasted from Math::BigInt and the bigint pragma). That should let you work around the most notable issue (custom numbers are slower, but that will probably get lost in all the other unnecessary work the application does) and give you some breathing room to start untangling the mess.
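
    For the record, I don't mean anything fancy; just something in the spirit of the sketch below. I'm showing Math::BigFloat here instead of hand-rolled cents on top of Math::BigInt, since it ships with perl and already does exact decimal arithmetic; the amounts are of course made up.

    use Math::BigFloat;

    my $price = Math::BigFloat->new('0.10');   # always construct from strings, never from floats
    my $total = Math::BigFloat->new('0');
    $total->badd($price) for 1 .. 3;           # 0.10 + 0.10 + 0.10
    print $total->bstr, "\n";                  # prints "0.3", exactly
    print $total == Math::BigFloat->new('0.3') ? "equal\n" : "not equal\n";   # "equal"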



  • @Bulb said:

    custom numbers are slower

    Using tied scalars, I presume?

    (How nice that Python has a Decimal module out of the box.)

    @Bulb said:
    it should be possible to write a perl class that would do the calculations in whole cents internally.

    Unfortunately, that's not so. You have quantities which are measured in different units. Some are measured in tons but need to be precise to the kilogram, so internally you would have to count in something like millicents. Some measurements are precise to the 4th digit, and some might be even more arbitrary. In other words, you cannot totally win here.

    In some other department, payments are all expected to be consistent. So as the software accumulates each item, it should take into account the rounding error from all previous payments when it shows the total amount paid. That's normal, because the last payment carries a correction for it (a few cents' worth). But the software tries to be mathematically correct instead: it simply multiplies the total quantity placed to date by the price per unit and rounds that. Not that either approach is very wrong, or that a few cents are worth fighting over (the contracts usually amount to a few million dollars), but the more important thing is that the program is inconsistent with itself. No one wants to trust that shit when p(a) + p(b) + p(c) ≠ p(a + b + c).
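
    For what it's worth, the "correction in the last payment" way is trivial to keep self-consistent; here's a toy sketch with made-up numbers, where each payment is simply the rounded amount due to date minus what has already been paid:

    my $price = 0.333;                  # price per unit
    my @qty   = (1, 1, 1);              # quantity placed in each billing period
    my ($paid_so_far, $qty_to_date) = (0, 0);
    for my $qty (@qty) {
        $qty_to_date += $qty;
        my $due_to_date  = sprintf '%.2f', $qty_to_date * $price;        # rounded once, to date
        my $this_payment = sprintf '%.2f', $due_to_date - $paid_so_far;  # carries the correction
        $paid_so_far += $this_payment;
        printf "payment %s, paid so far %.2f\n", $this_payment, $paid_so_far;
    }
    # payments come out as 0.33, 0.34, 0.33 and $paid_so_far ends at 1.00,
    # which matches round(3 * 0.333), so p(a) + p(b) + p(c) == p(a + b + c).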



  • @havokk said:

    Purify it with fire.
     

    Someone's been playing Asura's Wrath!



  • @shimon said:

    @Bulb said:
    it should be possible to write a perl class that would do the calculations in whole cents internally.

    Unfortunately, that's not so. You have quantities which are measured in different units. Some are measured in tons but need to be precise to the kilogram, so internally you would have to count in something like millicents. Some measurements are precise to the 4th digit, and some might be even more arbitrary. In other words, you cannot totally win here.

    Well, you don't have to limit it to just two decimal places. You could make a class that stores two values, call them BigUnits and SmallUnits, along with an arbitrary ratio defining how many SmallUnits make up a BigUnit. With this system, you could even store feet and inches if you so desired...
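
    Something like this very rough sketch is what I have in mind (the BigUnits/SmallUnits names are from my post; everything else is invented, it only handles non-negative quantities, and the ratio has to be a power of ten for the formatting to work):

    package Quantity;

    sub new {
        my ($class, %args) = @_;
        # e.g. Quantity->new(big => 2, small => 340, ratio => 1000) means 2.340 tons
        my $ratio = $args{ratio};
        my $small = $args{big} * $ratio + $args{small};   # store everything as SmallUnits
        return bless { small => $small, ratio => $ratio }, $class;
    }

    sub add {
        my ($self, $other) = @_;
        die "mismatched ratios" unless $self->{ratio} == $other->{ratio};
        return bless { small => $self->{small} + $other->{small},
                       ratio => $self->{ratio} }, ref $self;
    }

    sub as_string {
        my ($self) = @_;
        my $big = int($self->{small} / $self->{ratio});
        my $rem = $self->{small} % $self->{ratio};
        return sprintf '%d.%0*d', $big, length($self->{ratio}) - 1, $rem;
    }

    package main;

    my $x = Quantity->new(big => 2, small => 340, ratio => 1000);
    my $y = Quantity->new(big => 1, small => 760, ratio => 1000);
    print $x->add($y)->as_string, "\n";    # prints 4.100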



  • Suddenly, I've realised the system can be screwed even more... BigUnits, SmallUnits, this way madness lies.



  • I'm going to assume that this madness is something similar to what I'm working on at the moment.

    There are dates,
    There are prices,
    There are even enums,
    Heck, there's even some normalisation (staff member as a numeric)

    However, they are all stored as nvarchar(64).

    And about half of the normalisation is done in code as arrays, for speed, apparently. One of the comments reads: "Rather than querying the server each time, we hard code the records here as its faster."

    I was going to post a new WTF about it, but this topic just about wraps it up.

    BTW, can someone remind me how to do a line feed in this editor?!



  • @shimon said:

    Suddenly, I've realised the system can be screwed even more... BigUnits, SmallUnits, this way madness lies.

    I was actually being serious... :(

    If there aren't any external math libraries that will do this for you, you might as well invent your own, right?



  • For now, I'm resorting to casting everything to DECIMAL and computing on the database side. So far some things have started working faster, since the database is finally doing the things it was designed for. Still a steaming pile of shit, though. I have to reverse-engineer the example invoices given by the client, because they never spelled out the exact rounding rules, and my higher-ups haven't bothered to ask. Needless to say, they also try to minimize my contact with the client.
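
    Roughly the direction I'm taking (table and column names changed; the real queries are messier): let MySQL do the arithmetic in DECIMAL and hand Perl an already-rounded value.

    # assumes an existing DBI handle in $dbh and the project id in $project_id
    my ($total) = $dbh->selectrow_array(q{
        SELECT CAST(SUM(quantity * unit_price) AS DECIMAL(15,2))
        FROM   project_items
        WHERE  project_id = ?
    }, undef, $project_id);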



  • @shimon said:

    Third, it uses a MySQL database. Not only that, but MyISAM all over.

    If the application is as old as you say, I wouldn't really call this a WTF. Go back five or ten years, and MyISAM was not only the default storage engine, it was also often the fastest and arguably the most usable (stability- and functionality-wise). For example, until a year or so ago you could not even create fulltext keys in InnoDB. There weren't too many good free alternative RDBMSes around back then either (Postgres being an even worse choice at the time). I have been slowly migrating a roughly six-year-old application from MyISAM to InnoDB lately, for obvious reasons, but since production is still running MySQL 5.5 (no InnoDB fulltext until 5.6), there is simply no way to convert all the tables unless we completely rewrite the search function.



  • @FragFrog said:

    @shimon said:
    Third, it uses a MySQL database. Not only that, but MyISAM all over.
    If the application is as old as you say, I wouldn't really call this a WTF.
     

    TRWTF is the application being in production that long without having been updated to make use of newer techniques/technology in the marketplace.

    Yeah, I don't agree with "if it ain't broke, don't fix it".



  • @Cassidy said:

    TRWTF is the application being in production that long without having been updated to make use of newer techniques/technology in the marketplace.

    Yeah, I don't agree with "if it ain't broke, don't fix it".

    You should work in a bank; you'd love it there. And I mean the back end, not the websitey stuff.



  • @Cassidy said:

    TRWTF is the application being in production that long without having been updated to make use of newer techniques/technology in the marketplace.

    The thing is, can you explain in layman's terms to your manager why it is vital that your application, which for the past ten years has been running merrily along using MyISAM, should suddenly be converted to InnoDB?

    There are a few good arguments for it (for example, the application I mentioned ran on a multicore server, yet due to MyISAM's multithreading limitations could not make use of those extra cores), but just try explaining data integrity to someone who up to that point never knew it was an issue. Not to mention, MyISAM has been improving just as much as the other engines; it's not like your application was still using technology from a decade ago. You could make a case for upgrading from MySQL to, say, SQL Server or Oracle, but due to MySQL's peculiar dialect of SQL that usually means a ton of rewriting, and again, how do you justify that unless you run into serious problems that can only be solved by switching databases?

    I could make a long case about how rewriting is not always the best solution, but Joel Spolsky does that a lot better than I would.



  • @Mole said:

    You should work in a bank; you'd love it there.
     

    I've taught some sysadmins from the banking sector. They used AIX, so there was little modernisation to be had.

    @Mole said:

    and I mean the back end

    Easy there, tiger....

     



  • @FragFrog said:

    The thing is, can you explain in layman's terms to your manager why it is vital that your application, which for the past ten years has been running merrily along using MyISAM, should suddenly be converted to InnoDB?
     

    No, but I could explain that something running on old kit risks being unfixable should it break, in today's climate where spares for said kit are difficult to source.

    @FragFrog said:

    There are a few good arguments for it (for example, the application I mentioned ran on a multicore server, yet due to MyISAM's multithreading limitations could not make use of those extra cores), but just try explaining data integrity to someone who up to that point never knew it was an issue.

    Data integrity isn't the approach; cost/benefit analysis is. I would analyse transaction throughput and see whether the business is receiving only 25% of what it's paying for. More modern architecture produces more bang for your buck, so I would show lower operational costs: cheaper electricity bills.

    @FragFrog said:

    You could make a case for upgrading from MySQL to, say, SQL Server or Oracle, but due to MySQL's peculiar dialect of SQL that usually means a ton of rewriting, and again, how do you justify that unless you run into serious problems that can only be solved by switching databases?

    Firstly, swapping the DB out would only be an option if resource demands exceeded the capabilities of MySQL and the cost of a replacement plus licences didn't outweigh the financial benefits gained from the new DB.

    Secondly, any peculiarities of the MySQL dialect should be abstracted away. Even if ORMs or SPs didn't exist back in those days, I'd still expect a DAL: DB interconnectivity delegated out to some library functions that could be swapped over.

    @FragFrog said:

    I could make a long case about how rewriting is not always the best solution

    I could, too. I just find it difficult to imagine something staying in production for so long without any serious effort invested in bringing it in line with modern practices, and the cost of maintaining something so kludgy and fragile has got to be high compared to modern stuff.

