Fractal Reporting



  • TLDR: The symptoms, diagnoses and remedies for a Selenium build.

    At WTF Inc, we test our AJAX application via Selenium RC. The SeRC run lives in a Maven project with some custom ANT tasks that allow Hudson to execute the Se1.0 tests, write out some JUnit-style report XML, and then transmute that into pretty reports. The Pointy-Haired Boss ordained that we were to use SeRC, ordained that it was to integrate into our existing Hudson builds, and that a pretty report would be available for him each morning -- or rather, lunchtime, since the execution would take about six hours (it was a VERY complicated webapp). None of this is TRWTF. (In short: ANT is Java's version of Make, Maven is Java's dependency manager.)

    The Senior Developer who was responsible for making this work is on-the-record as saying "I'm too senior to write test harness code", although he vanished into the layoffs. Since I was responsible for the tests, I also became responsible for maintaining the Hudson harness. I soon discovered that Senior Developer implemented an "Execute these named files" strategy for discovering the test files. I quickly modified the strategy to "All files in these directories", since the number of test scripts was increasing faster than the rabbit population. After this modification the engine continued fine for several weeks, until the massive increase in scripts inevitably resulted in an OutOfMemoryError. A quick increase of memory to the VM and everything was fine. Only two weeks later the OutOfMemoryErrors came back, and we realised we'd have to study the inner workings of the beast.

    I started at the ANT tasks that were executed. The first warning klaxon that went off was when I realised that the Senior Developer had effectively written the SQL task from scratch. So rather than use the built-in SQL implementation in ANT for the data injection, he rolled his own. I quickly culled his custom code out of the Hudson harness and replaced it with the standard implementation. I soon saw what I thought was the cause of all these memory problems: the JUnit XML report format demands that the duration of the test suite be written before the results of the individual test steps. Could it be that the system was holding onto the flyweights in memory too long? I was reluctant to rewrite the report generation code, so I examined the entire system even more closely.

    The real problem made my head spin. The algorithm our Senior Developer used to execute the tests went like this:

    For every directory in the path:

    • For every test file in the directory:

      • Read the contents of the Selenium files into an in-memory DOM.

      • For each table line in the DOM, extract the Selenium command and send it to the Selenium executor.

      • Write the test results for the file

    • Examine all the test results in the results output folder and make pretty reports.

    Just so we're clear: reading your *ML files into a DOM requires a high memory footprint, and is generally only done when the tags are going to be edited. Since in this case all we were doing was reading the file, I replaced the DOM construction with a STAX reader that simply opened the file, found the appropriate characters, and then closed the thing without any additional fuss. The result was that instead of performing one Selenium test step per second, our Hudson harness started performing two every second, and it didn't throw an OutOfMemoryError 5 minutes into the build.
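    For anyone who hasn't met it, a minimal sketch of the streaming style of reading described above, using the JDK's StAX API (javax.xml.stream). The class name, the method name and the three-cell table layout are illustrative assumptions, not the harness's actual code, and the sample is parsed from a String rather than a file to keep it self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls table rows out of a test file without building a DOM.
final class SeleneseStreamReader {

    /** Returns one String[] of cell texts per table row, read in a single forward pass. */
    static List<String[]> readSteps(String xhtml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: the test files need neither a DTD nor external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);

        List<String[]> steps = new ArrayList<>();
        List<String> cells = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xhtml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT && "td".equals(reader.getLocalName())) {
                    // getElementText() consumes the text up to the matching </td>.
                    cells.add(reader.getElementText().trim());
                } else if (event == XMLStreamConstants.END_ELEMENT && "tr".equals(reader.getLocalName())) {
                    if (!cells.isEmpty()) {
                        steps.add(cells.toArray(new String[0]));
                    }
                    cells.clear();
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed test file", e);
        }
        return steps;
    }

    public static void main(String[] args) {
        String sample = "<table><tbody>"
                + "<tr><td>open</td><td>/login</td><td></td></tr>"
                + "<tr><td>click</td><td>id=submit</td><td></td></tr>"
                + "</tbody></table>";
        for (String[] step : readSteps(sample)) {
            System.out.println(String.join(" | ", step));
        }
    }
}
```

    Only the cells of the current row are held at any moment, which is why memory use does not grow with the size of the file the way a DOM does.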

    In that respect the STAX reader was a success, but the SeRC run would still fail with an OutOfMemoryError after an hour. Something was still innately wrong with the test harness. It seemed to hang when the reports were being generated, so I took a closer look at the report generation code. The Senior Developer had placed a 3rd-party library on the Hudson harness to handle the publication of the pretty reports -- as in "downloaded the project's source code and then checked it into the harness's source control". And SVN Blame confirmed that he had made no modifications to the 3rd-party code: apparently he didn't know how to put the jar on the classpath (or add it as a dependency, or any other way of putting in 3rd-party code). This 3rd-party code was essentially an XSLT engine -- as in, this is the sort of code you include when you want to develop your own XSLT engine, not when you simply want to make an XML file look pretty.

    As cumbersome and inefficient as this programmatic XSLT was, he might've gotten away with it if he'd examined his algorithm more closely. Look again at the pseudocode above.

    Each directory on the test path was for a specific module: /ModuleA for tests relating to Module A, /ModuleB for tests relating to Module B, you get the idea. When the harness completed all the tests for Module A, it would look in the results directory and publish the reports. It would do the same when it completed Modules B, C, etc.... Each time it would look into the result directory and transform the results XML into pretty HTML. But because it was blindly publishing the results of EVERYTHING in the results directory, it would re-publish existing reports without blinking. When only Module A results were there that was fine; when Module B was complete, the system would publish the reports for Module A (again) as well as Module B. When Module C was complete the system would publish reports for Module A (again), Module B (again) and Module C. This would continue through all 26 of our modules (which typically had 20+ tests in each). No wonder it was taking 10 minutes to complete the reporting phase of some of the later modules.
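    To put a rough number on that, here is a small sketch that counts how many reports get rendered under each strategy. It assumes exactly 20 tests per module (the real figure was "20+", so this is a lower bound), and the class and method names are made up for illustration.

```java
// Hypothetical back-of-the-envelope counter, not part of the harness.
final class RepublishCost {

    /** Reports rendered when every finished module re-publishes the whole results directory. */
    static long republishEverything(int modules, int testsPerModule) {
        long rendered = 0;
        long inResultsDir = 0;
        for (int m = 1; m <= modules; m++) {
            inResultsDir += testsPerModule; // this module's results land in the directory
            rendered += inResultsDir;       // and everything in the directory is rendered again
        }
        return rendered;
    }

    /** Reports rendered when publication happens once, after the last test. */
    static long publishOnce(int modules, int testsPerModule) {
        return (long) modules * testsPerModule;
    }

    public static void main(String[] args) {
        System.out.println(republishEverything(26, 20)); // 7020
        System.out.println(publishOnce(26, 20));         // 520
    }
}
```

    With those assumptions the harness renders 20 x (1 + 2 + ... + 26) = 7020 reports to produce 520 distinct ones, and the last module alone triggers 520 of them.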

    I did end up replacing the JUnit XML format with our own home-grown variety: one that only required the duration of each individual step to be recorded in the XML. I also modified the reporting engine so that it used a STAX writer to write the results into an XML file on the fly, and only performed report publication once the last test had been executed. Report publication consisted of slicing that single results.xml file into smaller XML files with an XSLT in the header -- the web browser did all the hard work of making them pretty. The STAX harness could do 3, perhaps 4, Selenium instructions every second and its memory usage remained constant, no matter how many tests we added.
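    A minimal sketch of that kind of on-the-fly result writing, using the JDK's XMLStreamWriter. The element and attribute names, the class name and the "report.xsl" stylesheet name are all assumptions for illustration; the home-grown format itself isn't shown in this post, and the later slicing into smaller files is left out.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical writer: each step is written out as soon as it finishes.
final class StreamingResultWriter implements AutoCloseable {
    private final XMLStreamWriter xml;

    StreamingResultWriter(Writer out, String stylesheetHref) {
        try {
            xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            // The xml-stylesheet instruction lets a browser apply the XSLT itself.
            xml.writeProcessingInstruction("xml-stylesheet",
                    "type=\"text/xsl\" href=\"" + stylesheetHref + "\"");
            xml.writeStartElement("results");
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Records one executed step; only its own duration is needed, no suite total up front. */
    void step(String test, String command, long durationMillis, boolean passed) {
        try {
            xml.writeStartElement("step");
            xml.writeAttribute("test", test);
            xml.writeAttribute("command", command);
            xml.writeAttribute("durationMillis", Long.toString(durationMillis));
            xml.writeAttribute("passed", Boolean.toString(passed));
            xml.writeEndElement();
            xml.flush(); // nothing is buffered waiting for the end of the run
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void close() {
        try {
            xml.writeEndDocument(); // closes the still-open <results> element
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        try (StreamingResultWriter results = new StreamingResultWriter(out, "report.xsl")) {
            results.step("login", "open", 412, true);
            results.step("login", "click", 95, false);
        }
        System.out.println(out);
    }
}
```

    Because nothing about the whole suite has to be known before the first step is written, the writer never needs to keep earlier results in memory.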

    But my favourite bug from the Hudson harness? The Senior Developer directed all screenshots to be saved as 'Screenshot.jpg', even though the screenshots were always PNGs and more than one screenshot would be taken during the execution. I instructed the harness to use the test name and timestamp as the filename, ensuring none of the previous screenshots were overwritten.
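    A small sketch of that naming scheme, for illustration only. The timestamp pattern, the character whitelist and the names below are assumptions rather than what the harness actually used.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: one distinct, correctly-suffixed file name per screenshot.
final class ScreenshotNames {
    // Millisecond precision so two screenshots in the same second don't collide.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss-SSS");

    static String fileNameFor(String testName, LocalDateTime takenAt) {
        // Replace anything that is awkward in a file name (slashes, spaces, ...).
        String safe = testName.replaceAll("[^A-Za-z0-9._-]", "_");
        return safe + "-" + STAMP.format(takenAt) + ".png";
    }

    public static void main(String[] args) {
        // Made-up test name and time, purely as an example.
        LocalDateTime when = LocalDateTime.of(2013, 5, 1, 3, 15, 42);
        System.out.println(fileNameFor("ModuleA/login test", when));
        // prints ModuleA_login_test-20130501-031542-000.png
    }
}
```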



  • @KarenM said:

    The Senior Developer who was responsible for making this work is on-the-record as saying "I'm too senior to write test harness code"

    So I tied an onion to my belt. Which was the style at the time. Now, to take the ferry cost a nickel, and in those days, nickels had pictures of bumblebees on 'em.


  • So in summary:

    @summary said:

    AJAX, Selenium RC, SeRC, Maven, ANT, Hudson, Se1.0, JUnit, XML, SeRC, Hudson, ANT, Java, Make, Maven, Java. Hudson, VM. ANT, klaxon, SQL, ANT, Hudson, JUnit, XML, flyweights.

      • Selenium, DOM.
      • DOM, Selenium, Selenium executor.

    *ML, DOM, DOM, STAX, Selenium, Hudson. STAX, SeRC, Hudson, SVN Blame, jar, classpath, XSLT, XSLT, XML. XSLT, pseudocode. XML, HTML. JUnit XML, XML, STAX, XML, XML, XSLT, STAX, Selenium.

    Hudson? PNGs.



  • @blakeyrat said:

    So in summary:

    @summary said:

    AJAX, Selenium RC, SeRC, Maven, ANT, Hudson, Se1.0, JUnit, XML, SeRC, Hudson, ANT, Java, Make, Maven, Java. Hudson, VM. ANT, klaxon, SQL, ANT, Hudson, JUnit, XML, flyweights.

      • Selenium, DOM.
      • DOM, Selenium, Selenium executor.

    *ML, DOM, DOM, STAX, Selenium, Hudson. STAX, SeRC, Hudson, SVN Blame, jar, classpath, XSLT, XSLT, XML. XSLT, pseudocode. XML, HTML. JUnit XML, XML, STAX, XML, XML, XSLT, STAX, Selenium.

    Hudson? PNGs.

    It's WTF salad!



  • @morbiuswilters said:

    @blakeyrat said:
    So in summary:

    @summary said:

    AJAX, Selenium RC, SeRC, Maven, ANT, Hudson, Se1.0, JUnit, XML, SeRC, Hudson, ANT, Java, Make, Maven, Java. Hudson, VM. ANT, klaxon, SQL, ANT, Hudson, JUnit, XML, flyweights.

      • Selenium, DOM.
      • DOM, Selenium, Selenium executor.

    *ML, DOM, DOM, STAX, Selenium, Hudson. STAX, SeRC, Hudson, SVN Blame, jar, classpath, XSLT, XSLT, XML. XSLT, pseudocode. XML, HTML. JUnit XML, XML, STAX, XML, XML, XSLT, STAX, Selenium.

    Hudson? PNGs.

    It's WTF salad!

    Worse, it's only the entrée salad. Feel lucky that you don't have to examine the main course menu....



  • @blakeyrat said:

    @summary said:
    Hudson? PNGs.

    I had no idea what is Hudson (I don't work with poor people technologies) so I went to the website and gave up immediately when I saw their logo/persona/avatar:

    [image: the Hudson logo]

    When they get pictures that don't look like a WMF clipart that was shipped with MS-Write I'll reconsider reading the product overview.



  • Hudson is the automated build tool we use for the nightly builds at WTF Inc. And yes, it's what people use when they can't afford a REAL build engine.


  • Discourse touched me in a no-no place

    @Ronald said:

    I had no idea what is Hudson

    It's a continuous integration system, that is, a system to automatically run builds when triggered (by an SCM change, timer, upstream rebuild, explicit prod, etc.) For various ugly reasons (i.e., Oracle), people are more likely to favor Jenkins instead, which has a different stupid clipart logo.



  • @KarenM said:

    The Pointy-Haired Boss ordained that we were to use SeRC, ordained that it was to integrate into our existing Hudson builds, and that a pretty report would be available for him each morning -- or rather, lunchtime, since the execution would take about six hours (it was a VERY complicated webapp).

    Here is an interesting formula that I devised after many years of silent contemplation:


    $start_time = $the_time_you_need_something - $the_time_it_takes_to_run_the_task
    




    Maybe you could apply this formula in your organization to give people the reports they want at the time they want them. Or you could just keep calling them PHBs and get away with nonchalant scheduling.



  • @dkf said:

    @Ronald said:
    I had no idea what is Hudson
    It's a continuous integration system, that is, a system to automatically run builds when triggered (by an SCM change, timer, upstream rebuild, explicit prod, etc.) For various ugly reasons (i.e., Oracle), people are more likely to favor Jenkins instead, which has a different stupid clipart logo.

    This is hilarious.

    [image]

    at least it makes it immediately obvious that the product is a piece of shit.



  • @Ronald said:

    This is hilarious.

    [image]

    at least it makes it immediately obvious that the product is a piece of shit.

    Oh, come off it, it's not that bad. It's less silly than most of the logos Mozilla has come up with.

    Anyway, Jenkins isn't that bad. You're not going to find a better FOSS build server. And I don't know what commercial products you're comparing it to, but it probably does okay there, too.



  • "Oh no, these two logos use a similar art style to Microsoft Word Clipart, which is bad even though The people who made the Office Clipart are probably rich off of the royalties"



  • @morbiuswilters said:

    Oh, come off it, it's not that bad.
     

    Its logotype is Georgia. And really poorly kerned.



  • @dhromed said:

    @morbiuswilters said:

    Oh, come off it, it's not that bad.
     

    Its logotype is Georgia. And really poorly kerned.

    what about this one

    [image]

  • @Ronald said:

    @dhromed said:

    @morbiuswilters said:

    Oh, come off it, it's not that bad.
     

    Its logotype is Georgia. And really poorly kerned.

    what about this one

    [image]
     

    fuuuuuck

     



  • @KarenM said:

    When only Module A results were there that was fine; when Module B was complete, the system would publish the reports for Module A (again) as well as Module B. When Module C was complete the system would publish reports for Module A (again), Module B (again) and Module C. This would continue through all 26 of our modules (which typically had 20+ tests in each). No wonder it was taking 10 minutes to complete the reporting phase of some of the later modules.

    Just wait till you have to add a 27th module. Should it be called AA? Or A1? Or Bob? Much hilarity ensues.



  • @El_Heffe said:

    @KarenM said:

    When only Module A results were there that was fine; when Module B was complete, the system would publish the reports for Module A (again) as well as Module B. When Module C was complete the system would publish reports for Module A (again), Module B (again) and Module C. This would continue through all 26 of our modules (which typically had 20+ tests in each). No wonder it was taking 10 minutes to complete the reporting phase of some of the later modules.
    Just wait till you have to add a 27th module. Should it be called AA? Or A1? Or Bob? Much hilarity ensues.

    À, Â, Ä...



  • @blakeyrat said:

    So in summary:

    @summary said:

    AJAX, Selenium RC, SeRC, Maven, ANT, Hudson, Se1.0, JUnit, XML, SeRC, Hudson, ANT, Java, Make, Maven, Java. Hudson, VM. ANT, klaxon, SQL, ANT, Hudson, JUnit, XML, flyweights.

      • Selenium, DOM.
      • DOM, Selenium, Selenium executor.

    *ML, DOM, DOM, STAX, Selenium, Hudson. STAX, SeRC, Hudson, SVN Blame, jar, classpath, XSLT, XSLT, XML. XSLT, pseudocode. XML, HTML. JUnit XML, XML, STAX, XML, XML, XSLT, STAX, Selenium.

    Hudson? PNGs.

    No, be fair, this is the summary of the actual problem:

    @summary said:
    Schlemiel the Painter's algorithm



  • @dhromed said:

    @morbiuswilters said:

    Oh, come off it, it's not that bad.
     

    Its logotype is Georgia. And really poorly kerned.

    It's a logo. How many of them look good? Especially for FOSS projects? This is like you refusing to buy a romance novel because the lusty male hero on the front doesn't have his tunic ripped in a convincing-enough manner. At least judge it on its legitimate faults.



  • @morbiuswilters said:

    @dhromed said:

    @morbiuswilters said:

    Oh, come off it, it's not that bad.
     

    Its logotype is Georgia. And really poorly kerned.

    It's a logo. How many of them look good? Especially for FOSS projects? This is like you refusing to buy a romance novel because the lusty male hero on the front doesn't have his tunic ripped in a convincing-enough manner. At least judge it on its legitimate faults.


  • Discourse touched me in a no-no place

    @morbiuswilters said:

    How many of them look good? Especially for FOSS projects?
    To be fair, most commercial logos also look terrible. What's worse, when the logo looks OK it's usually a sign of someone who's obsessed over that part so much that everything else about the product is horrible.

    I don't program with logos. (Or Logo; I don't need constructivist turtles.)



  • @dkf said:

    (Or Logo; I don't need constructivist turtles.)

    Oh. My. God. I'd forgotten about being in elementary school and playing with the "turtle". That was so much fun.



  • @dhromed said:

    @Ronald said:

    @dhromed said:

    @morbiuswilters said:

    Oh, come off it, it's not that bad.
     

    Its logotype is Georgia. And really poorly kerned.

    what about this one

    [image]
     

    fuuuuuck

     

    I'm a little surprised nobody has come out with something called Jeeves.

    Of course, the implication is then that YOUR software is Bertie Wooster...

    So perhaps I shouldn't be so surprised.


  • Trolleybus Mechanic

    @Hatshepsut said:

    I'm a little surprised nobody has come out with something called Jeeves.
     

    You weren't on The Internet before 2005, were you?

    [img]http://images6.fanpop.com/image/photos/32200000/Ask-Jeeves-whatever-happened-to-32225327-270-301.jpg[/img]

    That piece of shit became the malware herpes that is Ask.com



  • @Hatshepsut said:

    I'm a little surprised nobody has come out with something called Jeeves.

    A shady deal between Oracle and Ask, that appears to produce no tangible benefits for either, means a thing that used to be called "Jeeves" is bundled with every Java update.



  • @Lorne Kates said:

    @Hatshepsut said:

    I'm a little surprised nobody has come out with something called Jeeves.
     

    You weren't on The Internet before 2005, were you?

    That piece of shit became the malware herpes that is Ask.com

    Well, I was talking about the Jenkins/Hudson type of application...

