Code re(ab)use



  • I'm in the midst of a huge refactoring of several 'legacy' Java libraries/projects which have grown somewhat organically since around the turn of the century (its fun to say that, even though its only 9-11 years ago).  Note: I've submitted several WTFs from these sources via the submit tool and nothing has ever bubbled up to the front page, but I feel the need to share these.

    I found this today --

    /**
     * Statistics.java (circa 2001)
     * - note: just a snippet to save sanity
     */ 

    public static double average(List values) {

      int i;
      double returnValue;
      double sum;
      Double value;

      sum = 0.0;
      for (i = 0; i < values.size(); ++i) {
        value = (Double) values.get(i);
        sum += value.doubleValue();
      }

      returnValue = sum / values.size();
      return (returnValue);

    }

    Not terrible, no. No autoboxing faux pas. But could certainly be faster (maybe significantly so based on 'n') if it were a simple array of double instead of List<Double> (essentially.)

    So, I use the magical "Find Usages" in NetBeans, to see who uses it, of course.

    Not entirely to my amazement, the only -- repeat ONLY -- other reference to this code resides in a conspicuously named method also inside Statistics.java:

    public static double average(double value1, double value2) {

      List values;

      values = new Vector();
      values.add(new Double(value1));
      values.add(new Double(value2));

      return (Statistics.average(values));

    }

    *HEADDESK*

    (P.S. What's the magic tag to do 'code' blocks? [code][/code] didn't seem to do anything appropriate.)



  • Sounds like it was a function that someone used at some time, and they added an overload to make it easier to average two numbers, rather than needing to instantiate a new list (or array, if you prefer). What's the WTF? That they didn't remove the methods when they were no longer used?



  • @toth said:

    What's the WTF? That they didn't remove the methods when they were no longer used?

    That they go thru the machinations of creating a list, converting two doubles to Doubles, inserting them into the list, then calling the other function which then got each Double out of the list, converted it to double and added it to the running sum, then divided it by the size of the list.

    When, to average two numbers, it could be something as simple as:

    return (value1 + value2) / 2;

    Surely, this is a simpler, clearer implementation over what is there?

     


  • 🚽 Regular

    It's a question of compromising code factoring with optimization, really. If this function were called millions of times in some process or command, then you make a good case; it should be made as simple as possible even if you sacrifice some code reuse for the sake of performance. Otherwise, the reuse that's used here is pedantically valid because there is still a single source of the equation: If all of a sudden the laws of arithmetic changed and averages were determined by using another formula, you have only one place in the codebase to change it.

    However, if the function were private, then you'd have a valid case that, assuming one only uses the second method, the first method is silly because unless you change the functionality of the class you will NEVER call the first function with any more or less than two items in the list. By making the two methods public, however, you have no guarantee who else is using these functions.

    Of course, a lot of the WTFs here show abuses of that, by declaring constants FALSE:bool = false; and ZERO:int = 0;  and one could make the case that calculating an average is something simple enough that it's silly to dictate that there should be only one source of functionality. If this were an equation or algorithm that used more arbitrary business logic that could someday change, then it's perfectly reasonable to have a single point where that algorithm is applied, and have anything else, regardless of how simple it is, use that single point to get the result.



  • @zelmak said:

    What's the magic tag to do 'code' blocks?
     

    Paste normally, then go into the HTML and turn P into PRE.

    Unless you have Chrome.



  • @dhromed said:

    @zelmak said:
    What's the magic tag to do 'code' blocks?

    Paste normally, then go into the HTML and turn P into PRE.


    Unless you have Chrome.

    I don't use Chrome, and this doesn't work for me.

    In fact, I cannot follow your instructions. What do you mean by 'go into
    the HTML'? What else is there? I mean, I know it takes BB code; it
    exposes that on the reply page, for anything the prior poster quoted.

    However, when I look at the above post with code on the reply page, I
    see nothing special that enables the indented text. It's clearly not
    abusing <ul> tags, because it doesn't change the colorizing of the
    text like that does.

    Basically, I'm confused and puzzled.



  • The last button on the toolbar when replying is marked [HTML]. Click that one ahd hey presto.



  • @zelmak said:

    @toth said:

    What's the WTF? That they didn't remove the methods when they were no longer used?

    That they go thru the machinations of creating a list, converting two doubles to Doubles, inserting them into the list, then calling the other function which then got each Double out of the list, converted it to double and added it to the running sum, then divided it by the size of the list.

    When, to average two numbers, it could be something as simple as:

    return (value1 + value2) / 2;

    Surely, this is a simpler, clearer implementation over what is there?

     

    I'll give you boxing the doubles, but that's a minor WTF at best (worst?). The way I would guess that this came about is that the author wrote the general average function (perhaps in a case of premature generalization, perhaps because something was actually using it to average many numbers), but found himself averaging just two numbers frequently enough that it became reasonable to encapsulate that in an overload. Perhaps you (like my boss) have some irrational aversion to convenience overloads, but I think it's a perfectly reasonable thing to do, assuming my guess at its origin is correct. Yes, return (value1 + value2) / 2 is a much cleaner implementation--assuming you never want to average more than two numbers. If you want to average an arbitrary number of numbers, but often just average two, then the implementation you showed (modulo some cleaning up of the boxing/unboxing stuff) is perfectly reasonable.




    Of course, I would prefer a variable length parameter list, if you're using a version of Java that supports it.



  • @toth said:

    Yes, return (value1 + value2) / 2 is a much cleaner implementation--assuming you never want to average more than two numbers. If you want to average an arbitrary number of numbers, but often just average two, then the implementation you showed (modulo some cleaning up of the boxing/unboxing stuff) is perfectly reasonable.

    I think the point is that the convenience overload should use the cleaner implementation. There's no particular reason why it needs to call the other method or use its implementation.



  • @Someone You Know said:

    @toth said:


    Yes, return (value1 + value2) / 2 is a much cleaner implementation--assuming you never want to average more than two numbers. If you want to average an arbitrary number of numbers, but often just average two, then the implementation you showed (modulo some cleaning up of the boxing/unboxing stuff) is perfectly reasonable.

    I think the point is that the convenience overload should use the cleaner implementation. There's no particular reason why it needs to call the other method or use its implementation.

    I can see where you're coming from, and in such a simple case as finding the mean of some numbers, you may be right. However, I'm always extremely wary of duplicate code--no matter how simple. Sure, the formula for finding the mean is probably not going to change unless we somehow slip into an alternate reality. But what if the business requirements change? What if instead of a simple mean average, you have to implement some sort of weighted average? Or you're going to start returning the median instead?* Rather than having one piece of code change, you have two. Not a huge deal, perhaps, but it's worse than having one.





    • Okay, it's extremely unlikely that you'll be wanting to find the median of two numbers, and if you do, it happens that it's the same as the mean, but my general point stands.

  • 🚽 Regular

    A weighted average or other business logic would not belong in a place like Statistics.average(), assuming Statistics is a general class. If you needed special business rules for a project, it should be defined in a more specific implementation, such as InitechApplicationStatistics.average() (or even InitechApplicationStatistics.weightedAverage()) or TPSReportStatistics.average(). If Statistics is a general, more abstract class, then Statistics.average() should only perform what's expected of it in a generalized setting. Otherwise, if there's someone else in the IT department who is developing a completely different application which uses different business rules that expect a more traditional average algorithm, Statistics.average() will work, and yet still if the method to get an average is different than Statistics AND InitechApplicationStatistics.average(), they can create their own InitrodeApplicationStatistics.average() and everything will still be right in the world.



  • @RHuckster said:

    A weighted average or other business logic would not belong in a place like Statistics.average(), assuming Statistics is a general class. If you needed special business rules for a project, it should be defined in a more specific implementation, such as InitechApplicationStatistics.average() (or even InitechApplicationStatistics.weightedAverage()) or TPSReportStatistics.average(). If Statistics is a general, more abstract class, then Statistics.average() should only perform what's expected of it in a generalized setting. Otherwise, if there's someone else in the IT department who is developing a completely different application which uses different business rules that expect a more traditional average algorithm, Statistics.average() will work, and yet still if the method to get an average is different than Statistics AND InitechApplicationStatistics.average(), they can create their own InitrodeApplicationStatistics.average() and everything will still be right in the world.

    Fair enough. Mine was a very bad example, but I still tend to favor code reuse, even in simple cases like this.



  • @Someone You Know said:

    I think the point is that the convenience overload should use the cleaner implementation. There's no particular reason why it needs to call the other method or use its implementation.

    This ^^ ... Calling the 'generalized list' version was complete and utter overkill for averaging two doubles. I thought that was clear ... I guess not.

    @toth said:

    Fair enough. Mine was a very bad example, but I still tend to favor code reuse, even in simple cases like this. 

    								    </p><p></blockquote></p><p>Oh, believe me, generally, I'm with you on this. In this same exact app, there's copypasta and tons of needless duplication.</p><p>I just thought in this case, the reuse was more abusive/overhead to the poor schmuck who had to implement the two number version packed on top of a List version.<br></p>

Log in to reply