Ternary operators nightmare



  • OK, you have 30 seconds to tell me what this Java method does:



    public static String sourceCharacter(char s) {
        return s >= '?' || s < '"' ? null : s >= '(' ? s < '<' ? null
               : s != '<' ? s != '>' ? null : "&#62;" : "&#60;"
               : s != '"' ? s != '&' ? s != '\'' ? null : "&#39;" : "&#38;"
               : "&#34;";
    }

    And? Did you make it?



  • @technites said:

    ...so it's an HTML character reference / NullPointerException generator?




    Yes, kind of. Actually it's not HTML; it "is a syntactic framework,
    like S-expressions or XML". (Google the quoted text and you'll find it.)
    There are more WTFs to find in this guy's code. Sadly, he hasn't
    officially released it yet, otherwise I would send it to Alex.

    BTW: this sourceCharacter is one of the more intelligent parts of his code; it is merely unreadable.

    cu



  • Are ternaries faster to compile or run than 'real' if/else?



  • Oh, the many ways to do the same thing simpler and more readably...


    1. As previously posted, but with a twist:

      switch (s) {
      case '>':
      case '<':
      case '\'':
      case '&':
      case '"':
          return "&#" + (int)s + ";";
      default:
          return null;
      }




    2. A variation on the theme, since some hate the switch statement:

      if ((s == '>')  ||
          (s == '<')  ||
          (s == '\'') ||
          (s == '&')  ||
          (s == '"'))
      {
          return "&#" + (int)s + ";";
      }
      else
      {
          return null;
      }


    3. Maybe just a simpler ternary chain:

      return s == '>'  ? "&#62;" :
             s == '<'  ? "&#60;" :
             s == '\'' ? "&#39;" :
             s == '&'  ? "&#38;" :
             s == '"'  ? "&#34;" :
             null;



    4. Or an even simpler ternary operation that avoids the literals:

      return s == '>' || s == '<' || s == '\'' || s == '&' || s == '"' ? "&#" + (int)s + ";" : null;



    5. Or, to avoid the NullPointerException that is sure to follow:

      return s == '>' || s == '<' || s == '\'' || s == '&' || s == '"' ?
              "&#" + (int)s + ";" : Character.toString(s);

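    One way to check that rewrites like these really behave like the original is to compare them over every possible char. What follows is only a sketch: the class name EscapeCheck and the method names original, viaSwitch and countMismatches are made up for the example, and only option 1 is compared (option 5 deliberately returns something other than null, so it would not match).

```java
// Sketch: brute-force comparison of the original nested ternary with the
// switch rewrite (option 1). Class and method names are hypothetical.
class EscapeCheck {

    // The original method, reformatted but otherwise unchanged.
    static String original(char s) {
        return s >= '?' || s < '"' ? null : s >= '(' ? s < '<' ? null
               : s != '<' ? s != '>' ? null : "&#62;" : "&#60;"
               : s != '"' ? s != '&' ? s != '\'' ? null : "&#39;" : "&#38;"
               : "&#34;";
    }

    // Option 1 from the list: a switch that builds the reference from the code point.
    static String viaSwitch(char s) {
        switch (s) {
            case '>':
            case '<':
            case '\'':
            case '&':
            case '"':
                return "&#" + (int) s + ";";
            default:
                return null;
        }
    }

    // Counts the char values on which the two versions disagree.
    static int countMismatches() {
        int mismatches = 0;
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            String a = original((char) c);
            String b = viaSwitch((char) c);
            if (!java.util.Objects.equals(a, b)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

    The loop variable is an int on purpose: a char counter would wrap around at 65535 and never terminate.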






  • @dhromed said:

    Are ternaries faster to compile or run than 'real' if/else?




    I don't know about compile time, and at run time the difference is minimal.
    However, ternaries in Java tend to produce longer code paths, so they
    should be slower.

    Actually, the code presented is obviously a manually optimized switch
    statement. The Java compiler would create a tableswitch bytecode that
    would need an average of over two comparisons per call, while the
    optimized version here is close to one comparison per call. But: a) it
    is unreadable, b) it prevents JIT and HotSpot from further optimizing
    the tableswitch when compiling to native code, and last but not least
    c) the performance improvement is barely noticeable and certainly not
    worth the effort.
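
    For anyone who wants to check the bytecode claim rather than take it on faith, here is a minimal sketch. The class name SwitchDemo is made up. Compile it and run javap -c on the class to see which switch instruction your compiler actually emits for these five cases; the JVM has two (tableswitch and lookupswitch), and which one you get depends on how densely the case values are packed.

```java
// Sketch for inspecting the generated bytecode. SwitchDemo is a hypothetical name.
// Compile with:  javac SwitchDemo.java
// Inspect with:  javap -c SwitchDemo
class SwitchDemo {

    // The plain switch version of sourceCharacter, with one case per character.
    static String sourceCharacter(char s) {
        switch (s) {
            case '"':  return "&#34;";
            case '&':  return "&#38;";
            case '\'': return "&#39;";
            case '<':  return "&#60;";
            case '>':  return "&#62;";
            default:   return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(sourceCharacter('<'));
        System.out.println(sourceCharacter('a'));
    }
}
```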



    cu



  • @endergt said:



    return "&#" + (int)s + ";";





    Actually this string expression is very inefficient for the purpose at hand.



    If I wanted a) best performance and b) still quite readable code, I would do this:



    class ... {
        static final int MAX_INDEX = '>'+1;
        static final String[] sc = new String[MAX_INDEX];

        static {
            sc['>'] = "&#62";
            sc['<'] = "&#60";
            sc['\''] = "&#39";
            sc['&'] = "&#38";
            sc['"'] = "&#34";
        }

        String sourceCharacter(char s) {
            if (s > MAX_INDEX) {
                return null;
            } else {
                return sc[s];
            }
        }
    }



    cu



  • @eagle said:

    @dhromed said:
    Are ternaries faster to compile or run than 'real' if/else?




    I don't know about compile time, and at run time the difference is minimal.
    However, ternaries in Java tend to produce longer code paths, so they
    should be slower.

    Actually, the code presented is obviously a manually optimized switch
    statement. The Java compiler would create a tableswitch bytecode that
    would need an average of over two comparisons per call, while the
    optimized version here is close to one comparison per call. But: a) it
    is unreadable, b) it prevents JIT and HotSpot from further optimizing
    the tableswitch when compiling to native code, and last but not least
    c) the performance improvement is barely noticeable and certainly not
    worth the effort.



    cu




    I have no experience with compilers at all, but I'd sort of expect a
    ternary to let a compiler optimize compilation, since you can always
    be sure that there is only one expression after the ? and one
    expression after the :



    Attempting to fit in more lines using {} would (in ECMAScript anyway)
    create an unassigned instance of Object() containing obviously invalid
    syntax. :)



    Wouldn't really save seconds or minutes, I imagine.



    But in the end it's always an if, and an if runs as fast as an if.




  • The ternary operator brings up one of the differences between
    functional languages and procedural/non-functional OO languages.
    Functional languages (I'm thinking here of Lisp, ML, and, as a side
    note, Ruby) have only one type of conditional, which executes the
    appropriate branch of code and returns its value. In C-style languages,
    there is if/else for the branching and ?: for the value. But in a
    functional language, the program would usually be written so that the
    return value is the important result, rather than the side effects of
    the branch.


    Side note: for an example Ruby function 'testvalue', using the
    functional paradigm you would write it like this, because of the chain
    of return values:



    def testvalue(val)
      if val > 10
        true
      else
        false
      end
    end



    And, this is actually faster than the same function written in a procedural mindset:



    def testvalue(val)
      if val > 10
        return true
      else
        return false
      end
    end



    It seems like they ought to be optimized to perform exactly the same, but maybe it will happen in a future version.
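
    For comparison, here is roughly how the same split looks in Java, where if/else is a statement and ?: is an expression. This is only an illustration; the class name Conditionals and the three method names are invented for the example.

```java
// Sketch: the same test written as a statement, as an expression, and directly.
// All names here are hypothetical.
class Conditionals {

    // if/else is a statement: each branch has to return explicitly.
    static boolean testValueIf(int val) {
        if (val > 10) {
            return true;
        } else {
            return false;
        }
    }

    // ?: is an expression: the conditional itself yields the value.
    static boolean testValueTernary(int val) {
        return val > 10 ? true : false;
    }

    // The comparison is already a boolean, so neither form is strictly needed.
    static boolean testValue(int val) {
        return val > 10;
    }
}
```

    All three methods return the same result for every input; they differ only in whether the branch or the value carries the meaning.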



  • @eagle said:

    @endergt said:


    return "&#" + (int)s + ";";





    Actually this string expression is very inefficient for the purpose at hand.



    If I wanted a) best performance and b) still quite readable code, I would do this:



    class ... {
        static final int MAX_INDEX = '>'+1;
        static final String[] sc = new String[MAX_INDEX];

        static {
            sc['>'] = "&#62";
            sc['<'] = "&#60";
            sc['\''] = "&#39";
            sc['&'] = "&#38";
            sc['"'] = "&#34";
        }

        String sourceCharacter(char s) {
            if (s > MAX_INDEX) {
                return null;
            } else {
                return sc[s];
            }
        }
    }



    cu


    ???

    This is a really nice approach, readable and performant, and it will work well after you debug it.  MAX_INDEX is misnamed; it's actually the length of the array, not the maximum index.  So s > MAX_INDEX is the wrong condition; it should either be s >= MAX_INDEX, or just s >= sc.length, which would have avoided the issue (the array always knows its length; your variable may or may not be right, and in this case it's wrong).

    Also, you left the semicolons off of the HTML character entities.  With these things fixed, the code is (assuming I can figure out how to post code on this forum software):

    class ... {
        static final int MAX_INDEX = (int)'>';
        static final String[] sc = new String[MAX_INDEX+1];

        static {
            sc['>'] = "&#62;";
            sc['<'] = "&#60;";
            sc['\''] = "&#39;";
            sc['&'] = "&#38;";
            sc['"'] = "&#34;";
        }

        String sourceCharacter(char s) {
            if (s >= sc.length) {
                return null;
            } else {
                return sc[s];
            }
        }
    }




  • @DrCode said:



    ???

    ... (assuming I can figure out how to post code on this forum software):



    Apparently, I can't figure out how to post code on this forum software.  Hopefully you can work out what I was trying to show.

    Also, I should explain those three ??? in my post.  '?' is the character the code with the off-by-one error would choke on.  (Try it.)
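
    For anyone who would rather not try it by hand, the off-by-one can be reproduced with a sketch like the one below. The names OffByOneDemo, buggy and fixed are made up for the example; sc and MAX_INDEX are taken from the posted code, and the missing semicolons in the strings have been filled in since they are not the point here.

```java
// Sketch reproducing the off-by-one in the lookup-table version.
// OffByOneDemo, buggy and fixed are hypothetical names.
class OffByOneDemo {

    // As posted: '>' is 62, so MAX_INDEX is 63, which is the array length
    // rather than the largest valid index.
    static final int MAX_INDEX = '>' + 1;
    static final String[] sc = new String[MAX_INDEX];

    static {
        sc['>'] = "&#62;";
        sc['<'] = "&#60;";
        sc['\''] = "&#39;";
        sc['&'] = "&#38;";
        sc['"'] = "&#34;";
    }

    // Bounds check as posted: s == MAX_INDEX ('?', code 63) slips through
    // and indexes one past the end of the array.
    static String buggy(char s) {
        if (s > MAX_INDEX) {
            return null;
        } else {
            return sc[s];
        }
    }

    // Corrected check against the array's own length.
    static String fixed(char s) {
        if (s >= sc.length) {
            return null;
        } else {
            return sc[s];
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed('?') = " + fixed('?'));
        try {
            buggy('?');
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("buggy('?') threw " + e.getClass().getSimpleName());
        }
    }
}
```

    Every character other than '?' gives the same result in both versions, which is why the bug is easy to miss.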



  • @DrCode said:

    This is a really nice approach, readable and
    performant, and it will work well after you debug it.  MAX_INDEX
    is misnamed; it's actually the length of the array, not the maximum
    index.  So s > MAX_INDEX is the wrong condition; it should
    either be s >= MAX_INDEX, or just s >= sc.length, which would
    have avoided the issue (the array always knows its length; your
    variable may or may not be right, and in this case it's wrong).




    Well, let's not argue about the naming of MAX_INDEX. It's not the ideal
    name, but what do you expect when writing sample code in a web form
    editor within a few seconds? The same goes for the off-by-one error, where
    you are certainly correct that it should have been s >= MAX_INDEX.
    However, using MAX_INDEX instead of sc.length was done intentionally to
    achieve best performance (see point a) above). As for the missing
    semicolons, those are cut-and-paste-replicated typos I overlooked.
    Anyway, thanks for debugging my code; it seems at least one person took
    the time to read my post.



    cu



  • @eagle said:

    OK, you have 30 seconds to tell me what this Java method does:



    public static String sourceCharacter(char s) {
        return s >= '?' || s < '"' ? null : s >= '(' ? s < '<' ? null
               : s != '<' ? s != '>' ? null : "&#62;" : "&#60;"
               : s != '"' ? s != '&' ? s != '\'' ? null : "&#39;" : "&#38;"
               : "&#34;";
    }

    And? Did you make it?




    And the answer is (took me 5 seconds): Confuse the reader.



    Now that was simple.


