Reinventing pretty much everything



  • Found my first WTF in a system written in C++ that we have:

    #define BOOL int
    #define TRUE 1
    #define FALSE 0

    ... ... ...

    BOOL stringstartswith(const char* str1, const char* str2)
    {
        if (strlen(str2) > strlen(str1))
        {
            return FALSE;
        }

        BOOL equal = TRUE;

        while (*str2 && equal)
        {
            if (*str1 != *str2)
            {
                equal = FALSE;
            }

            str1++;
            str2++;
        }

        return equal;
    }

    Nice to see that they have reinvented the boolean and didn't realise that they could use strstr("xxyyyxx", "xx") == 0



  • @alias said:


    Nice to see that they have reinvented the boolean and didn't realise that they could use strstr("xxyyyxx", "xx") == 0


    The BOOL thingy is obviously a useless WTF.

    If speed matters, they should get rid of those expensive calls to strlen; otherwise, in terms of speed, it's not too bad.

    strstr(x,y)==0 is not a good idea if speed matters.
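
For reference, strstr returns a pointer to the first match, or a null pointer when there is none, so comparing its result with 0 tests for absence rather than for a prefix. A minimal sketch, assuming the goal is a prefix test; the helper name starts_with_strstr is made up for illustration and appears nowhere in the original system:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: prefix test built on strstr.
// strstr returns a pointer to the first occurrence of needle, or a null
// pointer if there is none, so a prefix match means "found at haystack".
bool starts_with_strstr(const char* haystack, const char* needle)
{
    return std::strstr(haystack, needle) == haystack;
}

// A few spot checks of the behaviour described above.
void check_strstr_prefix()
{
    assert(std::strstr("xxyyyxx", "xx") != nullptr);  // found, so "== 0" is false
    assert(std::strstr("xxyyyxx", "zz") == nullptr);  // absent, so "== 0" is true
    assert(starts_with_strstr("xxyyyxx", "xx"));
    assert(!starts_with_strstr("yyxx", "xx"));        // present, but not at the start
}
```

This only pins down the semantics; it says nothing about whether strstr is a sensible tool for the job.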



  • bool beginswith( char * a , char * b)
    {
    	while( *a==*b && *a++)b++;	
    	return !*b;
    }
    


  • #include <cstring>

    bool stringstartswith(const char* str, const char* startswith)
    {
        return 0 == std::strncmp(str, startswith, strlen(startswith));
    }

    If you have a decent compiler/Run Time library, built-in functions are usually
    well optimised and probably faster than anything you can write.

    Michael J



  • A lot depends on how long the startswith string is. The break-even point for the strncmp/strlen version is probably quite a long string in reality. So it really depends on how the application uses the function.



  • The definition of BOOL is just a holdover from C, where it's quite
    common to do that.  Even the ancient and venerable Xlib.h does it.



    I am inclined to think it's just cruft --- very old code that no one
    ever bothered to refactor out of whatever other code is calling
    it.  I'm pretty sure strstr and strncmp were later additions to
    string.h, so at one time it might have made some sense.



    I know of a project that's been using Java 1.4 for years but still uses
    a third-party regex package, because they just didn't feel like
    updating the code that uses regex.



  • @asuffield said:

    @qbolec said:
    bool beginswith( char * a , char * b)
    {
    while( *a==*b && *a++)b++;
    return !*b;
    }

    bool beginswith(const char *str, const char *suffix)
    {
      size_t suffix_len = strlen(suffix);
      return strncmp(str, suffix, suffix_len) == 0;
    }
    

    It's not just easier to read, it's probably faster (most compilers have optimised versions of the string functions - the naive while loop is actually quite slow on modern processors, up to a factor of 10 on early athlons).



    It's easy to create a case where this version is slower (much slower), no matter how well optimized the runtime libs are - choose "a" for str and "bbb(100000 times)..b" for suffix. strlen(suffix) will eat any advantage from the libs. (Though any real-world usage of this function will most likely use a short string for suffix)
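
To make that worst case concrete, here is a sketch that counts how many character positions each approach examines. The helper names and the counting scheme are illustrative assumptions; real library routines read memory in wider chunks, so the numbers model the amount of work, not timings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical instrumented model of strlen(prefix) followed by
// strncmp(str, prefix, len): returns the number of positions examined.
std::size_t reads_strlen_then_compare(const char* str, const char* prefix)
{
    std::size_t reads = 0;
    std::size_t len = 0;
    while (prefix[len] != '\0') { ++len; ++reads; }  // strlen pass over prefix
    ++reads;                                         // the terminating '\0'
    for (std::size_t i = 0; i < len; ++i) {          // comparison pass
        ++reads;
        if (str[i] != prefix[i]) break;              // also stops at the end of str
    }
    return reads;
}

// Hypothetical instrumented model of a single-pass loop.
std::size_t reads_single_pass(const char* str, const char* prefix)
{
    std::size_t reads = 0;
    for (std::size_t i = 0; ; ++i) {
        ++reads;
        if (prefix[i] == '\0' || str[i] != prefix[i]) break;
    }
    return reads;
}

// The case from the post: a one-character string against a huge prefix.
void check_worst_case()
{
    const std::string long_prefix(100000, 'b');
    assert(reads_strlen_then_compare("a", long_prefix.c_str()) == 100002u);
    assert(reads_single_pass("a", long_prefix.c_str()) == 1u);
}
```

With a short prefix the gap shrinks to a handful of positions, which matches the remark that real-world callers are unlikely to notice.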



  • @Michael J said:

    If you have a decent compiler/Run Time library, built-in functions are usually
    well optimised and probably faster than anything you can write.


    I agree that it's best not to re-invent the wheel, and that built-in functions are usually well-optimized.

    However, it is entirely possible to write a function that's as fast as, if not faster than, a library function. For example, qbolec's function is about as optimal as you can get. I doubt it's even possible to get the same result in fewer processor cycles.
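
For readers who find the one-liner hard to follow, here is the same loop spelled out, as a sketch. The name beginswith_expanded and the const qualifiers are additions for illustration and are not part of qbolec's post.

```cpp
#include <cassert>

// Same logic as the terse loop, one step per line.
bool beginswith_expanded(const char* a, const char* b)
{
    // Advance both pointers while the characters match and a has not ended.
    while (*a == *b && *a != '\0') {
        ++a;
        ++b;
    }
    // b is fully consumed only if every one of its characters matched.
    return *b == '\0';
}

// Spot checks covering the edge cases.
void check_beginswith()
{
    assert(beginswith_expanded("xxyyyxx", "xx"));
    assert(beginswith_expanded("xx", "xx"));
    assert(!beginswith_expanded("x", "xx"));      // a is shorter than b
    assert(!beginswith_expanded("yyxx", "xx"));
    assert(beginswith_expanded("anything", ""));  // an empty prefix always matches
}
```

The loop never reads past either terminator: it stops at the first mismatch, and reaching the end of a while b still has characters left counts as a mismatch.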



  • Ok, the forum won't let me edit my post. It's telling me I hit the edit time limit, after only a few minutes. WTF?

    Anyway, here's what I was going to change to:



    @Michael J said:

    If you have a decent compiler/Run Time library, built-in functions are usually
    well optimised and probably faster than anything you can write.


    I agree that it's best not to re-invent the wheel, and that built-in functions are usually well-optimized.

    However,
    it is entirely possible to write a function that's as fast as, if not
    faster than, a library function. For example, qbolec's function is
    about as optimal as you can get.



    @asuffield said:


    It's not just easier to read, it's probably faster (most compilers have
    optimised versions of the string functions - the naive while loop is
    actually quite slow on modern processors, up to a factor of 10 on early
    athlons).




    This is possible, I guess, if the code wasn't pipelined. However, most
    modern compilers will take care of this for you. They'll unroll the
    loop a little in order to try and get more instructions executing at
    once.



    However, this shouldn't change the C++ code that's used to define the
    function. If there's any better way to do this than a simple loop, I
    can't think of it.



  • @eimaj2nz said:



    This is possible, I guess, if the code wasn't pipelined. However, most
    modern compilers will take care of this for you. They'll unroll the
    loop a little in order to try and get more instructions executing at
    once.



    However, this shouldn't change the C++ code that's used to define the
    function. If there's any better way to do this than a simple loop, I
    can't think of it.


    In VS2005, at least, strstr (and a lot of string.h) appears to be written in assembler, which would allow for a bit more tweaking over what the compiler would easily allow.  I don't know if this is the case for the C++ string functions.  Maybe he needed a slower version?



  • @eimaj2nz said:



    (... beginning snipped by MJ)

    @Michael J said:
    If you have a decent compiler/Run Time library, built-in functions are usually
    well optimised and probably faster than anything you can write.

    I agree that it's best not to re-invent the wheel, and that built-in functions are usually well-optimized.

    However,
    it is entirely possible to write a function that's as fast as, if not
    faster than, a library function. For example, qbolec's function is
    about as optimal as you can get.



    (... rest snipped by MJ)




    My assembler programming is a bit rusty, so I can't say for sure.  I've seen
    very fast implementations of std::memcpy() that utilised special op-codes for
    fast copying of data.  This was much faster than a naïve loop implementation.
    Knowing the specialised op-codes and the way the optimiser works can yield
    some amazing speed improvements.

    Whether this can be done for std::strncmp() or not, I don't know.
    In general, I prefer to use system functions rather than roll my own, as
    they've usually been written by smart folks who know the platform
    intimately - but that's just my prejudice.

    As with any advice I give, this is probably worth every penny you paid for it.

    Michael J



  • @TheDauthi said:


    In VS2005, at least, strstr (and a lot of string.h) appears to be written in assembler, which would allow for a bit more tweaking over what the compiler would easily allow.  I don't know if this is the case for the C++ string functions.  Maybe he needed a slower version?


    Even if the assembler version of strstr is much faster than a hand-written version, you should not use it to implement startsWith(). It's stupid to search the whole (probably large) string for the prefix, just to check whether it is found at the beginning.
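
To illustrate the cost, here is a sketch that counts character comparisons. naive_find_comparisons is a made-up stand-in for strstr, which real libraries implement far more cleverly; the only point is that a failed search has to cover the whole string, while a prefix test stops at the first mismatch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for strstr: counts the character comparisons made by
// a naive search for needle anywhere in haystack.
std::size_t naive_find_comparisons(const std::string& haystack, const std::string& needle)
{
    std::size_t comparisons = 0;
    for (std::size_t start = 0; start + needle.size() <= haystack.size(); ++start) {
        std::size_t i = 0;
        while (i < needle.size()) {
            ++comparisons;
            if (haystack[start + i] != needle[i]) break;
            ++i;
        }
        if (i == needle.size()) break;  // match found, stop searching
    }
    return comparisons;
}

// Counts the comparisons made by a prefix-only test.
std::size_t prefix_comparisons(const std::string& haystack, const std::string& needle)
{
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < needle.size() && i < haystack.size(); ++i) {
        ++comparisons;
        if (haystack[i] != needle[i]) break;
    }
    return comparisons;
}

// A large string that does not contain the prefix at all.
void check_scan_cost()
{
    const std::string haystack(100000, 'a');
    assert(naive_find_comparisons(haystack, "b") == 100000u);
    assert(prefix_comparisons(haystack, "b") == 1u);
}
```

When the prefix does match, both approaches do the same work; the difference only shows up on the "no" answers.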



  • @ammoQ said:

    @TheDauthi said:

    In VS2005, at least, strstr (and a lot of string.h) appears to be written in assembler, which would allow for a bit more tweaking over what the compiler would easily allow.  I don't know if this is the case for the C++ string functions.  Maybe he needed a slower version?


    Even if the assembler version of strstr is much faster than a hand-written version, you should not use it to implement startsWith(). It's stupid to search the whole (probably large) string for the prefix, just to check whether it is found at the beginning.

    Ok, agreed, my mistake.  I was thinking in terms of the earlier "However, this shouldn't change the C++ code that's used to define the function," not in terms of solving the problem above.

    Missing-the-forest-for-the-trees syndrome.

    Offhand, since everyone's having a go, I'd probably try the following

    bool startsWith(const char* a, const char* b)
    {
            //which is probably buggy to embarrass myself
    	int i = strspn(a, b);//it is strspn and not strsnp, right?
    	return (b[i] == 0) && (i > 0);
    }

    and when that didn't work, write the original WTF =)

    Edit: Immediately realized that it's not strspn that I was thinking of. Ah, well. I was pretty confident that there was a standard function that did almost exactly this...
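
For reference, strspn(a, b) returns the length of the leading run of a that consists only of characters appearing anywhere in b; it treats b as a set of characters, not as a sequence. A sketch of why that is not a prefix test, with a strncmp-based version for comparison (the function names here are illustrative):

```cpp
#include <cassert>
#include <cstring>

// Prefix test via strncmp, limited to the length of the prefix.
bool starts_with(const char* str, const char* prefix)
{
    return std::strncmp(str, prefix, std::strlen(prefix)) == 0;
}

void check_strspn_is_not_a_prefix_test()
{
    // strspn treats its second argument as a set of allowed characters.
    assert(std::strspn("banana", "ab") == 2u);  // "ba", then 'n' is not in the set
    assert(std::strspn("ba", "ab") == 2u);      // whole string, yet "ba" does not start with "ab"
    assert(std::strspn("abab", "ab") == 4u);    // longer than "ab", so b[i] would be out of bounds

    // The strncmp version gets these cases right.
    assert(!starts_with("ba", "ab"));
    assert(starts_with("abab", "ab"));
    assert(!starts_with("a", "ab"));            // str is shorter than the prefix
}
```

The standard C library has no dedicated prefix function; C++20 later added starts_with member functions to std::string and std::string_view.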



  • @alias said:

    Found my first WTF in a system written in C++ that we have:

    #define BOOL int
    #define TRUE 1
    #define FALSE 0

    ... ... ...



    Maintenance code I presume.
    My good fer nuthing engineering school apparently had Turbo C++ 1988 build (sic) installed on all machines and students had to write the same code to make the compiler 'recognize' bool values.
    Needless to say, some of the students found internships to be pretty exhausting when they had to relearn most of the concepts.
    Some of the students even went to the extent of asking Turbo C++ to be installed on their workstations at the companies instead of using gcc or VC++. Arrh!
    Come to think of it, some of the chaps were just not made for engineering.

