A shortcut for resetting member variables, except not.



  • Just stumbled over something ugly in our code base:

    void SomeClass::reset() {
      memset((char*)this, 0x00, sizeof(SomeClass));
      timestamp = 0.0; datestamp = 0;
      // some more initialization
    }
    

    This method is called by the constructor and in at least 2 other places in the code. If it wasn't followed by initialization I'd have just chalked it up to someone being too lazy to set member variables who didn't know that memset is not a good way to do this with classes. But I'm completely stumped as to what the thought process was here.

    ETA: This is C++ by the way.


  • Discourse touched me in a no-no place

    @witchdoctor said:

    void SomeClass::reset() {
    memset((char*)this, 0x00, sizeof(SomeClass));
    timestamp = 0.0; datestamp = 0;
    // some more initialization
    }
    I presume there are no virtual members in that class? Nor will there ever be.



  • I can tell one important thing about SomeClass. There is no virtual inheritance involved anywhere in it, because that requires invisible pointers (or offsets, meh) inside the virtual parents so that they can find the shared virtual base.

    And the other thing is that there aren't any virtual functions, because the memset would destroy the vtable pointer.



  • @PJH said:

    I presume there are no virtual members in that class? Nor will there ever be.
    I thought that didn't matter. It will just overwrite all data members with 0, which works in most cases. I'd say it is some kind of lazy anticipation that the class will grow, and the other line was added by someone else.



  •  @PJH said:

    @witchdoctor said:

    void SomeClass::reset() {
      memset((char*)this, 0x00, sizeof(SomeClass));
      timestamp = 0.0; datestamp = 0;
      // some more initialization
    }
    

    I presume there are no virtual members in that class? Nor will there ever be.

     

    Fortunately there are no virtual members in the class. And since it looks like there won't be any in the future, or at least I will not add any without changing this. The other problem I see is that if someone decides to inherit from this class at some point this will likely break in a weird way

     


  • Discourse touched me in a no-no place

    @TGV said:

    @PJH said:
    I presume there are no virtual members in that class? Nor will there ever be.
    I thought that didn't matter. It will just overwrite all data members with 0, ...
    It will also overwrite the vtable (if present) with 0's breaking the virtual functions.

    which works in most cases.
    Only for integer types. It's not guaranteed to work for floating point or pointers for example (not that it matters given they've already invoked undefined behaviour by using memset() in such a fashion.)


  •  @TGV said:

    @PJH said:

    I presume there are no virtual members in that class? Nor will there ever be.
    I thought that didn't matter. It will just overwrite all data members with 0, which works in most cases. I'd say it is some kind of lazy anticipation that the class will grow, and the other line was added by someone else.

     

    Nope, just ran blame on it. The whole method was added to the repository by the same person in the same commit.

     



  • As a bonus, it will cause memory leaks if any of the members were pointers to anything that was new'd at any point during the lifetime of the class.



  • @WhiskeyJack said:

    As a bonus, it will cause memory leaks if any of the members were pointers to anything that was new'd at any point during the lifetime of the class.

    Thank Go... - erm, Stroustrup - for unique_ptr's.

    Oh... Wait... ;-)

    (Yes, yes, we're not to thank Stroustrup for that, for the pedantic)



  • That's pretty scary.  sizeof(SomeClass) may not return the expected value; meaning you could wipe out more memory or less memory that you were expecting.  C++ does not guarantee that the data for the object begins at the address of the object; the actual location of the data for the members is implementation dependent.  As others have pointed out, the vtable is not really data; and blindly wiping out things that are pointers is a bad thing. 

    You're also not giving the base class a chance to clean up; if the base class implements a reset() method you should be able to call that, and have your reset method just deal with the members in your class.

     A class should have very few members (i.e., less than 10) so you should be able to replace the memset with code that actually resets the individual members.  If the class has so many members that it is unclear what a reset() method should do to them, or creating a reset method that sets the members values individually is just too big a job, then you need to refactor the class.

    So witchdoctor, how hard would it be to provide a correct implementation of reset()?  Or is the size of this class another WTF?



  • @DrPepper said:

    That's pretty scary.  sizeof(SomeClass) may not return the expected value; meaning you could wipe out more memory or less memory that you were expecting.  C++ does not guarantee that the data for the object begins at the address of the object; the actual location of the data for the members is implementation dependent.  As others have pointed out, the vtable is not really data; and blindly wiping out things that are pointers is a bad thing. 

    You're also not giving the base class a chance to clean up; if the base class implements a reset() method you should be able to call that, and have your reset method just deal with the members in your class.

     A class should have very few members (i.e., less than 10) so you should be able to replace the memset with code that actually resets the individual members.  If the class has so many members that it is unclear what a reset() method should do to them, or creating a reset method that sets the members values individually is just too big a job, then you need to refactor the class.

    So witchdoctor, how hard would it be to provide a correct implementation of reset()?  Or is the size of this class another WTF?

     

     

    Yep, the size of this class is another WTF: I count 32 members, about 10 of them fixed-length arrays.

    Another WTF: I don't have commit rights on this because changes always go through the maintainer of each library via git pull. There are no unit tests and the maintainer is the only person who knows all the requirements so refactoring would be judged too risky at this point in time since we're approaching a release date. Fortunately the undefined behavior doesn't cause problems at the moment since this class has no base class and is not inherited by anything.

    By the way, this is software that is for use by a military among others, can't go into details beyond that it's at least not for a weapons system.

     



  • The code is well-defined behaviour if the class is POD:

    1. The class has no virtual methods and does not derive from anything with virtual methods

    2. All its data members are POD.

    If the class contains something like a std::string or a std::vector, your operation is invalid. (Nothing should happen on the call itself but subsequently you may experience undefined behaviour due to the modification you made to its members).

    It is still bad practice to do this. memset to 0 is fine for an array of integers or plain pointers. Otherwise just don't do it.

     

     



  • @Cbuttius said:

    It is still bad practice to do this. memset to 0 is fine for an array of integers or plain pointers. Otherwise just don't do it.

    Oh no, memset to 0, is not ok for pointers. It may work depending on your compiler and debug settings but it is not something you want to depend on. Or to be more exact: Using memset to set a pointer to all (int)0 will not always give you a null pointer.


  • Discourse touched me in a no-no place

    @Cbuttius said:

    The code is well-defined behaviour if the class is POD:

    1. The class has no virtual methods and does not derive from anything with virtual methods

    2. All its data members are POD.

    Using memset() to 'zero' floats or pointers is not well-defined.


  • @PJH said:

    Using memset() to 'zero' floats... is not well-defined.

    Really? Is this a platform specific thing? IEEE floats represent 0 as all 0 bits, don't they? When is memsetting it to 0 not OK? (Or do you mean in the case of non-IEEE compliant floats?)



  • @dcardani said:

    @PJH said:
    Using memset() to 'zero' floats... is not well-defined.

    Really? Is this a platform specific thing? IEEE floats represent 0 as all 0 bits, don't they? When is memsetting it to 0 not OK? (Or do you mean in the case of non-IEEE compliant floats?)

     

    Welcome to the weird world of C/C++ portability.  The language definitions refuse to define anything about certain behaviours in these languages.  Noteworthy among these undefined behaviours are:

    • a[i++] = i;  // not allowed because i is read in a way that is unrelated to its modification without an intervening sequence point.
    • memsetting a pointer to all-zeros (in particular, the standards do not impose a requirement that such a pointer is NULL, see e.g. AS/400)
    • use of uninitialised variables of any sort
    • dereferencing NULL - this is not guaranteed to crash, partly because some environments lack memory protection, and partly because that protection might not be enforced against reading from NULL e.g. AIX
    • memsetting floating point variables to all-zeros - in some floating point formats, the all-zeros pattern is not zero.  I haven't seen one of these formats - even the wonky hexadecimal float format used by System/360 used all-zeros to represent zero.
    • Calling free() with a non-NULL address that wasn't obtained directly from malloc/calloc/realloc.  strdup calls one or other of them, so it is OK.  Calling free() with an address that is already free is included in this undefined behaviour.
    • Calling operator delete with a non-NULL address that was not obtained by calling operator new.  The same applies to operator delete[] and operator new[].
    • Doing things with structs and classes that would be safe in pure C structs like memsetting them to zero when they have neither pointers nor floating point members, if they contain C++ features like member function, especially virtual ones, virtual base classes, etc.
    Of note: if the DeathStation 9000 were standard equipment in software houses, there would be a lot more nuclear craters, and a lot less undefined behaviour in shipping code.


  • Discourse touched me in a no-no place

    @dcardani said:

    @PJH said:
    Using memset() to 'zero' floats... is not well-defined.

    Really? Is this a platform specific thing? IEEE floats represent 0 as all 0 bits, don't they?

    Neither the C Standard or the C++ Standard mandate that IEEE 754 should be used to represent floating point numbers.



  • @Steve The Cynic said:

    • memsetting a pointer to all-zeros (in particular, the standards do not impose a requirement that such a pointer is NULL, see e.g. AS/400)

    I remember someone on this forum running into a problem with Nvidia's CUDA related to this, in which valid pointers could point to 0.


  • Discourse touched me in a no-no place

    @Husky said:

    @Steve The Cynic said:

    • memsetting a pointer to all-zeros (in particular, the standards do not impose a requirement that such a pointer is NULL, see e.g. AS/400)
    I remember someone on this forum running into a problem with Nvidia's CUDA related to this, in which valid pointers could point to 0.
    That's going to need a citation if it was C or C++. Link please.


  • @PJH said:

    @Husky said:
    @Steve The Cynic said:

    • memsetting a pointer to all-zeros (in particular, the standards do not impose a requirement that such a pointer is NULL, see e.g. AS/400)
    I remember someone on this forum running into a problem with Nvidia's CUDA related to this, in which valid pointers could point to 0.
    That's going to need a citation if it was C or C++. Link please.

    That was me as well, here's a link to the thread: http://forums.thedailywtf.com/forums/p/25369/274017.aspx


  • BINNED

    @Cbuttius said:

    The code is well-defined behaviour if the class is POD:

    1. The class has no virtual methods and does not derive from anything with virtual methods

    2. All its data members are POD.

    Is there some C++11 template magic to check if a class is POD?

    IIRC, Qt's QVector<T> can be optimized with some macro (don't remember the name) to tell it that T is a POD or memcpy-able. This would better be auto-detected or at least come with a static assert.

     



  • One may memset a pointer to zeros on a platform where this is a null pointer, and it is possible for platform-dependent isPOD() checks for pointers. Similarly with floats and doubles.

    GNU actually implements something like this in vector. When you create a C++ vector and resize it (rather than reserve) it must set all its members to the default state. This is done through a series of static checks to see if it can shortcut with memset. The headers it looks into include various platform-specific ones.

    It is suggested that if you do decide you want to memset pointers to 0, you do not do this generically all over your code, but ensure it goes through a platform-specific header, as you should do with anything non-portable.

    However in places I have worked I have been criticised in the past for doing anything like this with "wasting time / over-engineering, we're never going to run this on any platform other than Windows".

    There are loads of potential portability issues that can make certain things "undefined" in the standard but supported on specific platforms, and it is really left to the developers to weigh up how to handle such a situation. Assuming 8 bits per byte for example means you know that there are exactly 256 possible byte values. If you use unsigned char as an index to a static array of 256 elements, you therefore will always match an exact value without getting access violation.

    I don't know if there is any POD-traits in C++11, but there has been in boost for some time.

     



  • Jesus people. Wrong code is wrong code. What does it matter if there's some hypothetical platform in some hypothetical universe in which maybe the code kind of works a little bit even though it's wrong. That doesn't make the code any less wrong.



  • @blakeyrat said:

    Jesus people. Wrong code is wrong code. What does it matter if there's some hypothetical platform in some hypothetical universe in which maybe the code kind of works a little bit even though it's wrong. That doesn't make the code any less wrong.

    Actually, more commonly in the C and C++ world it's a hypothetical platform that does not conform that makes the standard "open".

    e.g. platforms that don't have 8 bits per byte.

    The POSIX library contains a "non-standard" cast: dlsym returns void * and you can cast it to a function pointer. That is non-standard. But dlsym isn't a C++ standard function, it's a function supported on POSIX systems for which the cast must work.

    If Windows and UNIX platforms allow memset on pointers to 0 to set them to null, then ideally write a "wrapper" function to set all your pointers to null that, on those platforms, will use memset, just like nothl() can be written as a no-op on some platforms and not on others.

    Remember, C an C++ are supposed to be "close to hardware" languages, particularly C.

     


  • Considered Harmful

    So, would this approach be OK if you were using the Pimpl idiom, passing the implementation struct pointer instead of this?



  • I'm still voting never ok. Never.


  • Considered Harmful

    @witchdoctor said:

    0x00

    TRWTF



  • @joe.edwards said:

    @witchdoctor said:
    0x00

    TRWTF

    If they only used 0x0, it would have only set half a byte at a time, which would be inefficient.



  • The actual danger of memsetting a pointer to all zero bytes would be if:

    1. You then compare the pointer to NULL and it returns false, or you perform a boolean check on it

     if( ptr ) { ... }

    2. if you called free() (or delete) on it, which is guaranteed to be no-op on a NULL pointer but that didn't work on your pointer.

    Dereferencing it or using it for pointer arithmetic is an invalid operation regardless.

     



  • @PJH said:

    Using memset() to 'zero' floats or pointers is not well-defined.
    Actually, I think in C/C++ it is not even defined for int.

     Because C/C++ hates to define something if it does not absolutely, totally needs to be.

     

    Of course any processor I know represents the value integer 0 with all bytes/bits set to 0.

    But there surely is a weird beast out there that does not.

     


  • Discourse touched me in a no-no place

    @Musaran said:

    @PJH said:
    Using memset() to 'zero' floats or pointers is not well-defined.
    Actually, I think in C/C++ it is not even defined for int.
    You think wrong; it is very well defined.





    Since you clearly can't be arsed to read The Standard, and presume to know what's in there without actually doing so, allow me to quote from it for you:
    @C99 draft said:
    6.2.6.2 Integer types



    #1



    For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2N-1, so that objects of that type shall be capable of representing values from 0 to 2N-1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.
    It then goes on to describe (in not so many words) two's complement, one's complement and sign-and-magnitude for signed integers.



  • @PJH said:

    @Musaran said:
    @PJH said:
    Using memset() to 'zero' floats or pointers is not well-defined.
    Actually, I think in C/C++ it is not even defined for int.
    You think wrong; it is very well defined.





    Since you clearly can't be arsed to read The Standard, and presume to know what's in there without actually doing so, allow me to quote from it for you:
    @C99 draft said:
    6.2.6.2 Integer types



    #1



    For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2N-1, so that objects of that type shall be capable of representing values from 0 to 2N-1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.
    It then goes on to describe (in not so many words) two's complement, one's complement and sign-and-magnitude for signed integers.

    I'm not saying you're not right, and I thought you were right without looking at the standard, but your quote actually makes me more hesitant:
    "The values of any padding bits are unspecified."

    What does "unspecified" mean in this context? Does it mean that the value may not matter, or does it mean the C++ standard doesn't mandate what it should be? In the latter case, any implementation may require all padding bits to be "1" for it to be a valid integer, in which case you couldn't memset.
    Also, he was talking about "int", not "unsigned int", which your quote refers to.


  • Discourse touched me in a no-no place

    @Evo said:

    "The values of any padding bits are unspecified."

    What does "unspecified" mean in this context? Does it mean that the value may not matter, or does it mean the C++ standard doesn't mandate what it should be? In the latter case, any implementation may require all padding bits to be "1" for it to be a valid integer, in which case you couldn't memset.

    Probably should have quoted more. The (omitted) footnote to the bit you quote read:

    @C99 said:

    Some combinations of padding bits might generate trap representations, for example, if one padding
    bit is a parity bit. Regardless, no arithmetic operation on valid values can generate a trap
    representation other than as part of an exception such as an overflow, and this cannot occur with
    unsigned types. All other combinations of padding bits are alternative object representations of the
    value specified by the value bits.

    @Evo said:

    Also, he was talking about "int", not "unsigned int", which your quote refers to.
    The only difference between the two is the presence of a sign bit, and how that bit should be interpreted:
    @C99 said:
    #2



    For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M<=N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, then the value shall be modified in one of the following ways:



    -- the corresponding value with sign bit 0 is negated;



    -- the sign bit has the value -2N;



    -- the sign bit has the value 1-2N.



  • @PJH said:

    You think wrong; it is very well defined.
    That leaves me wondering : Why is it so precisely defined ?

     

    C is used at pretty low level, close to hardware that might use other implementations.

     I can think of Gray code that does not use "increasing powers of 2 by position".

    Or some form of BCD where value 0 does not use "all bits to 0". 


  • Discourse touched me in a no-no place

    @Musaran said:

    @PJH said:

    You think wrong; it is very well defined.
    That leaves me wondering : Why is it so precisely defined ?

     

    C is used at pretty low level, close to hardware that might use other implementations.

    No idea. However most of the stuff that is proscribed by the Standard tends to be stuff that was common amongst most C compilers around the time the Standard was being written.

     I can think of Gray code that does not use "increasing powers of 2 by position".

    Or some form of BCD where value 0 does not use "all bits to 0". 

    Both your alternative examples of storing integers aren't the easiest/'best' (for whatever definition of 'best') way of doing so; Gray typically requires lookup tables e.g., BCD wastes bits. And neither is how a typical implementation of a processor will store them.

    There is, of course, nothing stopping you implementing integers in your DS9K in such a fashion, but you will also have to implement translation functions on some boundary akin to the current '0 in a pointer context' where 0 doesn't actually mean zero, and where (for example) the code snippet var & 0x10 must always return false for values between 0 and 8 and true for values between 8 and 15 regardless of what's in bit 4 in memory.

  • Considered Harmful

    @PJH said:

    No idea. However most of the stuff that is proscribed by the Standard tends to be stuff that was common amongst most C compilers around the time the Standard was being written.

    Did you mean prescribed?



  • @PJH said:

    Probably should have quoted more. The (omitted) footnote to the bit you quote read:

    @C99 said:

    Some combinations of padding bits might generate trap representations, for example, if one padding
    bit is a parity bit. Regardless, no arithmetic operation on valid values can generate a trap
    representation other than as part of an exception such as an overflow, and this cannot occur with
    unsigned types. All other combinations of padding bits are alternative object representations of the
    value specified by the value bits.

    Actually, imagine that one bit is an odd parity bit. Then the parity bit should be "1" if the integer itself is 0. If this is not the case, this quote seems to allow a trap. Note that is says "no arithmetic operation on valid values can generate a trap representation", but a representation with all 0's would not be a valid value. Or am I misinterpreting the quote here?


  • Discourse touched me in a no-no place

    @joe.edwards said:

    @PJH said:
    No idea. However most of the stuff that is proscribed by the Standard tends to be stuff that was common amongst most C compilers around the time the Standard was being written.

    Did you mean prescribed?

    The letters are right next to each other. That's my excuse, and I'm sticking to it.

  • Considered Harmful

    Assuming you actually type with that layout, how do you use e.g. ctrl+z, ctrl+x, ctrl+c, ctrl+v? They're scattered all over the place!

    That's the main reason why I ended up crawling back to QWERTY.


  • Discourse touched me in a no-no place

    @Evo said:

    Actually, imagine that one bit is an odd parity bit. Then the parity bit should be "1" if the integer itself is 0. If this is not the case, this quote seems to allow a trap. Note that is says "no arithmetic operation on valid values can generate a trap representation", but a representation with all 0's would not be a valid value. Or am I misinterpreting the quote here?
    Nope - you read it correctly. In fact that has been noted before, and a defect report was raised against The Standard for exactly this situation.



    The link shows the 'fix' applied to The Standard.





    (I was unaware of this until just now; I'm clearly using a too-old version of The Standard.)


  • Discourse touched me in a no-no place

    @joe.edwards said:

    Assuming you actually type with that layout, how do you use e.g. ctrl+z, ctrl+x, ctrl+c, ctrl+v? They're scattered all over the place!
    Various solutions (presuming you actually don't want to use the new positions): Use an editor that doesn't actually using those bindings to start with (vi), remap to what keys are present at the bottom left (;qjk), or use the alternative mappings (Ctrl/Shift+Insert/Delete).



  • @joe.edwards said:

    Assuming you actually type with that layout, how do you use e.g. ctrl+z, ctrl+x, ctrl+c, ctrl+v? They're scattered all over the place!

    That's the main reason why I ended up crawling back to QWERTY.

    I use:

    Cut: Shift+Delete

    Copy:  Ctrl+Insert

    Paste: Shift+Insert

    Using dvorak with programs that don't supprot those commands (or don't let you remap the keys) can be a pain and, but it is so much easier and faster to type with than qwerty ever was. 

    Some programs are smart enough to switch the letters automatically, however.

    And if all else fails you can simply change back to qwerty for the shortcuts then switch back to type (alt+shift or ctrl+shift should/can swich layouts/input languages).



  • @esoterik said:

    Using dvorak with programs that don't supprot those commands (or don't let you remap the keys) can be a pain and, but it is so much easier and faster to type with than qwerty ever was. 
     

    Oh, the riony!


Log in to reply