Union struct undefined behaviour in C++?



  • I have a piece of code that is roughly like this:

    struct foo
    {
      int number;
      union
      {
        helper_class<baz_struct> baz;
        helper_class<qux_struct> qux;
      }
    }
    
    ...
    foo variable1 = bla bla bla;
    ...
    foo variable2 = get_something();
    ...
    variable1 = variable2;
    std::cout << variable1.baz().member << std::endl;
    

    Now, depending on the exact context, the value displayed is the old value of variable1. Disassembly then shows:

     std::cout << variable2.baz().member << std::endl;
    8444168:     8b 9d 78 fa ff ff       mov    -0x588(%ebp),%ebx
     variable2 = variable1;
    844416e:     b9 75 00 00 00          mov    $0x75,%ecx
    8444173:     8d bd 6c fa ff ff       lea    -0x594(%ebp),%edi
    8444179:     8d b5 40 fc ff ff       lea    -0x3c0(%ebp),%esi
    844417f:     f3 a5                   rep movsl %ds:(%esi),%es:(%edi)
    

    It appears as if the optimizer switched the two statements (although I don't really understand how only a MOV instruction can represent a console output call).

    Anyone here who has an idea what kind of undefined behaviour could cause the trouble here?


  • Banned

    @Grunnen Are you sure get_something() isn't bugged? If it is using union incorrectly, effects might propagate upwards.

    First thing I'd do is run CppCheck. It's great at finding UB and has nice GUI.



  • This post is deleted!


  • @Grunnen Have you tried compiling without optimizations?



  • I can't see any UB in the code you posted.* And it'd have to be pretty weird UB to cause the compiler to assume that the assignment doesn't have side effects that affect the following line.

    I think the disassembly may be misleading, since it's possible that the assembly instructions for the two lines are simply interleaved, and that your bug is elsewhere.

    (*Even if you're incorrectly using the union for type punning, which is UB, you shouldn't see the behavior you're describing.)

    Edit: If you look at the surrounding assembly, is it possible that the compiler is not generating any code for the assignment and using the right hand side of your assignment directly in the following function calls? That'd explain why the assignment is displayed after the following line in the disassembly.



  • Thank you all for your thoughts

    @dfdub Well, one thing I have tried, is replacing the assignment by a memcpy call. Then the code works as it should. I've disassembled the two binaries and diff'ed the result, and really the only difference is this:

       memcpy(&variable2, &variable1, sizeof(foo));
      8444167:     8d bd 6c fa ff ff       lea    -0x594(%ebp),%edi
      844416d:     b9 75 00 00 00          mov    $0x75,%ecx
      8444172:     8d b5 40 fc ff ff       lea    -0x3c0(%ebp),%esi
      8444178:     5b                      pop    %ebx
      8444179:     f3 a5                   rep movsl %ds:(%esi),%es:(%edi)
       std::cout << variable2.baz().member << std::endl;
      844417b:     8b 9d 78 fa ff ff       mov    -0x588(%ebp),%ebx
    

    So it really seems to be the reordering of the MOV instruction that messes things up.

    Now my suspicion is the helper_class, because deeply buried inside it uses custom copy and assigment constructors, which should prevent the struct foo from compiling at all. Who knows what its trickery could cause thet optimizer to think?



  • Well, if I remove the helper_class around baz_struct and thus replace variable1.baz().member = value with variable1.baz.member = value, then the optimizer doesn't reorder the instructions and everything works properly. So it does seem as if the trickery of helper_class let the optimizer lose track of whether something meaningful was assigned.


  • BINNED

    @Grunnen said in Union struct undefined behaviour in C++?:

    Now my suspicion is the helper_class, because deeply buried inside it uses custom copy and assigment constructors, which should prevent the struct foo from compiling at all

    What does the helper_class do in the first place?

    Also from cppreference:

    Unions cannot contain a non-static data member with a non-trivial special member function (copy constructor, copy-assignment operator, or destructor).
    (until C++11)

    If a union contains a non-static data member with a non-trivial special member function (copy/move constructor, copy/move assignment, or destructor), that function is deleted by default in the union and needs to be defined explicitly by the programmer.

    If a union contains a non-static data member with a non-trivial default constructor, the default constructor of the union is deleted by default unless a variant member of the union has a default member initializer .

    At most one variant member can have a default member initializer.
    (since C++11)



  • @topspin The helper class uses some tricks to remove default member initializers etc., and it is derived from a class which seems to have its own copy and assignment constructors etc. So based upon the notes you pasted, I think the code should not be able to compile at all, if I understand everything correctly. Yet it does compile somehow. I don't entirely have a happy feeling with this.



  • @Grunnen said in Union struct undefined behaviour in C++?:

    So based upon the notes you pasted, I think the code should not be able to compile at all

    Well, before C++11. After that, the text says that your code will compile, but probably not do what you expect.

    In any case, you should stop using that helper inside the union and move that logic to the struct which contains the union.


  • BINNED

    @dfdub or not use magic stuff in unions. Use a variant if you’re not low level.



  • @dfdub said in Union struct undefined behaviour in C++?:

    After that, the text says that your code will compile, but probably not do what you expect.

    I might have to take that back. I looked up the rules for implicitly-declared copy assignment operators again, and they explicitly state:

    A defaulted copy assignment operator for class T is defined as deleted if any of the following is true:

    […]
    T is a union-like class, and has a variant member whose corresponding assignment operator is non-trivial.

    If your helper class defines non-trivial copy assignment operators, the code should indeed not compile in the first place, because the struct's copy assignment operator should be deleted.

    The practical advice remains the same, but you may actually have run into a compiler bug.


  • Discourse touched me in a no-no place

    @Grunnen said in Union struct undefined behaviour in C++?:

      union
      {
        helper_class<baz_struct> baz;
        helper_class<qux_struct> qux;
      }
    

    Putting two non-trivial objects into the same memory is extremely unlikely to be happy. Why does the code do this?



  • This post is deleted!


  • @dkf Oh I can tell now why it does like this. In an anonymized way, so don't pin me on the details!

    The software runs of Linux, and until like 10 years ago it communicated through TCP/IP with a system written in VB6. The struct is communicated as raw bytes across the link and had a matching VB6 union structure on the other side. Now VB6 has its own representation of strings, so on Linux/C++ they created their own class that has a binary compatible representation of strings. However, this class does have clever constructors etc., so you can't put it in a C++ union, according to the standard. Hence the helper_class<> on each union member, that does some magic to let everything work.

    Nowadays the program from the other side has been rewritten into C++ as well, but they just used the existing header files for the communication routines, as that was the quickest/cheapest way, and as long as it worked...


  • Discourse touched me in a no-no place

    @Grunnen said in Union struct undefined behaviour in C++?:

    However, this class does have clever constructors etc., so you can't put it in a C++ union, according to the standard.

    The important thing is actually whether there is a vtable pointer. If there isn't (and isn't for any of its members, of course) it will actually work in a union. There are definitely times when I wish the C++ committee said that class is allowed to reorder elements, and struct is not (and may not have a vtable; any cleverness has to be static cleverness). Then it would be crystal clear that on-the-wire things would be done with struct and clever clever stuff would be done with class and everyone would be happier.



  • @dkf Well, the class is derived from base classes, but no methods are overridden. So, on the one hand, the vtable is almost certainly not used. On the other hand, according to someone on StackOverflow it's in theory still undefined whether the vtable is used or not.

    Second, I found out that the helper class doesn't actually have any forbidden non-trivial constructors and/or operators. However, it does have funny tricks like this:

    T& helper_class<T>::operator=(const T& other)
    {
      return *(T*)this = other;
    }
    

    (Btw., note the type of the return value)

    Could it be that such tricks trigger undefined behaviour and/or confuse the optimizer?


  • Discourse touched me in a no-no place

    @Grunnen said in Union struct undefined behaviour in C++?:

    Could it be that such tricks trigger undefined behaviour and/or confuse the optimizer?

    The optimiser doesn't really see the classes. It looks at things at a much lower level.

    But that code sure makes my brain want to go and hide.


  • Banned

    @Grunnen said in Union struct undefined behaviour in C++?:

    @dkf Well, the class is derived from base classes, but no methods are overridden. So, on the one hand, the vtable is almost certainly not used.

    Is there a virtual destructor though? It would be weird if there wasn't.

    @Grunnen said in Union struct undefined behaviour in C++?:

    Could it be that such tricks trigger undefined behaviour and/or confuse the optimizer?

    That cast does seem to break strict aliasing, which technically makes it UB but I'm not sure how much it matters. But you say it's not the only "clever" thing done, and that really worries me.


  • BINNED

    @Gąska said in Union struct undefined behaviour in C++?:

    @Grunnen said in Union struct undefined behaviour in C++?:

    @dkf Well, the class is derived from base classes, but no methods are overridden. So, on the one hand, the vtable is almost certainly not used.

    Is there a virtual destructor though? It would be weird if there wasn't.

    Doubt that from what I assume the helper class does. I assume it just adds non-virtual methods to T and no members.

    @Grunnen said in Union struct undefined behaviour in C++?:

    Could it be that such tricks trigger undefined behaviour and/or confuse the optimizer?

    That cast does seem to break strict aliasing, which technically makes it UB but I'm not sure how much it matters. But you say it's not the only "clever" thing done, and that really worries me.

    The cast seems fine assuming that helper_class<T> is derived from T. Should probably be written as this->T::operator=(other) though (no idea if you need to write that as template T without trying it).



  • @topspin said in Union struct undefined behaviour in C++?:

    assuming that helper_class<T> is derived from T

    But it's not. T could be anything, even a basic type like char, I think.


  • Discourse touched me in a no-no place

    @Gąska said in Union struct undefined behaviour in C++?:

    Is there a virtual destructor though? It would be weird if there wasn't.

    Anything virtual and you should assume you get a vtable. They are vital if you ever do inheritance from the class concerned (and intend to take advantage of the LSP). However, for a class that you're never ever supposed to inherit from, there's no real problem with having everything be concrete. Dropping the vtable allows you to control what the memory layout of the class is, and that's kinda vital for some application areas (such as this one).

    But I still don't want to write this stuff. I like binding layers to be written in stuff with greater predictability.


  • BINNED

    @Grunnen said in Union struct undefined behaviour in C++?:

    @topspin said in Union struct undefined behaviour in C++?:

    assuming that helper_class<T> is derived from T

    But it's not. T could be anything, even a basic type like char, I think.

    Then something layout compatible. Is its first (and hopefully only) member of type T?
    This should be valid if and only if helper_class<T> is layout compatible with T.


  • Banned

    @dkf said in Union struct undefined behaviour in C++?:

    However, for a class that you're never ever supposed to inherit from, there's no real problem with having everything be concrete.

    Yeah, but @Grunnen said that there is inheritance involved - and if you have inheritance without virtual destructors, that's opening yet another can of worms.

    @topspin said in Union struct undefined behaviour in C++?:

    @Grunnen said in Union struct undefined behaviour in C++?:

    @topspin said in Union struct undefined behaviour in C++?:

    assuming that helper_class<T> is derived from T

    But it's not. T could be anything, even a basic type like char, I think.

    Then something layout compatible. Is its first (and hopefully only) member of type T?
    This should be valid if and only if helper_class<T> is layout compatible with T.

    It doesn't matter if it's layout-compatible or not. Strict aliasing forbids casting pointers to unrelated types, period.


  • BINNED

    @Gąska said in Union struct undefined behaviour in C++?:

    But it's not. T could be anything, even a basic type like char, I think.

    Then something layout compatible. Is its first (and hopefully only) member of type T?
    This should be valid if and only if helper_class<T> is layout compatible with T.

    It doesn't matter if it's layout-compatible or not. Strict aliasing forbids casting pointers to unrelated types, period.

    But are they unrelated?

    26

    If a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member if that member is not a bit-field.
    Its address is also the same as the address of each of its base class subobjects.
    [Note: There might therefore be unnamed padding within a standard-layout struct object inserted by an implementation, but not at its beginning, as necessary to achieve appropriate alignment.
    — end note
    ]
    [Note: The object and its first subobject are pointer-interconvertible ([basic.compound], [expr.static.cast]).
    — end note
    ]

    4

    Two objects a and b are pointer-interconvertible if:

    (4.1)
    they are the same object, or
    (4.2)
    one is a union object and the other is a non-static data member of that object ([class.union]), or
    (4.3)
    one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, any base class subobject of that object ([class.mem]), or
    (4.4)
    there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

    If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_­cast.
    [Note: An array object and its first element are not pointer-interconvertible, even though they have the same address.
    — end note
    ]

    So they are pointer-interconvertible. Whatever the heck that means.


  • Banned

    @topspin I did some googling and the general consensus among C++ experts seems to be "fuck if I know, but it feels like pointer-interconvertible ought to mean it's allowed to alias".

    I wonder how pointer-interconvertibility plays with alignment.


  • BINNED

    @Gąska said in Union struct undefined behaviour in C++?:

    @topspin I did some googling and the general consensus among C++ experts seems to be "fuck if I know, but it feels like pointer-interconvertible ought to mean it's allowed to alias".

    That’s my feeling too. I started out pretty sure that this should work (even the C people use that for wish-it-was-OOP), but this being C++, writing that reply I toned it down quite a bit to “well, who knows.”



  • @Gąska @topspin
    I think it's allowed and not UB. You just have to read the strict aliasing rule carefully:

    Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:

    If you convert to a pointer to the first struct member, you're no longer trying to modify the same object, so the strict aliasing rule doesn't apply.

    Edit: The more I think about, the more I'm convinced my interpretation is correct. Just think of the reason the strict aliasing rule exists in the first place: The compiler wants to be able to make the assumption that certain pointers cannot refer to the same memory location and optimize accordingly. But if the pointers refer to a struct and a type that is present inside that struct, it can never make that assumption anyway, so it won't do anything stupid.


  • Java Dev

    @dfdub said in Union struct undefined behaviour in C++?:

    certain pointers cannot refer to the same memory location

    Specifically, pointers of different types cannot refer to the same memory location. So if 2 variables have different types, and both are read from/written to, then it can reorder those since those reads and writes do not refer to the same memory location.



  • @PleegWat said in Union struct undefined behaviour in C++?:

    Specifically, pointers of different types cannot refer to the same memory location.

    There are a lot of exceptions to that, but them having different types is definitely a necessary precondition.


  • Java Dev

    @dfdub said in Union struct undefined behaviour in C++?:

    @PleegWat said in Union struct undefined behaviour in C++?:

    Specifically, pointers of different types cannot refer to the same memory location.

    There are a lot of exceptions to that, but them having different types is definitely a necessary precondition.

    And I tend to only think about things from a C perspective; C++ is different.



  • @PleegWat said in Union struct undefined behaviour in C++?:

    And I tend to only think about things from a C perspective; C++ is different.

    Even a C compiler wouldn't assume that a pointer of a struct type and a pointer of the struct's first member's type refer to different memory locations. Or a char* and any other pointer. Or an unsigned int* and an int*. :pendant:



  • @topspin First of all, helper_class does seem to be a 'standard layout class', so that the first member is guaranteed to be at the same address as the object itself.

    But, the first member of helper_class is an union, of which one of the members is uint8_t mData[sizeof(T)] (why???), so then rule 4.3 about pointer-interconvertible-ness is not fulfilled.

    And then, if I understand it correctly, the problematic code more or less amounts to this:

    uint32_t variable1 = value;
    helper_class<uint32_t> helper1;
    *(uint32_t*)(&helper1) = variable1;
    helper_class<uint32_t> helper2 = helper1;
    std::cout << *(uint32_t*)(&helper2) << std::endl;
    

    Because helper_class<uint32_t> is not pointer-interconvertible with uint32_t, the compiler/optimizer is allowed to rearrange things and trouble happens.

    Right?



  • Oh and thanks all, I really learn something here!


  • Discourse touched me in a no-no place

    @Grunnen said in Union struct undefined behaviour in C++?:

    Right?

    That sounds like a Stack Overflow language-lawyer question. And fuck if I know. I know how to make these sorts of things work in C, not C++.



  • @dkf Haha, sorry, it’s my weak point. I was once in an oral exam and the professor asked “Can you repeat that answer without a question mark at the end”


  • BINNED

    @Grunnen said in Union struct undefined behaviour in C++?:

    But, the first member of helper_class is an union, of which one of the members is uint8_t mData[sizeof(T)] (why???)

    Is one of the union members at least a T? :sideways_owl:



  • @topspin From the top of my head: nope


  • Banned

    @dfdub said in Union struct undefined behaviour in C++?:

    Edit: The more I think about, the more I'm convinced my interpretation is correct. Just think of the reason the strict aliasing rule exists in the first place: The compiler wants to be able to make the assumption that certain pointers cannot refer to the same memory location and optimize accordingly. But if the pointers refer to a struct and a type that is present inside that struct, it can never make that assumption anyway, so it won't do anything stupid.

    That could be right, except...

    [Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note]

    ...how do you explain this?


  • Banned

    @dfdub said in Union struct undefined behaviour in C++?:

    @PleegWat said in Union struct undefined behaviour in C++?:

    And I tend to only think about things from a C perspective; C++ is different.

    Even a C compiler wouldn't assume that a pointer of a struct type and a pointer of the struct's first member's type refer to different memory locations.

    Yes, it wouldn't make sense otherwise.

    Or a char* and any other pointer.

    Yes, but for a different reason.

    Or an unsigned int* and an int*. :pendant:

    Here, however, I'm 70% sure you're wrong. Which is the highest I'm ever sure of anything related to C or C++.


  • BINNED

    @Grunnen said in Union struct undefined behaviour in C++?:

    @topspin From the top of my head: nope

    Then might I tentatively suggest you throw this whole mess away? 🍹



  • @Gąska said in Union struct undefined behaviour in C++?:

    That could be right, except...

    [Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note]

    ...how do you explain this?

    I think pointer interconvertibility is supposed to be symmetric. If you could get a valid pointer to an array of the same type from any pointer, that'd be surprising and potentially prevent optimizations based on alias analysis.

    @Gąska said in Union struct undefined behaviour in C++?:

    Or an unsigned int* and an int*.

    Here, however, I'm 70% sure you're wrong. Which is the highest I'm ever sure of with anything related to C or C++.


  • Considered Harmful

    Union struct undefined behavior in C++?

    Yes.


  • Banned

    @dfdub said in Union struct undefined behaviour in C++?:

    @Gąska said in Union struct undefined behaviour in C++?:

    That could be right, except...

    [Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note]

    ...how do you explain this?

    I think pointer interconvertibility is supposed to be symmetric. If you could get a valid pointer to an array of the same type from any pointer, that'd be surprising and potentially prevent optimizations based on alias analysis.

    Still, the lack of symmetry between structs and arrays is jarring - it's perfectly fine to cast a struct to its first element, but it's UB to cast an array to its first element. And it also conflicts with the fact that arrays are actually pointers 99.99% of the time.

    @Gąska said in Union struct undefined behaviour in C++?:

    Or an unsigned int* and an int*.

    Here, however, I'm 70% sure you're wrong. Which is the highest I'm ever sure of with anything related to C or C++.

    Your link helpfully points to a tweet that shows it's indeed rule violation (which does NOT mean it doesn't work in practice - see: signed overflow).



  • @Gąska said in Union struct undefined behaviour in C++?:

    Your link helpfully points to a tweet that shows it's indeed rule violation (which does NOT mean it doesn't work in practice - see: signed overflow).

    Wait, I was talking about strict aliasing rules there, not allowed casts. That's a different topic.


  • Banned

    @dfdub strict aliasing is part of allowed casts rules. AFAIK int and unsigned int are sufficiently different to not be allowed under strict aliasing.



  • @Gąska The strict aliasing rules are referenced in the casting rules for reinterpret_cast, but conceptually, they're not the same thing. For example, you cannot necessarily modify or access an object through the pointer obtained by a valid cast just because the cast is allowed - which means you can cast to a pointer that doesn't alias the original one.

    My first source quotes the C11 and C++17 standards.

    C11:

    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
    […]
    — a type that is the signed or unsigned type corresponding to the effective type of the object,

    C++17:

    If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
    […]
    (11.4) — a type that is the signed or unsigned type corresponding to the dynamic type of the object,

    Don't ask me why int* and unsigned int* alias (I'd like to know myself!), but according to the standard, they do.



  • @topspin I've tried adding a member T, but then it exposes e.g. the non-trivial constructors of T, so that you can't stick it into a union, so it defeats the whole purpose of using the helper_class in the first place.

    Oh, I've also tried recompiling the thing with -fno-strict-aliasing, which also 'solves' the problem. So it does indeed seem to be a problem of breaking the strict aliasing rules.

    So in the end, yes, I've decided to throw the whole thing away and do it properly.


  • Banned

    @dfdub said in Union struct undefined behaviour in C++?:

    Don't ask me why int* and unsigned int* alias (I'd like to know myself!), but according to the standard, they do.

    And according to compilers they don't. Go figure :mlp_shrug:


  • Banned

    @Grunnen said in Union struct undefined behaviour in C++?:

    So in the end, yes, I've decided to throw the whole thing away and do it properly.

    👍


Log in to reply