A field of mutually exclusive bits.



  • [b]Situation:[/b] Every Actor can be a passenger in at most one vehicle, and can occupy at most one seat in that vehicle.



    [b]How this was represented:[/b]



    class Actor
    {
    ...snip...
    private:
        union
        {
            unsigned int uVehicleFlags;
            struct
            {
                 unsigned int
                    bCarDriver : 1,
                    bCarFrontPassenger : 1,
                    bCarBackPassenger1 : 1,
                    bCarBackPassenger2 : 1,
                    bTruckDriver : 1,
                    bTruckFrontPassenger : 1,
                    bTruckBackPassenger1 : 1,
                    bTruckBackPassenger2 : 1,
                    bTruckBackPassenger3 : 1,
                    bTruckBackPassenger4 : 1,
                    bTruckBackPassenger5 : 1,
                    bTruckBackPassenger6 : 1,
    ...snip...
                    : 0; // some remaining bits
            }
        }
    ...snip...
    }



    That's right.



  • That's... special right there...



  • To be fair, I don't mind the bitfield-within-anonymous-struct-within-anonymous-union.



    I mean, yeah:

    1. it's not exactly standard
    2. if you serialise the uFlags value, you've gotta deal with the endianness across platforms
    3. the access modifiers on the individual bits will end up public with some compilers and private on others*
    4. it's just a bit whacky



      But:

      It is quite convenient, if you're happy to overcome the above issues.



      *Weirdly in visual studio, intellisense will treat the values as private, and flag up any "errors" where you try and access the individual bits outside the class, but Microsoft compilers will happily allow such access without any problem. That's a whole other wtf.

  • Considered Harmful

    @eViLegion said:

    1) it's not exactly standard

    2) if you serialise the uFlags value, you've gotta deal with the endianness across platforms

    3) the access modifiers on the individual bits will end up public with some compilers and private on others*

    4) it's just a bit whacky

    5) It allows invalid combinations of mutually exclusive flags.


    My C++ is extremely rusty so I might be mistaken.


  • Discourse touched me in a no-no place

    What about cars that can seat three people in the back that are still definitely not trucks? What about if someone's got a small bus that can seat 33 passengers? What if there's a child in the back who is sitting on someone else's knees?

    You need an unsigned long long (or whatever MSVC prefers to call it) to allow for future expansion…



  • @dkf said:

    What about cars that can seat three people in the back that are still definitely not trucks? What about if someone's got a small bus that can seat 33 passengers? What if there's a child in the back who is sitting on someone else's knees?
    What if none of your Actors has the CarDriver or TruckDriver bit set?  Does that mean the vehicle has no driver and is out of control?  There appears to be no easy way to check this rather significant state of affairs.



  • @da Doctah said:

    @dkf said:

    What about cars that can seat three people in the back that are still definitely not trucks? What about if someone's got a small bus that can seat 33 passengers? What if there's a child in the back who is sitting on someone else's knees?
    What if none of your Actors has the CarDriver or TruckDriver bit set?  Does that mean the vehicle has no driver and is out of control?


    Depends if it's a Prius in Nevada...


  • Discourse touched me in a no-no place

    @da Doctah said:

    There appears to be no easy way to check this rather significant state of affairs.
    Maybe there's a helper class to check that sort of thing. Or they farm that out to a web service over the ESB…



  • @dkf said:

    @da Doctah said:
    There appears to be no easy way to check this rather significant state of affairs.
    Maybe there's a helper class to check that sort of thing. Or they farm that out to a web service over the ESB…

    Eh? That's the easy bit.

    (!m_uVehicleFlags) would have meant the actor has no vehicle.



  • (&yomama->uVehicleFlags)[1] == 960



  • @eViLegion said:

    To be fair, I don't mind the bitfield-within-anonymous-struct-within-anonymous-union.

    I mean, yeah:
    1) it's not exactly standard
    2) if you serialise the uFlags value, you've gotta deal with the endianness across platforms
    3) the access modifiers on the individual bits will end up public with some compilers and private on others*
    4) it's just a bit whacky

    But:
    It is quite convenient, if you're happy to overcome the above issues.

    *Weirdly in visual studio, intellisense will treat the values as private, and flag up any "errors" where you try and access the individual bits outside the class, but Microsoft compilers will happily allow such access without any problem. That's a whole other wtf.
    Wut...

    Why would you use this instead of, say, a single enumerated field saying what position they were in?



  • @Sutherlands said:

    Wut...
    Why would you use this instead of, say, a single enumerated field saying what position they were in?

    Why indeed?



    Well, we use that bitfield-unioned-with-a-flag-int for decreasing the size of certain data structures, and it's fine, assuming it's done properly. In this case, the Actor had another such structure holding some other bools which weren't mutually exclusive, which happened to have enough spare bits remaining to include two enums: VehicleType and VehicleSeat, so I changed it all to work that way.



    It doesn't strictly need the VehicleType variable, but it's convenient to avoid having to look up the vehicle object and question its type directly.



    Of course... we still have loads of code calling functions like: bool Actor::IsCarDriver() const; bool Actor::IsTruckFrontPassenger() const;



    But they now rely on two underlying functions:

    VehicleType Actor::GetVehicleType() const;

    VehicleSeat Actor::GetVehicleSeat() const;

    ...as you would expect.
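
    For illustration, a minimal sketch of that enum-based layout; all names and bit widths here are invented, not the actual code:

    enum VehicleType { VEHICLE_NONE, VEHICLE_CAR, VEHICLE_TRUCK };
    enum VehicleSeat { SEAT_NONE, SEAT_DRIVER, SEAT_FRONT_PASSENGER, SEAT_BACK_1, SEAT_BACK_2 };

    class Actor
    {
    public:
        Actor() : m_uOtherBools(0), m_uVehicleType(VEHICLE_NONE), m_uVehicleSeat(SEAT_NONE) {}

        VehicleType GetVehicleType() const { return static_cast<VehicleType>(m_uVehicleType); }
        VehicleSeat GetVehicleSeat() const { return static_cast<VehicleSeat>(m_uVehicleSeat); }

        // The old per-seat queries become thin wrappers over the two getters.
        bool IsCarDriver() const           { return GetVehicleType() == VEHICLE_CAR   && GetVehicleSeat() == SEAT_DRIVER; }
        bool IsTruckFrontPassenger() const { return GetVehicleType() == VEHICLE_TRUCK && GetVehicleSeat() == SEAT_FRONT_PASSENGER; }

    private:
        unsigned int m_uOtherBools  : 24;  // the non-exclusive bools mentioned above
        unsigned int m_uVehicleType : 3;   // holds a VehicleType, packed into the spare bits
        unsigned int m_uVehicleSeat : 5;   // holds a VehicleSeat
    };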



  • @Sutherlands said:

    Wut...

    Why would you use this instead of, say, a single enumerated field saying what position they were in?

    Because you want to get murdered.



  •  "A field of mutually exclusive bits"

     

    Seems like the first line of a poem... or a geek joke.



  • I shall call it "Ode to the behaviour of nested data structures."



  • Your game engine only allows one truck and one car?

    That game is going to suck.



  • @eViLegion said:

    and at most one seat in that vehicle.
    Fat people are about to sue you for discrimination.


  • Discourse touched me in a no-no place

    @eViLegion said:

    Eh? That's the easy bit.
    (!m_uVehicleFlags) would have meant the actor has no vehicle.
    No, that means that the actor is neither a driver nor a passenger in any vehicle. That's why it is better to ask an expert service somewhere. (As a bonus, it allows for appropriate treatment of actors who are flagged up in the database as being incorrigible backseat drivers.)

    The side-benefit of binding every build ever done to the service deployment is pure bonus.



  • TRWTF is that I had no idea bitfield syntax even existed in C.



  • @Faxmachinen said:

    TRWTF is that I had no idea bitfield syntax even existed in C.

    It's not commonly used, because of the endianness/serialization issues that make portability a bitch.



    It's also not that necessary any more, because if you make sure your class members are declared efficiently, most compilers will take care of packing stuff together:



    E.g., the following bools will probably be packed into a single byte by most optimising compilers.



    class Foo
    {
        bool m_b1;
        bool m_b2;
        bool m_b3;
        bool m_b4;
        bool m_b5;
        bool m_b6;
        bool m_b7;
        bool m_b8;
    };



    Although, these bools probably won't be:



    class Bar
    {
        bool m_b1;
        int m_someOutOfOrderShitThatFucksUpCompilerOptimisations1;
        bool m_b2;
        bool m_b3;
        bool m_b4;
        bool m_b5;
        int m_someOutOfOrderShitThatFucksUpCompilerOptimisations2;
        bool m_b6;
        bool m_b7;
        bool m_b8;
    };


  • Discourse touched me in a no-no place

    @eViLegion said:

    @Faxmachinen said:
    TRWTF is that I had no idea bitfield syntax even existed in C.


    class Foo

    {

        bool m_b1;

        bool m_b2;

        bool m_b3;

        bool m_b4;

        bool m_b5;

        bool m_b6;

        bool m_b7;

        bool m_b8;

    };
    That isn't C.



  • @PJH said:

    That isn't C.

    I never said it was. The OP (i.e. me) has been posting C++ all along, and someone else made the mistake of calling it C, which the OP (i.e. me) politely ignored.



    Either way, the same argument that compilers will probably pack bits within structs efficiently, so that bitfields aren't really necessary, applies in C just as in C++.


  • Discourse touched me in a no-no place

    @eViLegion said:

    Either way, the same argument that compilers will probably pack bits within structs efficiently, so that bitfields aren't really necessary, applies in C just as in C++.
    It really doesn't, because the only type C has that can be less than CHAR_BIT bits long is a bitfield.



  • I see... I was under the impression C had bools. I tend to assume C has a bunch of the fully basic shit that C++ has. That assumption is usually wrong.



    Jesus, C is such a shit language. I pity the fool that has to use it.



  • I have been reliably informed that C++ compilers won't actually pack anything to less than a byte, so 8 bools is 8 bytes. It seemed to me to be a reasonable assumption that private members on which you never call the address-of operator should be safely packable, but I guess not.

    So anyway, that makes the bitfields more useful.
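
    For anyone curious, the difference is easy to check; exact results are implementation-dependent, but the following is typical:

    #include <cstdio>

    struct EightBools { bool b1, b2, b3, b4, b5, b6, b7, b8; };       // one byte per bool
    struct EightBits  { unsigned char b1:1, b2:1, b3:1, b4:1,
                                      b5:1, b6:1, b7:1, b8:1; };      // bitfield: packed together

    int main()
    {
        std::printf("bools: %zu bytes\n", sizeof(EightBools));  // typically 8
        std::printf("bits:  %zu bytes\n", sizeof(EightBits));   // typically 1
        return 0;
    }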



  • @eViLegion said:

    C is such a shit language. I pity the fool that has to use it.
    I've been using it for about 30 years. Sometimes it's the best tool for the job. I wouldn't write applications in it, but for low-level hardware manipulation, C or C++ are about the only choices, and sometimes C wins.

    @eViLegion said:

    It's not commonly used, because of the endianness/serialization issues that make portability a bitch.
    Not just portability, functionality. The language makes no guarantees about how the bits get packed. I spend my life twiddling bits in hardware. The bit order absolutely must match the hardware, or the hardware won't do what (you think) you're telling it to do, so C bit fields (or relying on the C++ compiler to pack bools) are useless for this.

    Why C instead of C++? Probably a lot of the time it doesn't really matter, but there are a few reasons it might, mostly related to embedded processors. If the compiler for the processor you're using (you may not have a choice) supports C++, there probably is no good reason not to use it, although there may be very good reasons to avoid some C++ language features.



  • @PJH said:

    @eViLegion said:
    @Faxmachinen said:
    TRWTF is that I had no idea bitfield syntax even existed in C.


    class Foo

    {

        bool m_b1;

        bool m_b2;

        bool m_b3;

        bool m_b4;

        bool m_b5;

        bool m_b6;

        bool m_b7;

        bool m_b8;

    };
    That isn't C.

    The correct way to ensure bit-sized fields is not bools but

    
    struct Foo {
        unsigned int m_b1:1;
        unsigned int m_b2:1;
        unsigned int m_b3:1;
    };
    

    This still doesn't guarantee they'll be packed into a byte though; for that, compilers have pragmas/keywords. Edit: apparently it does.


  • Considered Harmful

    It wasn't until I read The Design and Evolution of C++ that I really grew to spite the language.

    The author describes what a nightmare standard-by-committee really is, the stand-offs and compromises that prevented new keywords from being introduced and led to silly things like the syntax for pure virtual methods being = 0.

    It's a good read though, and describes why such stupid seeming decisions were reached, and the concerns that drove them.


  • Discourse touched me in a no-no place

    @HardwareGeek said:

    Not just portability, functionality. The language makes no guarantees about how the bits get packed. I spend my life twiddling bits in hardware. The bit order absolutely must match the hardware, or the hardware won't do what (you think) you're telling it to do, so C bit fields (or relying on the C++ compiler to pack bools) are useless for this.

    Why C instead of C++? Probably a lot of the time it doesn't really matter, but there are a few reasons it might, mostly related to embedded processors. If the compiler for the processor you're using (you may not have a choice) supports C++, there probably is no good reason not to use it, although there may be very good reasons to avoid some C++ language features.

    Within a byte, there's no particular worry about which end is the LSB and which the MSB (get it wrong and nothing works, so that tends to get fixed quickly) so the (classic C) approach with doing the bit field manipulation through explicit ‘and’s and ‘or’s is pretty reliable. With packed bit fields, there's always the worry that some mad compiler developer will decide to put it at the other end of the byte. Just because.

    OTOH, alignment rules usually mean that the overall structure/class is aligned on an 8 byte boundary (maybe even more on a 64-bit system), and that individual fields within that structure are typically 4-byte aligned on a 32-bit system (with exceptions for a few things, like successive bitfields, chars or shorts). None of which is a gripe I have about either C or C++.
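
    A minimal sketch of that explicit 'and'/'or' style, with an invented register layout and invented names:

    #include <cstdint>
    #include <cassert>

    const std::uint32_t CTRL_ENABLE    = 1u << 0;        // bit 0: enable
    const std::uint32_t CTRL_MODE_MASK = 0x7u << 4;      // bits 4..6: mode

    std::uint32_t SetMode(std::uint32_t reg, std::uint32_t mode)
    {
        reg &= ~CTRL_MODE_MASK;                    // clear the old mode bits
        reg |= (mode << 4) & CTRL_MODE_MASK;       // drop in the new mode
        return reg | CTRL_ENABLE;                  // and keep the block enabled
    }

    int main()
    {
        assert(SetMode(0, 5) == ((5u << 4) | 1u)); // 0x51
        return 0;
    }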



  • @eViLegion said:

    It's not commonly used, because of the endianness/serialization issues that make portability a bitch.

    It takes a special kind of idiot to figure that serialization should produce different results on different architectures.



  • @Faxmachinen said:

    @eViLegion said:

    It's not commonly used, because of the endianness/serialization issues that make portability a bitch.

    It takes a special kind of idiot to figure that serialization should produce different results on different architectures.

    Er. What are you saying, exactly?



    No-one has suggested that serialized data should end up different on different architectures.



  • @eViLegion said:

    No-one has suggested that serialized data should end up different on different architectures.

    Then I'm not sure what you mean by "endianness/serialization issues".

     



  • @Faxmachinen said:

    @eViLegion said:

    No-one has suggested that serialized data should end up different on different architectures.

    Then I'm not sure what you mean by "endianness/serialization issues".

     

    The issue is that depending on endianness, the bits are in a different order:



    union {
        u_int uFlags;
        struct {
            u_int bBit1 : 1;
            u_int bBit2 : 1;
    ... snip ...
            u_int bBit31 : 1;
            u_int bBit32 : 1;
        };
    };



    If by "should" you mean "human intent" then, yeah, I agree with you. By "should" I'm talking about "what the technology will do".



    Let's say you have code which sets bits 1 to 16 (inclusive) to true, and bits 17 to 32 to false.

    On systems of either endianness, bits 1 to 16 will still be true, and 17 to 32 false.

    However, uFlags will hold a completely different value on each.




    On two systems with the same architecture, you can save yourself some hassle by simply serializing the value of uFlags, transmitting it from one system to the other, and deserializing at the other end. Your flags will remain correct.



    On systems with different endianness, the transmitted value of uFlags will be completely misunderstood, so you have to take care to handle the order in which the individual bits are transmitted yourself, and reconstitute them at the other end.
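
    A sketch of what "handling the order yourself" can look like: define the wire format in terms of bit positions and build the value with shifts (which operate on values, not memory layout), so both ends agree regardless of endianness. The flag count and names are illustrative:

    #include <cstdint>
    #include <cassert>

    std::uint32_t PackFlags(const bool flags[32])
    {
        std::uint32_t out = 0;
        for (int i = 0; i < 32; ++i)
            if (flags[i])
                out |= std::uint32_t(1) << i;      // flag i always lands in bit i of the value
        return out;
    }

    void UnpackFlags(std::uint32_t in, bool flags[32])
    {
        for (int i = 0; i < 32; ++i)
            flags[i] = ((in >> i) & 1u) != 0;
    }

    int main()
    {
        bool before[32] = {}, after[32] = {};
        for (int i = 0; i < 16; ++i) before[i] = true; // "bits 1 to 16" true, the rest false
        UnpackFlags(PackFlags(before), after);
        for (int i = 0; i < 32; ++i) assert(before[i] == after[i]);
        return 0;
    }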





    The fact that bitfield behaviour is not nicely defined in terms of significant-bit behaviour is a total wtf. Obviously they could have defined that the first bit in a bitfield is the most significant bit in terms of endianness (and that the last bit is the least significant bit), or the other way round, it doesn't matter hugely, and then this wouldn't have been a portability problem.



    But, I suspect, the bitfields are really designed more for tight control of hardware... e.g. where setting a physical bit on a chip is what you're trying to do, because that bit controls a specific light, or opens a specific door, or something, and where the concept of endianness is wishy-washy high-level bullshit... and so it is left down to the developer to understand the differences in platform hardware, and to cope with the portability issues.


  • Discourse touched me in a no-no place

    @eViLegion said:

    union {

        u_int uFlags;

        struct {

            u_int bBit1 : 1;

            u_int bBit2 : 1;

    ... snip ...

            u_int bBit31 : 1;

            u_int bBit32 : 1;

        }

    }
    Of course writing to one member of that particular union, and reading from another member is undefined behaviour - both the C and C++ Standards tell you that.
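
    For the record, a well-defined way (in both C and C++) to look at the same bytes is to copy them rather than read a union member you didn't write; the value you get back is still implementation-defined, since the standards don't pin down bitfield layout. A minimal sketch:

    #include <cstring>
    #include <cstdint>
    #include <cstdio>

    struct Bits
    {
        std::uint32_t bBit1 : 1;
        std::uint32_t bBit2 : 1;
        // ...
    };

    int main()
    {
        Bits b = {};            // zero everything
        b.bBit1 = 1;

        std::uint32_t uFlags = 0;
        std::size_t n = sizeof b < sizeof uFlags ? sizeof b : sizeof uFlags;
        std::memcpy(&uFlags, &b, n);                 // defined behaviour, unlike the cross-member union read

        std::printf("%08x\n", (unsigned)uFlags);     // which bit this is depends on the implementation
        return 0;
    }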



  • Yeah, but it's also bloody useful, so long as you're careful to deal with the potential pitfalls.



  • @PJH said:

    @eViLegion said:
    union {

        u_int uFlags;

        struct {

            u_int bBit1 : 1;

            u_int bBit2 : 1;

    ... snip ...

            u_int bBit31 : 1;

            u_int bBit32 : 1;

        }

    }
    Of course writing to one member of that particular union, and reading from another member is undefined behaviour - both the C and C++ Standards tell you that.

    What the fuck is the point of a specification that doesn't define ANYTHING*? If modifying a place in memory and reading it back is undefined, what is defined?

    *word chosen to annoy pedantic dickweeds.


  • Considered Harmful

    @Ben L. said:

    If modifying a place in memory and reading it back is undefined, what is defined?

    That's undefined.


  • Discourse touched me in a no-no place

    @Ben L. said:

    @PJH said:
    @eViLegion said:
    union {

        u_int uFlags;

        struct {

            u_int bBit1 : 1;

            u_int bBit2 : 1;

    ... snip ...

            u_int bBit31 : 1;

            u_int bBit32 : 1;

        }

    }
    Of course writing to one member of that particular union, and reading from another member is undefined behaviour - both the C and C++ Standards tell you that.

    What the fuck is the point of a specification that doesn't define ANYTHING*?

    Writing to uFlags and reading back from uFlags is well defined. As is writing to bBit1 and reading back from bBit1. What isn't well defined is writing to bBit1 and expecting uFlags to be a certain value. How is this difficult to understand?
    If modifying a place in memory and reading it back is undefined, what is defined?
    Your assumption appears to be predicated on uFlags and bBit1..32 overlapping completely. They need not. For starters, uFlags need not be 32 bits long; presuming u_int to be a typedef of unsigned int, it could be as small as 16 bits.



  • There's absolutely no reason to use unions*, and anyone who does so should be shot. In the bitfield case, I can't imagine any good reason to pass around an int rather than the bitfield struct itself.

    *Unless you work with embedded devices, obviously.



  • @Faxmachinen said:

    There's absolutely no reason to use unions*, and anyone who does so should be shot. In the bitfield case, I can't imagine any good reason to pass around an int rather than the bitfield struct itself.

    *Unless you work with embedded devices, obviously.

    Your imagination needs an upgrade.



    If you have tight restrictions on memory, unions are incredibly useful for container classes that may contain values of different types. This can easily happen in situations outside of embedded systems.



    Additionally, if you are regularly copy-constructing an object which contains a lot of booleans, it is considerably more efficient to put those booleans into a bitfield unioned with some integer, and simply copy the integer rather than the individual bits.



    In exactly the same vein, if you are populating some message object, which will transmit information about the state of some other object with lots of bools (as above), you can populate that message much faster by copying the unioned integer, rather than the individual booleans. This, however, generally requires the platform (or at the very least endianness) to be the same.
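
    As a sketch of the first case above (a memory-tight container holding values of different types), something like this hand-rolled tagged union, with invented names; note that it only ever reads the union member it last wrote:

    #include <cstdint>
    #include <cassert>

    class Value
    {
    public:
        static Value FromInt(std::int32_t i)  { Value v; v.m_type = TYPE_INT;   v.m_i = i; return v; }
        static Value FromFloat(float f)       { Value v; v.m_type = TYPE_FLOAT; v.m_f = f; return v; }

        bool IsInt() const   { return m_type == TYPE_INT; }
        bool IsFloat() const { return m_type == TYPE_FLOAT; }

        std::int32_t AsInt() const { assert(IsInt());   return m_i; }
        float AsFloat() const      { assert(IsFloat()); return m_f; }

    private:
        enum Type { TYPE_INT, TYPE_FLOAT };

        Value() : m_type(TYPE_INT), m_i(0) {}

        Type m_type;
        union                       // the int and the float share the same four bytes
        {
            std::int32_t m_i;
            float        m_f;
        };
    };

    int main()
    {
        Value v = Value::FromFloat(1.5f);
        assert(v.IsFloat() && v.AsFloat() == 1.5f);
        return 0;
    }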





    I've provided some solid cases for why there ARE absolutely good reasons to use unions. Would you like to explain why you think there are none, rather than simply stating it without any form of justification whatsoever (with the exception of "my mind hasn't managed to think of any yet")?


  • Considered Harmful

    @eViLegion said:

    I've provided some solid cases for why there ARE absolutely good reason's to use unions. Would you like to explain why you think there are none, rather than simply stating it without any form of justification whatsoever (with the exception of "my mind hasn't managed to think of any yet")?

    I think he meant it in a Unions Considered Harmful kind of way, rather than in a Unions Are Never Useful way. There are perhaps good and valid reasons to use things like goto and eval, but in general they cause more problems than they solve (you yourself highlighted the endianness problem which seems big enough to eliminate unions as a good choice for serialization) and there are other ways to accomplish the same thing.

    For example, if you need a bitfield, there are bitwise operators available for those, and if you must expose the bitfield as booleans, you can use getter/setter methods for that.
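
    A quick sketch of that shape (the flag constant and names are invented): one integer of flags, bitwise operators underneath, named booleans on top.

    #include <cstdint>
    #include <cassert>

    class Actor
    {
    public:
        Actor() : m_uVehicleFlags(0) {}

        bool IsCarDriver() const { return (m_uVehicleFlags & FLAG_CAR_DRIVER) != 0; }
        void SetCarDriver(bool on)
        {
            if (on) m_uVehicleFlags |=  FLAG_CAR_DRIVER;
            else    m_uVehicleFlags &= ~FLAG_CAR_DRIVER;
        }

    private:
        static const std::uint32_t FLAG_CAR_DRIVER = 1u << 0;
        std::uint32_t m_uVehicleFlags;
    };

    int main()
    {
        Actor a;
        a.SetCarDriver(true);
        assert(a.IsCarDriver());
        return 0;
    }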



  • @joe.edwards said:

    I think he meant it in a Unions Considered Harmful kind of way, rather than in a Unions Are Never Useful way. There are perhaps good and valid reasons to use things like goto and eval, but in general they cause more problems than they solve (you yourself highlighted the endianness problem which seems big enough to eliminate unions as a good choice for serialization) and there are other ways to accomplish the same thing.

    For example, if you need a bitfield, there are bitwise operators available for those, and if you must expose the bitfield as booleans, you can use getter/setter methods for that.

    Fair enough.



    But the way he phrased what he said is the equivalent of saying "Heavy machinery has no purpose, and anyone who ever uses a big digger should be strung up", when what he actually means is "Heavy machinery is large, difficult to operate, and potentially extremely dangerous; it should not be attempted by unlicensed individuals".



    The first thing is obviously bollocks. And the second is simply saying "Don't fuck around with stuff if you don't know what you're doing" with the unmentioned addendum of "But if you do know what you're doing, then fine, you can trust yourself not to fuck this up"... which is generally good advice for everything in life.



  • @joe.edwards said:

    There are perhaps good and valid reasons to use things like goto and eval, but in general they cause more problems than they solve
    You know, I have yet to see an honest example of a goto being harmful. Although that's probably because most people avoid them like the plague because of their bad reputation. That and, well, the other control flow keywords steal much of their spotlight.


  • Considered Harmful

    @Zecc said:

    @joe.edwards said:

    There are perhaps good and valid reasons to use things like goto and eval, but in general they cause more problems than they solve
    You know, I have yet to see an honest example of a goto being harmful. Although that's probably because most people avoid them like the plague because of their bad reputation. That and, well, the other control flow keywords steal much of their spotlight.

    The only time in my career I can remember needing a goto was for the "Retry" option from a Cancel/Retry/Ignore type 3-way branch, and I'm not convinced I really needed it then (but it was less convoluted than the alternatives).



  • @joe.edwards said:

    @Zecc said:

    @joe.edwards said:

    There are perhaps good and valid reasons to use things like goto and eval, but in general they cause more problems than they solve
    You know, I have yet to see an honest example of a goto being harmful. Although that's probably because most people avoid them like the plague because of their bad reputation. That and, well, the other control flow keywords steal much of their spotlight.

    The only time in my career I can remember needing a goto was for the "Retry" option from a Cancel/Retry/Ignore type 3-way branch, and I'm not convinced I really needed it then (but it was less convoluted than the alternatives).

    Goto can sometimes be the simplest and clearest way of breaking out of multiple nested loops, in languages which don't have Java-style named loops.



    Your other main options are:



    Having some bloody 'result' or 'continue' boolean that you set, and have to keep checking throughout such loops.

    Shoving the loops into some function, and returning from it.



    Both of which make the code more complicated, and less readable.



    That having been said, I will avoid using such gotos, because I can't be arsed to have to explain to someone else why their "don't use gotos ever" religion is incorrect.
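
    For reference, a minimal sketch of that nested-loop escape (the search condition is made up): the goto clears both loops in one jump, with no sentinel boolean and no helper function.

    #include <cstdio>

    int main()
    {
        int foundI = -1, foundJ = -1;

        for (int i = 0; i < 10; ++i)
        {
            for (int j = 0; j < 10; ++j)
            {
                if (i * j == 42)      // whatever we're actually searching for
                {
                    foundI = i;
                    foundJ = j;
                    goto done;        // break out of both loops at once
                }
            }
        }
    done:
        std::printf("found %d, %d\n", foundI, foundJ);
        return 0;
    }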



  • What eViLegion said about breaking out of loops.

    Also, in some rare cases it's more readable when you want to cut some processing short but would rather avoid the alternatives: writing a bunch of nested ifs, moving the remaining code to single-use functions, or (gah!) using a try-finally block with a throw in the middle.

    I've been guilty of using the following pattern before, but I've grown out of it and will now use a goto:

     

    do {
        // do stuff

        if (allSetHere) continue;

        // optional stuff

        if (justGoAheadAlready) continue;

        // more optional stuff

    } while(false); // psyche!

    // do unavoidable stuff

     

    But to be clear: I'm not saying gotos are Da Shit and you should totally use them more. I'm just saying they've got a worse reputation than they deserve. I think we can agree to that.

    Having said that, gotos would be a heck of a lot better if they were limited to function scope. Edit: never mind. They are. I was thinking of setjmp.
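
    Presumably the goto version of the pattern above would read roughly like this (keeping the same condition names, which are placeholders):

    void Process(bool allSetHere, bool justGoAheadAlready)
    {
        // do stuff

        if (allSetHere) goto unavoidable;

        // optional stuff

        if (justGoAheadAlready) goto unavoidable;

        // more optional stuff

    unavoidable:
        ; // do unavoidable stuff
    }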



  • To be honest, while the do-while-false-continue pattern (also known as the while-true-break pattern) is worse than goto, they're both indicative that you should probably be using exceptions. Although, then you trade goto-is-bad nazis with exceptions-for-flow-control nazis.

    If your language doesn't have exceptions (I'm looking at you, Go), then the only way is to have a "do X -> did X fail? -> early return with error code" block for each X. At least C lets you use preprocessor macros for this.
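
    The macro trick being alluded to is presumably something along these lines (the function names are invented); each step is checked and any failure propagates as an early return:

    #include <cstdio>

    #define TRY(expr)                          \
        do {                                   \
            int tryResult_ = (expr);           \
            if (tryResult_ != 0)               \
                return tryResult_;             \
        } while (0)

    int OpenDevice()  { return 0; }   // pretend each of these can fail
    int ResetDevice() { return 0; }
    int StartDevice() { return 0; }

    int BringUpDevice()
    {
        TRY(OpenDevice());
        TRY(ResetDevice());
        TRY(StartDevice());
        return 0;
    }

    int main()
    {
        std::printf("result: %d\n", BringUpDevice());
        return 0;
    }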


  • Considered Harmful

    I propose an IDE where language features have to be unlocked by proving you know where to use them and where to not use them.

    Gotos for control flow are messy. Exceptions for flow control are just plain wrong.



  • @joe.edwards said:

    I propose an IDE where language features have to be unlocked by proving you know where to use them and where to not use them.

    Gotos for control flow are messy. Exceptions for flow control are just plain wrong.

    MMO-IDE-RPG?



  • @Arnavion said:

    If your language doesn't have exceptions (I'm looking at you, Go), then the only way is to have a "do X -> did X fail? -> early return with error code" block for each X.

    Not exactly.

