1 << undefined_behaviour



  • Found in some legacy cruft in our system when investigating why a particular value was coming out as negative (this code wasn't the problem, the problem was this function getting called on the value when it should have been a different function). I have it on reasonable authority that this code is running on a system (not a flight-critical one) on planes. Retyped from memory, probably some mistakes inserted:

    double func(double scale, long value, int bits) {
        long shift = 1 + std::numeric_limits<long>::digits - bits;
        value = (value << shift) >> shift;
        
        return value * scale;
    }
    

    Explanation for those of you who are Blakey or Haskell programmers: This function takes a signed integer, extracts a signed twos-complement integer 'bits' long, including sign bit, from the right-hand-side of the integer, multiplies it by a scaling factor, and then returns it. So far so good, this is a reasonable thing to need to do in this context.

    Problem: Left-shifting a signed integer in C++ is undefined behaviour if you shift a '1' off the left edge, and this code specifically does that in order to make the return value negative (because all signed integers are twos-complement, right?). Right-shifting signed integers is merely implementation-defined, and fortunately our implementation sign-extends, as is sensible.

    I think, but am not certain, that our compiler would have been within its rights to compile value = (value << shift) >> shift to a no-op. It absolutely would have been within its rights to make everything explode the instant something was shifted into the top bit. We are fortunate that it has decided to be nice to us for the last decadeish and appeared to work. Sure looking forward to the rest of the surprises in all this legacy crap...



  • What compiler and version of said compiler is this code running under, if that's not private info?



  • Visual Studio 2013, Visual Studio 2010. Don't know what service packs.

    EDIT: Presumably the plane version was compiled with something else; I don't know.



  • @jmp said:

    Presumably the plane version was compiled with something else

    Don't assume that. When I have worked in places that dealt with aircraft systems, all code had to go via the certified tool chain. Once a toolchain was sorted, it didn't change without major effort. We even had a VAX running so we could build some code for custom hardware that was originally coded using a compiler that ran on VAX



  • Our use of the code is somewhat separate from the original use - different company, for starters. I'm confident that the real-time hardware that was originally running the comms stuff wasn't running Windows. 😛.


  • Banned

    @jmp said:

    I think, but am not certain, that our compiler would have been within its rights to compile value = (value << shift) >> shift to a no-op.

    If it did, half of Linux ecosystem would collapse. More likely, it would replace it with bitwise and/or - which might or might not do what you want, and disabling optimizations might or might not change the behavior of the function.


  • area_pol

    According to the as-if rule, the compiler is free to optimize it away, as long as it doesn't have any side effects (and it doesn't in this case). Unfortunately, a lot of people (especially C programmers) view such code as "clever" rather as "shit which is bound to explode at some point".



  • Well, undefined behavior doesn't require the semantics of the as-if rule. That's just the compiler being nice.


  • Discourse touched me in a no-no place

    @jmp said:

    I think, but am not certain, that our compiler would have been within its rights to compile value = (value << shift) >> shift to a no-op.

    That's not necessarily a safe assumption. Compiler authors have become a lot better at getting the assumption checks in these cases correct. In practice, the language compiler part would probably have to tell the optimizer that it can assume that there are no overflows in the shifts; if it did that, your assertion would be correct. Which would be a compiler optimization setting or pragma or something like that, i.e., under the control of you guys.

    If it just makes that assumption without giving you a chance to shut it off, you've got a crappy compiler.


  • area_pol

    No, it doesn't, which makes this example double-terrible. We have a potential no-op, and if not, an undefined behavior. It takes balls to write such abhorrent code, and assume it works.



  • Nope. Wrong. The compiler doesn't need to tell the optimizer that the shift doesn't overflow. The language standard says exactly exactly that (for signed) variables. If your compiler compiler cannot be configured to give massive warning any time you shift a signed variable, then you have a crappy compiler.


  • Banned

    @NeighborhoodButcher said:

    According to the as-if rule, the compiler is free to optimize it away, as long as it doesn't have any side effects (and it doesn't in this case).

    Wrong. Unless it can be proven that no 1-bits will leave the number on the left side, it does have side-effects, and the compiler has no right to ignore it.

    At least that's how it works for unsigned. For signed, we have UB, so yeah, I think at least one compiler would gladly optimize it all out, just like it optimizes null checks it doesn't like.



  • Bonus, from the MSDN page on bitshifts:

    If you left-shift a signed number so that the sign bit is affected, the
    result is undefined. The following example shows what happens in Visual
    C++ when a 1 bit is left-shifted into the sign bit position.

    [Code example with comments containing expected values]

    Don't tell them what it does goddamn it now they'll think it's okay

    @gwowen said:

    Nope. Wrong. The compiler doesn't need to tell the optimizer that the shift doesn't overflow. The language standard says exactly exactly that (for signed) variables. If your compiler compiler cannot be configured to give massive warning any time you shift a signed variable, then you have a crappy compiler.

    Compiled with /W4 in VS2013, no warning. I think. Might have been turned off.


  • Banned

    @jmp said:

    Don't tell them what it does goddamn it now they'll think it's okay

    Apparently, it is okay in MSVC.



  • Please tell me you're this guy.

    (FWIW my comment that the compiler is within its rights to optimise it out was in the context of signed shifts; I'm aware that unsigned shifts would have to be replaced by a mask operation).


  • area_pol

    @Gaska said:

    Wrong. Unless it can be proven that no 1-bits will leave the number on the left side, it does have side-effects, and the compiler has no right to ignore it.

    No bits will ever leave the left side according to 5.8.2 of the Standard, so the no-op is possible.


  • Winner of the 2016 Presidential Election

    @jmp said:

    Please tell me you're this guy.

    Nope, that's a a well-known German blogger. That bug report is probably one of the reasons why he wrote an article about correct overflow checking in C.

    Bonus WTF: The blog software he uses is written in C, BTW.



  • @asdf said:

    Nope, that's a a well-known German blogger.

    Why did you write that with a stereotypical Italian accent then?


  • Winner of the 2016 Presidential Election

    @LB_ said:

    Why did you write that with a stereotypical Italian accent then?

    You know, it's not impossible to be Italian and (kinda) German at the same time.

    Damn, you were faster than my inner grammar nazi.


  • Banned

    @jmp said:

    Please tell me you're this guy.

    No I'm not. That guy is idiot.

    @jmp said:

    (FWIW my comment that the compiler is within its rights to optimise it out was in the context of signed shifts; I'm aware that unsigned shifts would have to be replaced by a mask operation).

    You were right in that it can optimize it away, but were wrong in the reason for it.

    @NeighborhoodButcher said:

    No bits will ever leave the left side according to 5.8.2 of the Standard, so the no-op is possible.

    That's what I said in the paragraph that starts immediately after the end of the one you quoted. In different words, but still.


  • Winner of the 2016 Presidential Election

    @Gaska said:

    That guy is idiot.

    Are we all doing stereotypical accents now? :trollface:


  • Banned

    I don't care about accents in text communication.


  • Winner of the 2016 Presidential Election

    I find it hard to read that sentence out loud without faking an Eastern European accent.



  • This code should have been Linted or some similar static checker.... at-least to something equivalent to MISRA C++ rules. Safety critical or not, it saves time if not lives.
    I believe avionics code is around 500 to $800 per line?



  • @Helix said:

    I believe avionics code is around 500 to $800 per line?

    Between 500 (no unit) and 800 dollars? Isn't that what it costs to have someone who has used Discourse install it on their own computer?


  • Banned

    @LB_ said:

    Between 500 (no unit) and 800 dollars?

    I think he wrote it originally as "500 to 800$", which would be kinda, sorta OK, but his linter caught it corrected to something that doesn't make sense.


  • Discourse touched me in a no-no place

    @gwowen said:

    The compiler doesn't need to tell the optimizer that the shift doesn't overflow. The language standard says exactly exactly that (for signed) variables.

    Aaaaand there you go, assuming that the only thing you ever need to do when writing a compiler is comply with the platform-independent language standard. Which is kind of cute in a still-a-programming-virgin way. You get a 🍭.

    In reality, there's a lot of abuse of features out there, and just “breaking” the code that uses these sorts of tricks gets you lots of complaints. And very angry users. You can bang on the spec all you want, but making lots of legacy code fail causes real grief. When you dive beneath the level you're used to dealing with, stuff gets rather more complicated in many “interesting” ways…



  • @jmp said:

    Don't tell them what it does goddamn it now they'll think it's okay

    Reminds me of my new favourite quote about the undefined-behaviour-works-on-my-machine mentality:

    Somebody once told me that in basketball you can’t hold the ball and run. I got a basketball and tried it and it worked just fine. He obviously didn’t understand basketball.



  • @Gaska said:

    I don't care about accents in text communication.

    Óh, bůt I dó...


  • Banned

    What's funny is that Polish letter "ó" sounds exactly like the sound at the end of English word "do".



  • All the situations where GCC optimizes out null checks are in a similar "Oops, undefined behavior, don't have to care" category. If you're dereferencing something and then testing it for null, you are already dead.

    You were right in that it can optimize it away, but were wrong in the reason for it.

    I'm not NeighbourhoodButcher. And I don't think his explanation is wrong, just not stating the obvious point that the case where things get shifted off the top doesn't exist because it's undefined.

    @Helix said:

    This code should have been Linted or some similar static checker.... at-least to something equivalent to MISRA C++ rules. Safety critical or not, it saves time if not lives. I believe avionics code is around 500 to $800 per line?

    Yes, probably. For all I know it might have been, and it didn't catch that. Code was written in 2002 - how mature was the safety-critical stuff back then?



  • @jmp said:

    Please tell me you're this guy

    I suppose there's no point in asking why he didn't just declare the variable to be volatile in his expression. That should eliminate any optimization on it. It would still be UB, but would probably be implemented by simply executing the corresponding assembly instructions, resulting in whatever the processor does in that situation, which should be well defined for each processor architecture that may be used. If the code is only supposed to run on one architecture (and if there is something in place to detect and make unit tests fail if it is compiled on an architecture with different behavior), it might be acceptable.



  • That is not what volatile does. It'd still be undefined behavior.

    The question is why he didn't just build with -ftrapv (or whatever the flag is).



  • @jmp said:

    That is not what volatile does. It'd still be undefined behavior.

    I said it would still be UB. But with all optimizations disabled for the expression (what volatile actually does), it would probably do what he wanted.



  • Declaring it volatile would just mean it has to do the load and the store. The shifts are effectively a no-op, so it could omit those.

    Also, once you invoke undefined behavior, the compiler is allowed to do literally anything. It can send demons flying out of your nasal cavities if it wants to. "It would probably do what you want" is nothing, because with undefined behavior, it can also delete your OS and then print "sausage slaps" 1000 times.



  • @jmp said:

    Code was written in 2002 - how mature was the safety-critical stuff back then?

    It was already mature in the late '80s when I was writing avionics code. Hell, even the "offline" stuff to deal with maintenance schedules was safety-critical.


  • ♿ (Parody)

    @ben_lubar said:

    It can send demons flying out of your nasal cavities if it wants to.

    I'll bet it can't.


  • Winner of the 2016 Presidential Election

    @boomzilla said:

    @ben_lubar said:
    It can send demons flying out of your nasal cavities if it wants to.

    I'll bet it can't.

    Heathen! Just wait for Judgement Day!


  • Banned

    @jmp said:

    All the situations where GCC optimizes out null checks are in a similar "Oops, undefined behavior, don't have to care" category. If you're dereferencing something and then testing it for null, you are already dead.

    Yes. You're also dead if you first test for null and then dereference it anyway. Good luck debugging this case - either with debugger or prints!



  • @tufty said:

    002 - how mature was

    First Google :
    "JSF C++" in short, was kind of revolutionary, as it signalled a move away from Ada as the mandated programming language for avionics software by the US Department of Defence.

    Since JSF C++ was published around 2005, I call BS on doing safety critical c++ in the 80's. I hate to think how crap the c++ compilers and standard was back then. Its 2015 and there are still 'grey areas'.



  • @Helix said:

    I call BS on doing safety critical c++

    1. Oddly enough, the US DoD are not the only developers of safety critical code, although they are a major buyer of safety critical code. They were neither my employer nor the client for any of my avionics code.
    2. I'd missed the OP was C++. Needless to say, the stuff we were doing was not in C++.
    3. We didn't start experimenting with until '89-'90 or so, still using CFront based compilers.
    4. Can't remember the date of the first C++ project we deployed.
    5. I don't know, but I suspect JSF C++ merely codified industry best practices anyway. Possibly not best practices that were ever done all in one place before, but still.
    6. Nothing precludes you from doing safety critical code in any language you like, regardless of compiler maturity or language standardisation.
    7. I would wager that there are more grey areas in the C++ spec now than in the early '90s.

    So call all the BS you like.


  • area_pol

    @tufty said:

    I would wager that there are more grey areas in the C++ spec now than in the early '90s.

    no


  • Discourse touched me in a no-no place

    @tufty said:

    I would wager that there are more grey areas in the C++ spec now than in the early '90s.

    Well, if you regard the spec from the early '90s as one huge grey area whereas now there are many much smaller grey areas, maybe you're right…



  • ok then, I will:

    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    BS();
    while(1)
    {
    BS();
    }


  • Banned

    Since calling BS has no side effects, your whole post has been optimized away.



  • @LB_ said:

    Between 500 (no unit) and 800 dollars?

    Yes. "Between five hundred and eight hundred dollars" is what you'd normally say. The best way to write that is probably "$500-800", but I can also see the logic in writing "500 to $800".

    @Gaska said:

    I think he wrote it originally as "500 to 800$", which would be kinda, sorta OK, but his linter caught it corrected to something that doesn't make sense.

    The $ sign usually goes in front. Putting it after is just weird.



  • GCC strikes again.


  • Banned

    @anotherusername said:

    The $ sign usually goes in front.

    The $, yes. A currency symbol in general, it's mixed. Some are written in front, some are written at the end, and I'm sure some currencies have mixed convention, or are written differently depending on region.



  • @anotherusername said:

    @LB_ said:
    Between 500 (no unit) and 800 dollars?

    Yes. "Between five hundred and eight hundred dollars" is what you'd normally say. The best way to write that is probably "$500-800", but I can also see the logic in writing "500 to $800".

    @Gaska said:

    I think he wrote it originally as "500 to 800$", which would be kinda, sorta OK, but his linter caught it corrected to something that doesn't make sense.

    The $ sign usually goes in front. Putting it after is just weird.

    I actually originally wrote "500 to 800USD". However I changed it to something that Blakey could understand. Granted the dollar sign is ambiguous as I could be talking about Tongan pa'anga, or even a string variable or hexadecimal memory location. Posting here is a lot like learning to code defensively via code reviews and then end up with something that no one understands.

    Filed under ISO Currency code 4217



  • @Helix said:

    I actually originally wrote "500 to 800USD".

    This is better than what you actually wrote. Blakey would have had no problem understanding it. He might have complained anyway, but that's beside the point.


Log in to reply