Endian Attack!
-
In some low-level protocol handling code written by our vendor -- yes, we deal with protocols down to the bit level out here:
[code]
// Packer.h
#define PACK(n) attribute((packed,aligned(n)))
// ProtocolData.H
struct ProtocolTrailer
{
unsigned short CRC;
unsigned char ProtocolTerminator;
};// ProtocolHandler.H
class ProtocolHandler
{
public:
/* ... /
void SomeFunctionHandler(/ ... /);
/ ... /
void FixEndianIssues(unsigned short &);
};
// ProtocolHandler.C
void ProtocolHandler::SomeFunctionHandler(/ ... /)
{
ProtocolTrailer SomeProtocolTrailer;
/* do some stuff that sets SomeProtocolTrailer appropriately */
FixEndianIssues(SomeProtocolTrailer->CRC);
}
// The definition of FixEndianIssues is irrelevant, and not a WTF
// so it's not posted here (unless you demand it, that is)
[/code]So...we have handling code for a relatively obscure and railroading-specific protocol; like a good protocol, it uses a CRC for error checking. All the data in the protocol is little-endian -- likely because the protocol designers were dealing with little-endian CPUs in the field devices that speak it. This does not matter for most fields, as they are single bytes; however, the CRC is two bytes long, so it needs endianness handling. Of course, since the protocol structures represent an on-the-wire protocol, they are defined packed with an alignment of 1 -- this is what the macro-wrapped
__attribute__
gunk means.All this is fine and dandy, until you get to the call of
FixEndianIssues
. With the current declaration, the compiler vomits acannot bind packed field ‘Trailer->CRC’ to ‘short unsigned int&’
error, because the packing means that the CRC field may not be naturally aligned for the computer the code is running on, which causes a runtime crash on pretty much everything that's not an x86 PC.The best part? If they weren't concerned about portability, they wouldn't have run afoul of this portability-related bug, as x86 is a little-endian CPU architecture to begin with!
-
because the packing means that the CRC field may not be naturally aligned for the computer the code is running on, which causes a runtime crash on pretty much everything that's not an x86 PC.
I hate that shit.
Related: OSX wants functions locations to be 16-byte aligned, IIRC.
-
Does this platform not allow marking a pointer/reference variable as unaligned?
-
Does this platform not allow marking a pointer/reference variable as unaligned?
You actually can do that, but I found it simpler to simply refactor
FixEndianIssues
to pass and return by value, sincentohs
and friends don't faff with pointers/references, either. Besides: the extra halfword copies are likely cheaper than unaligned loads and stores!
-
Regardless of any other issues raised...
however, the CRC is two bytes long
If they weren't concerned about portability
unsigned short CRC;
Why is someone using a variable (depending on platform) width data type, instead of something like - say -
uint16_t
? :)
-
Maybe they're old skool?
I mean, look, they're also using
char
instead of, say,char8_t
...
-
But there is achar
is 8 bits by definition in C.*int8_t
if you need them.Edit: Yes - that's bollocks -
char
is at least 8 bits, it's how many bytes long it is that's defined, as 1.
-
char8_t
isn't a real type... but in C++11,char16_t
andchar32_t
are, for UTF-16 and UTF-32 respectively. They exist to remove the ambiguity ofwchar_t
which can be either 16 or 32 bits depending on the library(/compiler?).
-
Why is someone using a variable (depending on platform) width data type, instead of something like - say - uint16_t?
The original code uses some typedef that resolves to an unsigned short at the end of the day. I omitted the typedef from my example for clarity's sake.
-
Anonymization strikes again...
-
They exist to remove the ambiguity of wchar_t which can be either 16 or 32 bits depending on the library(/compiler?).
In C++,wchar_t
is a builtin type set by the compiler and the library has no say.
-
In C++, wchar_t is a builtin type set by the compiler and the library has no say.
Sure, but what if you write code that defines wchar_t as 16 bits and then use that code on a compiler that defines it the other way?
The old multiplayer game Galactic Bloodshed has a bizarro form of this, where it mostly expects to use lrand48, but if you compile it on a system without that set of functions (like Dos) it blithely uses a 16-bit random number generator. That causes a problem with a 200Kx200K grid: all your stars wind up in one corner of the map, and the universe generator falls into an infinite loop trying to place 'em.
It's basically the same problem.
-
@EvanED said:
In C++, wchar_t is a builtin type set by the compiler and the library has no say.
Sure, but what if you write code that defines wchar_t as 16 bits and then use that code on a compiler that defines it the other way?
As EvanED said, set by the compiler, library has no say. Different compiler, different size.
@FrostCat said:
The old multiplayer game Galactic Bloodshed has a bizarro form of this, where it mostly expects to use lrand48, but if you compile it on a system without that set of functions (like Dos) it blithely uses a 16-bit random number generator. That causes a problem with a 200Kx200K grid: all your stars wind up in one corner of the map, and the universe generator falls into an infinite loop trying to place 'em.
It's basically the same problem.
No, that's a case of a bad programmer. They created a fallback that fell over. No different than assuming code written for x86 is going to magically work correctly on x64.
char* p ...
DWORD dw = (DWORD)p;
Works on x86, not so much on x64.
-
I ended up checking. It's not the compiler that defines it, but the standard library. Specifically, in stddef.h.
You may say "those are one and the same!" but then someone like me will point out that compilers like clang can use either
libc++
orglibc
.
-
That header is supplied by the compiler, not the library! This is explained in the GCC manual as follows:
The ISO C standard defines (in clause 4) two classes of conforming implementation. A conforming hosted implementation supports the whole standard including all the library facilities; a conforming freestanding implementation is only required to provide certain library facilities: those in <float.h>, <limits.h>, <stdarg.h>, and <stddef.h>; since AMD1, also those in <iso646.h>; since C99, also those in <stdbool.h> and <stdint.h>; and since C11, also those in <stdalign.h> and <stdnoreturn.h>. In addition, complex types, added in C99, are not required for freestanding implementations.