A problem with big numbers

FrostCat

also because discourse tried using that for my smiley the first time.

The sad thing (about this site) is I assumed that was some kind of anti-internationalization emoji at first.

Steve_The_Cynic

Not quite all -- if you are on a specific platform, the platform C and C++ ABIs will make much more stringent guarantees about layout than the C and C++ standards can make. (The standards committee had to deal with wacko boxes like Cray vector machines.)

Otherwise, how would COM ever work? It's not cross-ABI portable -- but it never was intended to be, nor is the ugly trick you're pulling here.

Hey! Don't accuse me of pulling any such ugly trick! It wasn't code I would dream of writing. I thought they were maniacs for doing it, but maybe I'm just more sensitive than most about UB. But it was clear that the guilty parties (may they moulder in Heck) had no fear of UB, which is more than slightly scary.

Steve_The_Cynic

@delfinom said:

I mean, there's no definition of LHS or RHS subexpressions being evaluated first in the C standard (there is for a few operators but assignment is not one of them) which leads to the undefined behavior and quirkiness between platforms in this case.

That was more or less my point. There's no sequence point at the assignment operator, so in the original assignment expression as a whole, there is a write to bp and an unrelated read from bp without a sequence point between, and that, in a UB-avoiding world (the only sane kind of world), is a fundamental no-no...
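
The expression being argued about is not quoted in this part of the thread, so the sketch below uses a hypothetical expression of the same shape (a write to bp and an unrelated read of bp in one assignment). The helper name copyNextAndAdvance is invented for illustration; it only shows one way to give each read and write its own statement.

```cpp
// Hypothetical shape of the problem (not the original code):
//
//     *bp++ = bp[1];
//
// In C, and in C++ before C++17, the increment of bp on the left and the
// read of bp on the right are unsequenced, so the behaviour is undefined.

// Well-defined rewrite: every read and write of bp gets its own statement.
inline void copyNextAndAdvance(int*& bp) {
    const int next = bp[1];  // read through bp first
    *bp = next;              // then store through the still-unmodified bp
    ++bp;                    // and only then modify bp itself
}
```

Called on a pointer to the start of {1, 2, 3}, this stores 2 in the first slot and leaves the pointer on the second one, whatever the compiler or platform.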

Steve_The_Cynic

@tarunik said:

When was this? Somebody needs a sound thumping on the head with variadic templates and/or currying operators (look at the operator% Boost.Format uses for an example of the latter).

When? 2006 at the latest, so no C++11, and for historical (hysterical?) reasons there were sharp restrictions on the use of code that relied on new/delete (e.g. much of the STL, because specialised allocators can only be changed at compile time, being template parameters and all).

Working in that place taught me why you never, ever override operator& (address-of), and you never, ever write an operator cast-to-bool.(1) The former caused weird, weird, weird compiler errors when some nutjob added such an operator to a class, and some other nutjob who might be posting here as Steve The Cynic was using that class as a value type in a std::map. Cast-to-bool caused entertaining problems when added to a class that was being used as a key-type in a std::map, where the map suddenly became equivalent to a map of bool-to-value, with only two members possible.
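
A minimal sketch of the cast-to-bool problem, assuming a hypothetical key class Handle that has operator bool but no operator< (the class and the function countDistinctKeys are made up for illustration; the original class is not shown in the thread).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical key type: it has a cast-to-bool but no operator<.
struct Handle {
    int id;
    explicit Handle(int i) : id(i) {}
    operator bool() const { return id != 0; }
};

// std::map orders keys with std::less<Handle>, i.e. "a < b". With no
// operator< for Handle, both operands are converted to bool first, so the
// map can only tell "false" keys from "true" keys.
inline std::size_t countDistinctKeys() {
    std::map<Handle, std::string> m;
    m.insert(std::make_pair(Handle(0), std::string("zero")));
    m.insert(std::make_pair(Handle(1), std::string("one")));
    m.insert(std::make_pair(Handle(2), std::string("two")));    // rejected: equivalent to Handle(1)
    m.insert(std::make_pair(Handle(3), std::string("three")));  // rejected for the same reason
    return m.size();
}
```

Four different ids go in, but countDistinctKeys() reports only two entries, because every non-zero Handle compares equivalent to every other one.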

The correct operator to use is a cast to an unusable pointer type (such as a pointer to a private-declared, never-defined struct). When you need(2) to write if ( smartPointer ), the operator will be called because it is the best match (it converts to an if-able value) and it returns a reinterpret_cast of this to the private pointer when the object is "true" or NULL when the object is "false".

(1) "Never, ever" is a bit strong, but it's a good place to start with both these concepts.

(2) You might want to write this for so-called symmetry with naked pointers, but you don't need to. Remember this. CAVEAT: opinion: It's clearer if you write if ( !smartPointer.isNull() ) or if ( smartPointer.isNotNull() ).
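
The conversion described above could look roughly like this. SmartPointer, Unusable and describe are hypothetical names, and the class is stripped down to the conversion itself; it is a sketch of the idea, not the code from that workplace.

```cpp
#include <cstddef>

// Hypothetical smart pointer, just enough to show the conversion.
template <typename T>
class SmartPointer {
    struct Unusable;  // private, declared but never defined
    T* ptr_;

public:
    explicit SmartPointer(T* p = NULL) : ptr_(p) {}

    // Lets "if ( smartPointer )" compile without offering a cast-to-bool.
    operator const Unusable*() const {
        return ptr_ ? reinterpret_cast<const Unusable*>(this) : NULL;
    }

    // The more explicit spelling from footnote (2).
    bool isNull() const { return ptr_ == NULL; }
};

// Returns 1 for a non-null smart pointer, 0 for a null one.
inline int describe(const SmartPointer<int>& p) {
    if ( p ) { return 1; }
    return 0;
}
```

The returned pointer is only ever meant to be tested; since Unusable is never defined, nothing can dereference it.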

boomzilla

@Gaska said:

But as per that definition of UB, it's not it - because the code is perfectly unambiguous for the compiler and will do exactly the thing it says.

This is the equivalent of saying that there's no UB in C code dereferencing a null pointer because the code of the compiler is clear as to how it handles this situation. I'M STILL NOT TALKING ABOUT UB IN THE COMPILER. I'M TALKING ABOUT THE SPEC FOR THE APPLICATION.

@Gaska said:

since when I'm a recognizable person on this forum

Some people can't remember WTF they post themselves. I tend to have a decent feel for other posters and remember what sort of poster they are. You have a fairly distinct name, and I remember your long name bitching about how you couldn't get a squiggly over the a in your username.

Gąska

@boomzilla said:

This is the equivalent of saying that there's no UB in C code dereferencing a null pointer because the code of the compiler is clear as to how it handles this situation.

UB is about specification, not code.

@boomzilla said:

I'M STILL NOT TALKING ABOUT UB IN THE COMPILER. I'M TALKING ABOUT THE SPEC FOR THE APPLICATION.

UB is only when the specification says something is undefined. Which is different from when the specification doesn't specify something at all, and different from when the app works not according to spec.

@boomzilla said:

squiggly over the a

Under the a.

boomzilla

@Gaska said:

UB is only when the specification says something is undefined. Which is different from when the specification doesn't specify something at all, and different from when the app works not according to spec.

This is at least a reasonable explanation for not ruining the joke, but it wasn't clear why you were trying to ruin the joke earlier.

@Gaska said:

Under the a.

Oh, sorry, I think I have you confused with someone else, then.

Gąska

@boomzilla said:

You have a fairly distinct name, and I remember your long name bitching

@boomzilla said:

Oh, sorry, I think I have you confused with someone else, then.

Brillant.

tarunik

@Steve_The_Cynic said:

Hey! Don't accuse me of pulling any such ugly trick! It wasn't code I would dream of writing. I thought they were maniacs for doing it, but maybe I'm just more sensitive than most about UB. But it was clear that the guilty parties (may they moulder in Heck) had no fear of UB, which is more than slightly scary.

So, it's safe to say that you have never applied a 50' pole to COM, either?

@Steve_The_Cynic said:

When? 2006 at the latest, so no C++11, and for historical (hysterical?) reasons there were sharp restrictions on the use of code that relied on new/delete (e.g. much of the STL, because specialised allocators can only be changed at compile time, being template parameters and all).

Sounds like currying operators are your friend then, think along the lines of:

blah("$1, $2, $3") % arg1 % arg2 % arg3;

@Steve_The_Cynic said:

Working in that place taught me why you never, ever override operator& (address-of)

Agreed -- making objects that can't have their address taken is a recipe for brutally ugly surprises.

@Steve_The_Cynic said:

The correct operator to use is a cast to an unusable pointer type (such as a pointer to a private-declared, never-defined struct). When you need(2) to write if ( smartPointer ), the operator will be called because it is the best match (it converts to an if-able value) and it returns a reinterpret_cast of this to the private pointer when the object is "true" or NULL when the object is "false".

Good old operator void* has been used quite a bit as well -- it seems to work just as well?

Steve_The_Cynic

@tarunik said:

So, it's safe to say that you have never applied a 50' pole to COM, either?

I'm not a great fan of COM (COM itself is OK, but DCOM has issues, and the documentation is horrible, and the Microsoft developer support tools like ROTViewer are worse), and the way it is called in C++ does impose a restriction on how C++ compilers can implement virtual functions on COM-using environments. It's not an onerous restriction, but, for example, it might cause hilarity when code is compiled using cfront compared to a native C++ compiler. (cfront puts the vtable pointer at the end of the object, not the beginning. It's not clear to me whether a cfront-style compiler would insert a non-empty dummy part before the vtable pointer in a pure interface class. If it did, FOOM!)

@tarunik said:

@Steve_The_Cynic said:

When? 2006 at the latest, so no C++11, and for historical (hysterical?) reasons there were sharp restrictions on the use of code that relied on new/delete (e.g. much of the STL, because specialised allocators can only be changed at compile time, being template parameters and all).

Sounds like currying operators are your friend then, think along the lines of:

blah("$1, $2, $3") % arg1 % arg2 % arg3;

Hmm... Looks interesting.  Sort of like iostreams, but using a proper format string to, among other things, show a picture of the output...  Cool.

@tarunik said:

@Steve_The_Cynic said:

Working in that place taught me why you never, ever override operator& (address-of)

Agreed -- making objects that can't have their address taken is a recipe for brutally ugly surprises.

The problem it caused for me (someone else added the address-of operator to the class) was that it made the type of the address of the object be something other than "pointer to the typeof(the object)", and the deep-juju STL template jungle didn't appreciate that.

@tarunik said:

@Steve_The_Cynic said:

The correct operator to use is a cast to an unusable pointer type (such as a pointer to a private-declared, never-defined struct). When you need(2) to write if ( smartPointer ), the operator will be called because it is the best match (it converts to an if-able value) and it returns a reinterpret_cast of this to the private pointer when the object is "true" or NULL when the object is "false".

Good old operator void* has been used quite a bit as well -- it seems to work just as well?

Using a pointer to a private-declared not-defined struct has the advantage that maniacs who want to store the result have to think about what they are doing (and they would normally end up storing it in a bool). operator void* tempts one to store the object in a void * which could then be used elsewhere. (Alternatively, operator void* allows one to accidentally store it in a void * variable, leading to hysterical results a bit later on. This might suggest that the correct operator is operator volatile const void *.)

Bulb

@tarunik said:

blah("$1, $2, $3") % arg1 % arg2 % arg3;

This is called Boost.Format or Boost.Locale.Format. The former is backward compatible with printf and uses system locale and has a .str() method, the latter has additional formats for date, time and money and uses locale of the output stream used, so it can only print to ostream. Both work by configuring the stream according to the format flags and using operator<<, so they can print any type that has that operator.
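
To show the mechanism without pulling in Boost, here is a toy formatter in the same spirit. TinyFormat is a hypothetical class, not Boost's API: it uses "{}" as a sequential placeholder for simplicity, and, like the real libraries, prints each argument with operator<<.

```cpp
#include <sstream>
#include <string>

// Toy "currying" formatter: each operator% call fills the next "{}" hole.
class TinyFormat {
    std::string fmt_;
    std::ostringstream out_;
    std::string::size_type pos_;  // first unconsumed character of fmt_

public:
    explicit TinyFormat(const std::string& fmt) : fmt_(fmt), pos_(0) {}

    template <typename T>
    TinyFormat& operator%(const T& value) {
        const std::string::size_type hole = fmt_.find("{}", pos_);
        if (hole == std::string::npos) { return *this; }  // surplus argument: ignored
        out_ << fmt_.substr(pos_, hole - pos_) << value;  // any type with operator<<
        pos_ = hole + 2;
        return *this;  // returning *this is what makes the chaining work
    }

    // Output so far, plus whatever is left of the format string.
    std::string str() const { return out_.str() + fmt_.substr(pos_); }
};
```

With this sketch, (TinyFormat("{}, {}, {}") % 1 % "two" % 3.5).str() yields "1, two, 3.5".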

At work I use the former for log messages, with an additional wrapper defining operator, (the comma operator) so the macros originally forwarding to fprintf didn't have to change.

For the latter I have my own version that uses #,python-brace-format (easier to understand for translators), method chaining (like format("{count} {type} items").arg("count", count).arg("type", type), because each needs two arguments), has some special formats (like distance that automatically chooses between m, km, ft, yd and mi and number of decimal digits depending on value and locale setting) and does the locale stuff with custom code, because Android does not have locale data available in C/C++ and WinCE doesn't have a C++ locale library at all.

@Steve_The_Cynic said:

unusable pointer type (such as a pointer to a private-declared, never-defined struct)

Whether it is private does not matter, because access control applies to names, not to types; types just are. So e.g. a template will happily bind to it (but won't be able to dereference it, of course). So this is more or less equivalent to

@tarunik said:

Good old operator void*

@Steve_The_Cynic said:

This might suggest that the correct operator is operator volatile const void *

The best option seems to be a pointer to a member of the smart pointer. Templates and auto will still bind it, but it has even fewer conversions. But I once got an ICE from MSC++ using it, so I settled with void * too.

Of course in C++11 the explicit operator bool does the trick.
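
The three options can be compared side by side with std::is_convertible. All class and function names below are hypothetical, and the sketch assumes a C++11 compiler.

```cpp
#include <type_traits>

// Variant 1: operator void*.
struct ViaVoidPtr {
    bool ok;
    explicit ViaVoidPtr(bool b) : ok(b) {}
    operator void*() const { return ok ? const_cast<ViaVoidPtr*>(this) : nullptr; }
};

// Variant 2: pointer to a member of the class itself.
class ViaMemberPtr {
    void truthy() const {}
    typedef void (ViaMemberPtr::*BoolLike)() const;
    bool ok_;
public:
    explicit ViaMemberPtr(bool b) : ok_(b) {}
    operator BoolLike() const { return ok_ ? &ViaMemberPtr::truthy : nullptr; }
};

// Variant 3: C++11 explicit operator bool.
struct ViaExplicitBool {
    bool ok;
    explicit ViaExplicitBool(bool b) : ok(b) {}
    explicit operator bool() const { return ok; }
};

// All three work where a condition is expected...
template <typename T>
bool isSet(const T& v) { return v ? true : false; }

// ...but only the first silently turns into a void* argument or variable.
template <typename T>
constexpr bool leaksToVoidPtr() { return std::is_convertible<T, void*>::value; }
```

Here leaksToVoidPtr<ViaVoidPtr>() is true and the other two are false, which is the foo(void*) hazard in a nutshell. The member-pointer variant still converts implicitly to bool; only the explicit operator refuses that outside a condition.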

chubertdev

I'm glad that I can use Python's "%s, %s, %s".format(arg1, arg2, arg3)

Bulb

@chubertdev said:

Python's "%s, %s, %s".format(arg1, arg2, arg3)

Sorry, you've got that mixed up. It's either

"{}, {}, {}".format(arg1, arg2, arg3)

(why would it otherwise be called #,python-brace-format?) or

"%s, %s, %s" % (arg1, arg2, arg3)

which is plain old #,c-format, obsolete, but appears to still be available in Python 3.4 (and has the %r format for using repr; the new format spells that {!r}).

chubertdev

Whoops, you're right. Guess who's been working in 2.4 recently.

EvanED

@tarunik said:

Good old operator void* has been used quite a bit as well -- it seems to work just as well?

No offense, but that sounds like an awful idea... imagine interfacing with some C function foo(void* p). Now foo(myobj) compiles when you meant to say foo(&myobj), and will probably crash at some random time.

tarunik

@EvanED said:

No offense, but that sounds like an awful idea... imagine interfacing with some C function foo(void* p). Now foo(myobj) compiles when you meant to say foo(&myobj), and will probably crash at some random time.

It returns reinterpret_cast<void*>(this), so that blooper is accounted for already. It's usually not a concern anyway for the classes that implement it -- they're quite non-POD, as a general rule, and thus are never brought close to C APIs in that way.

Bulb

@EvanED said:

foo(void* p)

That's exactly the reason why member pointers are better. They don't have this conversion. But as tarunik correctly says, this kind of object is never

@tarunik said:

brought close to C APIs in that way

anyway.

ben_lubar

@Bulb said:

That's exactly the reason why member pointers are better.

Go conversions are always explicit, so assigning a uintptr value to a string variable and vice versa are illegal operations.

Go Playground - The Go Programming Language

chubertdev

@ben_lubar said:

Go conversions are always explicit

Do they talk about Belgium?

Bulb

@ben_lubar said:

Go conversions are always explicit

The real solution is reference types that can't be null. So far only Haskell and Rust do that (as far as I know) and the former is referentially transparent, so it does not really distinguish reference types.

dkf

@Bulb said:

The real solution is reference types that can't be null. So far only Haskell and Rust do that (as far as I know) and the former is referentially transparent, so it does not really distinguish reference types.

There are other languages that are NULL-free. Tcl completely lacks NULL and the real type logic is that of immutable references without value identity — you can't ask whether two references refer to the same identity, merely whether they hold equal values — and strings are the base type that all other types are subtypes of (which isn't to say that all values are currently represented by a character sequence) which is why some people claim that everything in Tcl is a string; it's true in the type logic and the type mutation logic. Indeed, NULL in many ways represents a complete absence of information: there is no variable, there is no mapping in the dictionary, there is no such information. Lots of Tcl programmers use the empty string a bit like it is a NULL, but it's just a regular value with regular value semantics.

Tcl is a language with entirely eager evaluation and operational semantics. It can pretend to do other things. :-)

Bulb

@dkf said:

Tcl

True. I know tcl a bit, but I was thinking about more usual strongly typed languages so tcl (or shell for that matter) didn't cross my mind.

antiquarian

@Bulb said:

The real solution is reference types that can't be null. So far only Haskell and Rust do that (as far as I know) and the former is referentially transparent, so it does not really distinguish reference types.

Haskell doesn't have null values per se, but functions that return something that may not have a value will typically use a Maybe type:

data Maybe a = Just a | Nothing

The compiler will warn you if you forget to specify what happens if Nothing comes out of the function you're calling.

Bulb

@antiquarian said:

Maybe

Of course, there are cases where you do want to permit "null" somewhere, so there has to be an option for it. Rust has an analogous type. But it efficiently prevents running into nulls from silly bugs and that's the point.

dkf

@Bulb said:

True. I know tcl a bit, but I was thinking about more usual strongly typed languages so tcl (or shell for that matter) didn't cross my mind.

Oh, it's not a general thing across all scripting languages. Perl, Python and Ruby all have a null-like “value”. The whole strong/weak typing thing is independent of reference nullability.

ben_lubar

What's the default value of a reference to, say, a file handle if I don't set it?

Bulb

@ben_lubar said:

What's the default value of a reference to, say, a file handle if I don't set it?

In Haskell and Rust there is simply no way to create a reference without setting it (in Rust it may be possible to declare it, but it can't be used until the compiler is sure it was set). If you need a reference that can be unset e.g. due to error, you need the

@antiquarian said:

data Maybe a = Just a | Nothing

nullable type or some other kind of error handling (Rust has exceptions, Haskell does not so there the Maybe monad is usually the only option). And the nullable type can't be converted to the non-nullable base one without saying what happens if it is null.

chubertdev

@Bulb said:

Of course, there are cases where you do want to permit "null" somewhere, so there has to be an option for it. Rust has an analogous type. But it efficiently prevents running into nulls from silly bugs and that's the point.

That's the problem with a lot of code. The "null" value should be the exception, not the rule. Similar to nullable-columns in a database.

dkf

@chubertdev said:

The "null" value should be the exception, not the rule.

As long as it means “this information is not there”, it's not too awful. To be fair, that absence might in turn be imbued with higher-level meaning (which should be documented) but the basic bit must relate to absence of info.

chubertdev

@dkf said:

As long as it means “this information is not there”, it's not too awful. To be fair, that absence might in turn be imbued with higher-level meaning (which should be documented) but the basic bit must relate to absence of info.

Exactly. null with meaning, not null by default.

ben_lubar

Still, that brings the question of what your default value is. If it's not null, it's some other value with the same semantics.

Bulb

There ain't no eFfing default value. The compiler statically checks that the value is defined and flags the code as invalid if it may not be and you didn't explicitly request the nullable variant.

Bulb

Value types (like int) don't have any value with "same semantics as null" either, in any language that has them.

Tcl has value semantics for everything (and is weakly typed; everything is, formally, a string and interpretation as anything else depends on the operation, not the value) and so does Haskell (except that it is strongly typed). So there is no place for null.

Rust has explicit reference semantics for pointer types, but the main point of Rust is the ability to do static checking ensuring that references are guaranteed to point to valid objects when they are used.

error

@Bulb said:

Value types (like int) don't have any value with "same semantics as null" either, in any language that has them.

TypeScript:

var foo: number = null;
var bar: boolean = null;

Bulb

@error said:

@Bulb said:
Value types … in any language that has them.

TypeScript

…ain't one of those.

The fact that a language has a number or boolean type does not mean it treats it as a value type. TypeScript apparently does not.

error

@Bulb said:

@error said:
TypeScript

…ain't one of those.

The fact that a language has a number or boolean type does not mean it treats it as a value type. TypeScript apparently does not.

It is a value type. It's just one of those newfangled languages that compile down to JavaScript. In JavaScript a variable doesn't have a fixed type, it has the type of whatever it was assigned last. TypeScript offers a bit more type safety, but has to respect that any variable can be assigned null or undefined (which are types in their own right) because the underlying language allows it.

Tl;dr: In JavaScript assigning null to a variable changes its type.

Filed under: typeof NaN === 'number', x = 5, x = null, typeof x === 'object'

Bulb

@error said:

It is a value type

I'd say JavaScript does not really have those; but it's true that numbers compare by value, not identity, so they are not really reference types either. They are kind of somewhere in between.

@error said:

In JavaScript assigning null to a variable changes its type.

In JavaScript variables don't have types, objects do.

It does not, however, prevent the languages that compile to JavaScript from having strongly typed non-nullable variables. Many languages can be compiled to JavaScript and many of them have such variables. TypeScript is intended to only be a thin layer on top of JavaScript rather than a full separate language so it does not attempt it.

OffByOne

@Bulb said:

[Numbers] are kind of somewhere in between.

Musaran

@Gaska said:

@Steve_The_Cynic said:
Treating random patches of memory as if they were C++ objects by creating C-like char[] buffers and then ObjectType &refvar = (ObjectType &)c_like_buffer_variable;.(1)

This is the only way to handle variable-sized structures.

They should at least have used the right tool, placement new :

ObjectType& refvar = * new(&c_like_buffer_variable) ObjectType();
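
A slightly fuller sketch of the placement-new approach, assuming C++11. ObjectType's members and the function constructInBuffer are invented for the example; the two details worth noting are that the buffer has to be suitably aligned and that the destructor has to be called by hand.

```cpp
#include <new>
#include <string>

// Hypothetical non-trivial type standing in for the thread's ObjectType.
struct ObjectType {
    std::string name;
    int value;
    ObjectType() : name("default"), value(42) {}
};

inline int constructInBuffer() {
    // Raw storage with the right size and alignment for one ObjectType.
    alignas(ObjectType) unsigned char buffer[sizeof(ObjectType)];

    // Placement new runs the constructor inside the buffer.
    ObjectType* obj = new (buffer) ObjectType();
    const int result = obj->value + static_cast<int>(obj->name.size());

    // No delete here: the storage is not heap memory, so destroy explicitly.
    obj->~ObjectType();
    return result;
}
```

Unlike the reference cast, this gives the constructor a chance to run, so name and value hold real values before anything reads them.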

Gąska

@Musaran said:

They should at least have used the right tool, placement new :

Only if the class has a non-trivial constructor. Because otherwise, why bother?

tar

@Gaska said:

Only if the class has a non-trivial constructor. Because otherwise, why bother?

Because it's one less thing to change when the class inevitably acquires a non-trivial constructor sometime down the line?

Gąska

@tar said:

Because it's one less thing to change when the class inevitably acquires a non-trivial constructor sometime down the line?

In C++, you either design a class as POD, or you design it as non-POD. Changing POD into non-POD has so many consequences that only someone who has no idea what a POD is would do such a thing.

Remember: a trivial constructor is better than a non-trivial one - until you need initialization logic. A good C++ programmer designs his classes such that as few classes as possible have mandatory initialization logic.

tar

@Gaska said:

Changing POD into non-POD has so many consequences that only someone who has no idea what a POD is would do such a thing.

So you're suggesting the change request originated outside of engineering? That seems valid.

Jaloopa

@tar said:

someone who has no idea what a POD is

tar

Plain Ol' Data structure (or a C-style struct with no member functions)

Their main advantage is that you can use C-style idioms with them, and pretend it's still 1990 in your codebase, which I'm sure will delight your coworkers.

Jaloopa

Ah, like a POCO/POJO. Suspected from context that it was something along those lines

tar

As seen in the wild, from a developer who should've known better...



class Whatever {
    //...
    Whatever(const Whatever &other) {
        memcpy(this, &other, sizeof(*this));
    }
    virtual ~Whatever() { /*...*/ }
    //...
};

"Help! I added a virtual function to my class, and my code is crashing really strangely now!"

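If a raw byte copy really is wanted, one way to stop it from silently surviving the addition of a virtual function is to let the compiler check the type first. This is a sketch assuming C++11; bitwiseCopy and Plain are hypothetical names.

```cpp
#include <cstring>
#include <type_traits>

// Byte-wise copy that refuses to compile for types where it is not safe,
// e.g. anything with a virtual function or a user-written copy constructor.
template <typename T>
void bitwiseCopy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "bitwiseCopy needs a trivially copyable type");
    std::memcpy(&dst, &src, sizeof(T));
}

// A plain struct passes the check.
struct Plain {
    int a;
    double b;
};
```

Calling bitwiseCopy on a class like Whatever above would then fail at compile time instead of crashing strangely at run time.
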
tarunik

@tar said:

Plain Ol' Data structure (or a C-style struct with no member functions)

Their main advantage is that you can use C-style idioms with them, and pretend it's still 1990 in your codebase, which I'm sure will delight your coworkers.

You can have member functions without disturbing PODness, you just need to maintain both a) triviality and b) standard-layout-ness (see the C++ standard for the gory details).
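
Under C++11 the two properties can be checked directly with type traits. The helper isPodLike and the three structs are hypothetical, chosen only to show one type that passes and two that do not.

```cpp
#include <type_traits>

// "POD" in the C++11 sense: trivial and standard-layout.
template <typename T>
constexpr bool isPodLike() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

// A member function does not disturb either property.
struct WithMemberFunction {
    int x;
    int y;
    int sum() const { return x + y; }
};

// A user-provided default constructor makes the type non-trivial.
struct WithUserCtor {
    int x;
    WithUserCtor() : x(0) {}
};

// A virtual function breaks both triviality and standard layout.
struct WithVirtual {
    int x;
    virtual ~WithVirtual() {}
};
```

So isPodLike<WithMemberFunction>() is true, while the other two are false.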

@Jaloopa said:

Ah, like a POCO/POJO. Suspected from context that it was something along those lines

Not quite -- PODness is more of a language-interop issue than a library-interop issue.

tar

@tarunik said:

You can have member functions without disturbing PODness, you just need to maintain both a) triviality and b) standard-layout-ness (see the C++ standard for the gory details).

If we're going to start splitting these hairs, which revision of the C++ standard are we working against?

tarunik

@tar said:

If we're going to start splitting these hairs, which revision of the C++ standard are we working against?

C++11 or C++14 (for the former, I use the N3337 draft as a proxy)