NetBeans vs UTF-8

Gąska

ISO/IEC 9899:1990 says:

6.1.2.5 Types
(...)
A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.

Note that an ISO "shall" is an absolute requirement, on par with RFC 2119's MUST, not a recommendation like SHOULD.

6.2.2.3 Pointers
A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again: the result shall compare equal to the original pointer.

So you were right.
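For illustration, a minimal sketch of the two guarantees quoted above, written as C++ so the cast back from void* is explicit. `round_trips` and `demo_round_trip` are made-up helper names, not anything from either standard:

```cpp
#include <cassert>

// Both follow from "same representation and alignment requirements".
static_assert(sizeof(void*) == sizeof(char*), "same representation implies same size");
static_assert(alignof(void*) == alignof(char*), "same alignment requirement");

// Hypothetical helper: sends an object pointer through void* and back.
template <typename T>
bool round_trips(T* p) {
    void* erased = p;                   // object pointer -> void* is implicit
    T* back = static_cast<T*>(erased);  // C++ wants an explicit cast on the way back
    return back == p;                   // the standard requires these to compare equal
}

inline void demo_round_trip() {
    int i = 42;
    double d = 1.5;
    assert(round_trips(&i));
    assert(round_trips(&d));
}
```

Note that this only covers object pointers; nothing here says anything about function pointers.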

Now, let's see how wrong I was about void in C++...

Edit:
ISO/IEC 14882:1998 said:

3.9.2 Compound types
(...)
4. Objects of cv-qualified (3.9.3) or cv-unqualified type void* (pointer to void), can be used to point to objects of unknown type. A void* shall be able to hold any object pointer. A cv-qualified or cv-unqualified (3.9.3) void* shall have the same representation and alignment requirements as a cv-qualified or cv-unqualified char*.

Steve_The_Cynic

@Gąska said in NetBeans vs UTF-8:

And 0 cast to pointer is guaranteed to be null pointer, though it doesn't exactly say what the byte representation should be.

Indeed it doesn't. It's a special behaviour of an integer literal constant whose compile-time value is zero that it becomes the platform's NULL pointer value, if and only if it is used in a pointer context. This causes some fun on exotic platforms (e.g. large memory model 16-bit x86(1) where int is 16 bits and pointers are structured 32-bit values) if you pass the zero to a varargs parameter. The key point, anyway, is that the said NULL pointer value is not constrained to be all-zeroes.

(1) These days it's exotic. It was very common 25+ years ago.
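A rough sketch of the varargs trap described above. `count_strings` is a hypothetical function invented for this example; the point is only that a variadic parameter is not a "pointer context", so a bare 0 travels as an int:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>

// Hypothetical helper: counts C strings up to a null-pointer sentinel.
std::size_t count_strings(const char* first, ...) {
    std::size_t n = 0;
    va_list args;
    va_start(args, first);
    for (const char* s = first; s != nullptr; s = va_arg(args, const char*)) {
        ++n;
    }
    va_end(args);
    return n;
}

inline void demo_count_strings() {
    // Safe: the sentinel is explicitly a pointer, so a pointer-sized
    // null value is what actually gets passed.
    assert(count_strings("a", "b", "c", static_cast<const char*>(nullptr)) == 3);

    // Unsafe (deliberately not executed): count_strings("a", "b", "c", 0);
    // Here 0 is passed as an int. Where int and pointers differ in size or
    // representation, va_arg(args, const char*) reads garbage.
}
```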

Vixen

@PleegWat said in NetBeans vs UTF-8:

@Vixen

#define tolower(c) ((c)>='A'&&(c)<='Z'?c+0x20:c)

that's......

That's horrifying.....

just..... horrifying....
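To spell out one reason it's horrifying: the macro evaluates its argument several times (and leaves `c` unparenthesized in both result branches). A small sketch, with the macro copied under the made-up name `BAD_TOLOWER` so it doesn't clash with the real `tolower`, and an ASCII-only replacement function that is likewise just an illustration:

```cpp
#include <cassert>

// Same body as the macro quoted above, renamed for this example.
#define BAD_TOLOWER(c) ((c)>='A'&&(c)<='Z'?c+0x20:c)

// Counts how many times the macro evaluates its argument for a given input.
inline int evaluations_for(char ch) {
    int calls = 0;
    auto next = [&calls, ch]() { ++calls; return ch; };
    (void)BAD_TOLOWER(next());
    return calls;
}

// A function evaluates its argument exactly once. Assumes an ASCII-style
// character set where 'A'..'Z' and 'a'..'z' are contiguous.
constexpr char ascii_tolower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + ('a' - 'A')) : c;
}

inline void demo_bad_tolower() {
    assert(evaluations_for('Q') == 3);  // two comparisons plus the result branch
    assert(evaluations_for('7') == 2);  // first comparison short-circuits
    assert(ascii_tolower('Q') == 'q');
}
```

So something like `BAD_TOLOWER(*p++)` would advance the pointer two or three times per call.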

Steve_The_Cynic

@Gąska said in NetBeans vs UTF-8:

Object pointers aren't guaranteed to round-trip with function pointers, though.

See e.g. compact or medium memory models in 16-bit x86. I don't remember which way round it is (because it's 25 years since I needed to know), but in both cases the two classes of pointer (function and data) are different sizes, either 16-bit offset-only or 16:16 segment-and-offset.

dkf

@Vixen It can get worse.

#define tolower(c) (c^(c>'@'?c<'['?' ':0:0))
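For anyone squinting at that one: in ASCII, upper and lower case letters differ only in bit 0x20, which happens to be the value of the space character, so XOR-ing with ' ' flips the case. A sketch of the same logic as a function (`xor_tolower` is a made-up name, and ASCII is assumed throughout):

```cpp
#include <cassert>

static_assert(' ' == 0x20, "this sketch assumes ASCII");
static_assert(('A' ^ ' ') == 'a', "case differs only in bit 0x20");

// Same idea as the macro: flip bit 0x20 only for characters strictly
// between '@' and '[', i.e. 'A'..'Z'.
constexpr char xor_tolower(char c) {
    return static_cast<char>(c ^ ((c > '@' && c < '[') ? ' ' : 0));
}

inline void demo_xor_tolower() {
    assert(xor_tolower('A') == 'a');
    assert(xor_tolower('a') == 'a');  // already lower case, left alone
    assert(xor_tolower('[') == '[');  // just outside the range
}
```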

dkf

Greater sins would be possible if you know that char is unsigned. "Fortunately", that's rare.

Vixen

@dkf said in NetBeans vs UTF-8:

@Vixen It can get worse.
#define tolower(c) (c^(c>'@'?c<'['?' ':0:0))

#define toLower(c) (while(c) fork();)

wait.... no, that won't work as while is a block statement and i need an expression in there.... hold on, i used to know how to do this.....

umm.... man it's been like.... almost 20 years since i touched C..... memory's rather foggy....

Gąska

@Vixen it works if you run Scala through C preprocessor, though!

PleegWat

@dkf said in NetBeans vs UTF-8:

@Vixen It can get worse.
#define tolower(c) (c^(c>'@'?c<'['?' ':0:0))

#define hexchar(c) ((((c)&0x1F)+9)%25)
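Working through the arithmetic, assuming ASCII: masking with 0x1F maps '0'..'9' to 16..25 and both 'A'..'F' and 'a'..'f' to 1..6; adding 9 gives 25..34 and 10..15; the modulo 25 then folds the digits down to 0..9 and leaves the letters alone. A sketch of the same expression as a function (`hex_digit_value` is a made-up name):

```cpp
#include <cassert>

// Same expression as the hexchar macro, as a constexpr function.
// There is no validation: non-hex input yields a meaningless number.
constexpr int hex_digit_value(char c) {
    return ((c & 0x1F) + 9) % 25;
}

static_assert(hex_digit_value('0') == 0, "'0' -> 16 -> 25 -> 0");
static_assert(hex_digit_value('9') == 9, "'9' -> 25 -> 34 -> 9");
static_assert(hex_digit_value('a') == 10, "'a' -> 1 -> 10 -> 10");
static_assert(hex_digit_value('F') == 15, "'F' -> 6 -> 15 -> 15");

inline void demo_hex_digit_value() {
    // Parse the two-digit hex string "7f".
    assert(hex_digit_value('7') * 16 + hex_digit_value('f') == 127);
}
```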

dcon

@topspin said in NetBeans vs UTF-8:

but you don't need to write these all that often.

Yeah, I just need to copy/paste them! (actually, include the header I stuck them in). My favorite stringizing macros:

#define STRING2(x) #x
#define STRING(x) STRING2(x)
#define FILE_LINE __FILE__ "(" STRING(__LINE__) ") : "
#define PRAGMA_MESSAGE(x) message( FILE_LINE #x )
#define PRAGMA_TODO(x) message( FILE_LINE "TODO: " #x )
#define PRAGMA_FIXME(x) message( FILE_LINE "FIXME: " #x )

Then I can do:
#pragma PRAGMA_FIXME(you idiot)
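The reason for the STRING2/STRING pair, as a small sketch: `#x` stringizes the argument exactly as written, so a macro argument only gets expanded if it passes through one extra level first. `ANSWER` is a made-up macro for this example:

```cpp
#include <cassert>
#include <cstring>

#define STRING2(x) #x
#define STRING(x) STRING2(x)

#define ANSWER 42  // hypothetical macro, only here to have something to expand

inline void demo_stringize() {
    // One level: the argument is stringized as written, unexpanded.
    assert(std::strcmp(STRING2(ANSWER), "ANSWER") == 0);
    // Two levels: ANSWER is expanded to 42 before STRING2 stringizes it.
    assert(std::strcmp(STRING(ANSWER), "42") == 0);
}
```

That's why FILE_LINE uses STRING(__LINE__): with STRING2 alone you'd get the literal text "__LINE__" instead of the line number.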

topspin

@Gąska said in NetBeans vs UTF-8:

Now I'm digging through C90 to see if it was different back then, or if Cray was simply non-compliant.

My money's on the latter.

Fake edit: you've dug up the references already.

@dkf said in NetBeans vs UTF-8:

It was exactly the other way round, as internally that platform implemented char * as a struct containing a pointer (to machine word) and an offset within the word.

What happened when you converted that char* to void* and passed it to e.g. memcpy et al.? Magic smoke?!

topspin

@dcon said in NetBeans vs UTF-8:

@topspin said in NetBeans vs UTF-8:

but you don't need to write these all that often.

Yeah, I just need to copy/paste them! (actually, include the header I stuck them in).

I've got this file in one project that is basically a macro-based code generator. It's like 500 lines of voodoo, of which the first 200 are comments explaining:

1. Don't touch this.
2. Here's how it works, in excruciating detail, if you do touch this.
3. Don't touch this.

Its interface is just two simple commands, so it's stuffed away to never be looked at, and points 1/3 apply to myself as well.
I'm at the same time proud and ashamed of it.

Gąska

@topspin said in NetBeans vs UTF-8:

@dkf said in NetBeans vs UTF-8:

It was exactly the other way round, as internally that platform implemented char * as a struct containing a pointer (to machine word) and an offset within the word.

What happened when you converted that char* to void* and passed it to e.g. memcpy et al.? Magic smoke?!

Memory was allocated at word granularity, so no, it would all work.

LB_

~~C++~~

@PleegWat said in NetBeans vs UTF-8:

When you want to use sizeof, like in memset(o, 0, sizeof *(o))

template<typename T> void zero_out(T *t) noexcept { std::memset(t, 0, sizeof(T)); }?

EDIT: Oops, my bad, I didn't realize we were talking C

PleegWat

@LB_ C++

AyGeePlus

I am so glad I don't do C or C++ anymore.

dkf

@topspin said in NetBeans vs UTF-8:

@dkf said in NetBeans vs UTF-8:

It was exactly the other way round, as internally that platform implemented char * as a struct containing a pointer (to machine word) and an offset within the word.

What happened when you converted that char* to void* and passed it to e.g. memcpy et al.? Magic smoke?!

I don't know exactly. It wasn't a platform that I could access directly myself. I'm guessing that the vendor was declaring memcpy() with char* arguments? It was a non-C89 platform, so it had some other weird practices too...

Still not as bad as the segmented memory model of the 8086.

dfdub

@Steve_The_Cynic said in NetBeans vs UTF-8:

On AS/400, you see, a NULL pointer is not the all-zeroes bitpattern(1), so if o points to a structure that contains pointers, the memset will not set them to NULL.

Which is why you should always use = {0} instead of memset.
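A sketch of the difference, using a made-up `Node` struct. The initializer form asks the compiler for "zero of the right type" per member, so pointers come out as real null pointers; memset just writes zero bytes, which is only the same thing where null happens to be all-bits-zero:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical structure containing a pointer.
struct Node {
    int id;
    double weight;
    Node* next;
};

// Initializer form: id is 0 and the remaining members are value-initialized,
// so next is a genuine null pointer whatever its bit pattern is.
inline Node zeroed_by_initializer() {
    Node n = {0};
    return n;
}

// memset form: every byte becomes 0. Whether next then compares equal to a
// null pointer depends on the platform's null representation.
inline Node zeroed_by_memset() {
    Node n;
    std::memset(&n, 0, sizeof n);
    return n;
}

inline void demo_zeroing() {
    Node n = zeroed_by_initializer();
    assert(n.id == 0 && n.weight == 0.0 && n.next == nullptr);
}
```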

TheCPUWizard

@AyGeePlus said in NetBeans vs UTF-8:

I am so glad I don't do C or C++ anymore.

Modern C++ is quite different from what it was 10 years ago. If it has been more than 3-5 years since you [not AyGeePlus, specifically] studied C++, your opinion quite likely no longer matches current facts.

Deadfast

@TheCPUWizard That is very much true. C++17 compared to C++03 is almost a different language. A lot of the basic concepts like threading and filesystem you used to need a separate library or system-level API for are now a part of the standard library.

hungrier

@Deadfast said in NetBeans vs UTF-8:

almost a different language

I've had very little C++ experience outside of university, but from what I've seen, any two random C++ projects look like they're written in a completely different language, regardless of versions of C++.

Tsaukpaetra

@Steve_The_Cynic said in NetBeans vs UTF-8:

One day, you'll have to write code on an AS/400, today called "iSystem". You will abruptly learn a painful lesson about portability.

FUCK THAT NOISE IMMEDIATELY!

error

@Deadfast said in NetBeans vs UTF-8:

We've had these for ages. In fact modern C++ compilers pretty much completely ignore the inline keyword (outside of the one definition rule) and rely on their own assessment of whether inlining the function is beneficial.
The new cool kid on the block is constexpr which guarantees zero runtime cost.

constexpr_real_inline_i_mean_it

topspin

@error said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

We've had these for ages. In fact modern C++ compilers pretty much completely ignore the inline keyword (outside of the one definition rule) and rely on their own assessment of whether inlining the function is beneficial.
The new cool kid on the block is constexpr which guarantees zero runtime cost.

~~constexpr_real_inline_i_mean_it~~ consteval

Close enough.

Deadfast

@topspin Actually, what he is looking for is __forceinline. Or __attribute__((always_inline)). Or whatever the hell else the compiler decides on because this is actually a non-standard keyword.

You know when I said C++17 is almost a different language? I did say almost.
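A rough sketch of the options mentioned in this exchange. `FORCE_INLINE`, `square`, `cube` and `add_one` are made-up names, and the macro only covers the compilers named above. Worth noting that constexpr forces compile-time evaluation only where a constant expression is required (such as a static_assert), while consteval (C++20) requires it on every call:

```cpp
#include <cassert>

// Hypothetical portability macro around the non-standard spellings.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to the plain hint
#endif

// constexpr: may run at compile time, must do so in a constant expression.
constexpr int square(int x) { return x * x; }
static_assert(square(4) == 16, "evaluated at compile time in this context");

// consteval: every call must be a compile-time constant (C++20 only).
#if defined(__cpp_consteval)
consteval int cube(int x) { return x * x * x; }
static_assert(cube(3) == 27, "always evaluated at compile time");
#endif

// Inlining request using whatever spelling the compiler understands.
FORCE_INLINE int add_one(int x) { return x + 1; }

inline void demo_inline_options() {
    int runtime_value = 5;
    assert(square(runtime_value) == 25);  // same function, called at run time
    assert(add_one(runtime_value) == 6);
}
```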

Gąska

@Deadfast said in NetBeans vs UTF-8:

@TheCPUWizard That is very much true. C++17 compared to C++03 is almost a different language. A lot of the basic concepts like threading and filesystem you used to need a separate library or system-level API for are now a part of the standard library.

And C++20 will have modules and concepts. It's going to be as big a change as C++11.

Deadfast

@Gąska Yes, I can't wait for that one. Essentially copy-pasting files into each other is one of the biggest remaining issues with C++.

topspin

@Gąska said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

@TheCPUWizard That is very much true. C++17 compared to C++03 is almost a different language. A lot of the basic concepts like threading and filesystem you used to need a separate library or system-level API for are now a part of the standard library.

And C++20 will have modules and concepts. It's going to be as big a change as C++11.

I'm still unconvinced concepts aren't a big practical joke on how awful syntax they can propose before it being rejected. The claimed "results in better error messages" has also been disputed.

I'm more interested in coroutines, maybe in another decade it'll catch up to C#.

Gąska

@Deadfast said in NetBeans vs UTF-8:

@Gąska Yes, I can't wait for that one. Essentially copy-pasting files into each other is one of the biggest remaining issues with C++.

Well, it looks like headers aren't going away anytime soon...

C++ Modules Might Be Dead-on-Arrival

@topspin said in NetBeans vs UTF-8:

@Gąska said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

@TheCPUWizard That is very much true. C++17 compared to C++03 is almost a different language. A lot of the basic concepts like threading and filesystem you used to need a separate library or system-level API for are now a part of the standard library.

And C++20 will have modules and concepts. It's going to be as big a change as C++11.

I'm still unconvinced concepts aren't a big practical joke on how awful syntax they can propose before it being rejected. The claimed "results in better error messages" has also been disputed.

It's not about error messages. It's about having a way to clearly state the requirements for the generic types of templates so you don't have to use billion enable_ifs and hope you've got it right and the compilation will be failing in the exact way you want it to fail (also known as SFINAE).
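To make the comparison concrete, a minimal sketch of the same constraint written both ways. `twice_sfinae`, `twice_concept` and the `Arithmetic` concept are made-up names; the concept half is guarded so the block still compiles where concepts aren't available:

```cpp
#include <cassert>
#include <type_traits>

// enable_if version: the requirement hides in a defaulted template parameter,
// and a non-arithmetic T is rejected via substitution failure.
template <typename T,
          typename = typename std::enable_if<std::is_arithmetic<T>::value>::type>
constexpr T twice_sfinae(T x) {
    return x + x;
}

#if defined(__cpp_concepts) && __cpp_concepts >= 201907L
// Concept version (C++20): the requirement has a name and appears
// directly in the template header.
template <typename T>
concept Arithmetic = std::is_arithmetic_v<T>;

template <Arithmetic T>
constexpr T twice_concept(T x) {
    return x + x;
}
#endif

inline void demo_constraints() {
    assert(twice_sfinae(21) == 42);
    assert(twice_sfinae(1.5) == 3.0);
    // twice_sfinae("no") does not compile with either spelling.
}
```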

dfdub

@Gąska said in NetBeans vs UTF-8:

It's not about error messages. It's about having a way to clearly state the requirements for the generic types of templates

@topspin's link addresses that as well.

Regarding modules: Have you read the follow-up post?

Are C++ Modules Dead-on-Arrival?

Gąska

@dfdub said in NetBeans vs UTF-8:

@Gąska said in NetBeans vs UTF-8:

It's not about error messages. It's about having a way to clearly state the requirements for the generic types of templates

@topspin's link addresses that as well.

"Addresses" is too much. It acknowledges their existence and asserts that the functionality is equivalent. It doesn't even try to actually compare the two. It completely ignores the single biggest problem of SFINAE: how batshit insane is the very premise that invalid code is sometimes okay, and how extremely hard it is to learn template metaprogramming because of the complicated rules of what is and isn't allowed to fail in a template and which parts of the template are and aren't being castrated off as a result, and how can the compiler be manipulated to castrate in just the right places so we can achieve conditional compilation.

Regarding modules: Have you read the follow-up post?

Are C++ Modules Dead-on-Arrival?

No. But did now. And it sounds like something they've only just started thinking about and won't be done until ~2024. I mean, it took concepts roughly 10 years from complete proposal paper to being merged into spec.
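For readers who haven't run into the SFINAE premise complained about above ("invalid code is sometimes okay"), a small sketch of the classic detection trick. `has_size` is a made-up trait. The expression `.size()` is ill-formed for int, but because that happens while substituting template arguments, the compiler quietly discards the specialization instead of reporting an error:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: used whenever the specialization below drops out.
template <typename T, typename = void>
struct has_size : std::false_type {};

// Specialization: only viable if `t.size()` is a valid expression for T.
// If it is not, substitution fails and this candidate is silently ignored.
template <typename T>
struct has_size<T, decltype(void(std::declval<const T&>().size()))>
    : std::true_type {};

static_assert(has_size<std::vector<int>>::value, "vector has size()");
static_assert(!has_size<int>::value, "int has no size(), yet this still compiles");

inline void demo_has_size() {
    assert(has_size<std::string>::value);
    assert(!has_size<double>::value);
}
```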

topspin

@Gąska said in NetBeans vs UTF-8:

It's not only about error messages.

It has been a major selling point, though, at least for library users.

It's about having a way to clearly state the requirements for the generic types of templates so you don't have to use billion enable_ifs and hope you've got it right and the compilation will be failing in the exact way you want it to fail (also known as SFINAE).

As ugly and annoying as all the enable_ifs are, from the examples I've seen concepts (which definitely sound nice in theory) seem even uglier. Also, I wonder if there'll be any kind of tooling to help debug these concepts, because that's definitely missing with the current state of TMP.
Assuming the goal was to make generic programming easier and more usable, I'm not sure they've achieved that.

Gąska

@topspin said in NetBeans vs UTF-8:

@Gąska said in NetBeans vs UTF-8:

It's not only about error messages.

It has been a major selling point, though, at least for library users.

Salesmen completely miss the point, news at 11.

You know what's the most important for me as a library user? To know HOW I can use the library. And concepts allow just that. I don't have to guess or rely on outdated docs what constraints are imposed on the type arguments. I can just see what concepts are required. For all the problems with concepts syntax, it's still infinitely better than what we have now (ie. nothing).

@topspin said in NetBeans vs UTF-8:

It's about having a way to clearly state the requirements for the generic types of templates so you don't have to use billion enable_ifs and hope you've got it right and the compilation will be failing in the exact way you want it to fail (also known as SFINAE).

As ugly and annoying as all the enable_ifs are, from the examples I've seen concepts (which definitely sound nice in theory) seem even uglier.

It's C++. Nobody who's still here cares in the least about syntax - everyone else ran away long time ago. And while concepts are basically its own DSL with completely different syntax from regular C++ - it's still way easier to read than all the enable_ifs.

Also, I wonder if there'll be any kind of tooling to help debug these concepts

Hahaha, good one.

topspin

@Gąska said in NetBeans vs UTF-8:

You know what's the most important for me as a library user? To know HOW I can use the library. And concepts allow just that. I don't have to guess or rely on outdated docs what constraints are imposed on the type arguments. I can just see what concepts are required. For all the problems with concepts syntax, it's still infinitely better than what we have now (ie. nothing).

Assuming the concept is complete (from what I've read constraining things to actually accept only compilable things will be hard) and you can understand it when you can't understand the ~~nothing~~ enable_ifs.

@topspin said in NetBeans vs UTF-8:

It's about having a way to clearly state the requirements for the generic types of templates so you don't have to use billion enable_ifs and hope you've got it right and the compilation will be failing in the exact way you want it to fail (also known as SFINAE).

As ugly and annoying as all the enable_ifs are, from the examples I've seen concepts (which definitely sound nice in theory) seem even uglier.

It's C++. Nobody who's still here cares in the least about syntax - everyone else ran away long time ago. And while concepts are basically its own DSL with completely different syntax from regular C++ - it's still way easier to read than all the enable_ifs.

That's the goal, yeah. I'm not sure it's true.
The idea is certainly sound. I'm very skeptical of the result.

ixvedeusi

@topspin said in NetBeans vs UTF-8:

I'm more interested in coroutines

:WANT.png:

Specifically, what I'd want is the equivalent of Python generators: functions which can "yield" control back to the caller and then the caller can tell them to continue from that point later on, in an efficient and, most importantly, deterministic manner.

Though what I'd want much more is compile-time introspection and metaclasses.

dkf

@ixvedeusi General coroutines are pretty awesome at cleaning up complicated callback-heavy code. They're also really quite tricky inside the compiler. (For example, their optimization passes are one of the less documented parts of LLVM. Documentation by presentation only linked from ancient Twitter posts... that doesn't really count.)

topspin

@dkf said in NetBeans vs UTF-8:

@ixvedeusi General coroutines are pretty awesome at cleaning up complicated callback-heavy code. They're also really quite tricky inside the compiler. (For example, their optimization passes are one of the less documented parts of LLVM. Documentation by presentation only linked from ancient Twitter posts... that doesn't really count.)

I wanted to link to a nice light-weight overview I read some time ago about how they're implemented in clang (not just the LLVM docs I'm not going to read) but I can't find that anymore. Instead, I found this:

boomzilla

@Gąska said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

@Gąska Yes, I can't wait for that one. Essentially copy-pasting files into each other is one of the biggest remaining issues with C++.

Well, it looks like headers aren't going away anytime soon...

C++ Modules Might Be Dead-on-Arrival

@topspin said in NetBeans vs UTF-8:

@Gąska said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

@TheCPUWizard That is very much true. C++17 compared to C++03 is almost a different language. A lot of the basic concepts like threading and filesystem you used to need a separate library or system-level API for are now a part of the standard library.

And C++20 will have modules and concepts. It's going to be as big a change as C++11.

I'm still unconvinced concepts aren't a big practical joke on how awful syntax they can propose before it being rejected. The claimed "results in better error messages" has also been disputed.

It's not about error messages.

He didn't say it was.

Gąska

@boomzilla said in NetBeans vs UTF-8:

@Gąska said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

@Gąska Yes, I can't wait for that one. Essentially copy-pasting files into each other is one of the biggest remaining issues with C++.

Well, it looks like headers aren't going away anytime soon...

C++ Modules Might Be Dead-on-Arrival

@topspin said in NetBeans vs UTF-8:

@Gąska said in NetBeans vs UTF-8:

@Deadfast said in NetBeans vs UTF-8:

@TheCPUWizard That is very much true. C++17 compared to C++03 is almost a different language. A lot of the basic concepts like threading and filesystem you used to need a separate library or system-level API for are now a part of the standard library.

And C++20 will have modules and concepts. It's going to be as big a change as C++11.

I'm still unconvinced concepts aren't a big practical joke on how awful syntax they can propose before it being rejected. The claimed "results in better error messages" has also been disputed.

It's not about error messages.

He didn't say it was.

I didn't say he did. ²

sockpuppet7

@topspin said in NetBeans vs UTF-8:

I assume it refers to the arcane rules that you need when stringizing

The worst compiler I need to support doesn't do stringizing, it can't be that arcane

dkf

@topspin said in NetBeans vs UTF-8:

@dkf said in NetBeans vs UTF-8:

@ixvedeusi General coroutines are pretty awesome at cleaning up complicated callback-heavy code. They're also really quite tricky inside the compiler. (For example, their optimization passes are one of the less documented parts of LLVM. Documentation by presentation only linked from ancient Twitter posts... that doesn't really count.)

I wanted to link to a nice light-weight overview I read some time ago about how they're implemented in clang (not just the LLVM docs I'm not going to read) but I can't find that anymore.

As an overview that link is OK, but that isn't the overview. That's all there is.

Coroutines are tricky because they need state management and a messy transformation to unwind the function state into a family of other functions that implement the various moves between yield points. The result is... complicated and this is an area where what is in Clang and LLVM is really not quite complete unless all you care about is supporting C++. As my interest in the area is in supporting a language with a deep coroutine model (coroutines there correspond to stacks, meaning that yields do not need to all be from the outermost function of the coro; this is an unusual model) the transformations to compile things in LLVM — even with very aggressive inlining — are not simple at all. The result is currently still research-grade compilation and is very difficult programming.

topspin

@dkf while the overview I can't find actually talked about C first before C++, I assume the LLVM implementation is aimed to support C++, which has stackless coroutines.

PleegWat

@levicki I assume, a coroutine which can only transfer control to the main thread (or a different coroutine) in its main function and not while it is executing a function call.

topspin

@levicki said in NetBeans vs UTF-8:

@topspin What the hell is a stackless coroutine?

No wait... probably better if I don't know. I am pretty sure I will never need that.

A stackless coroutine means the compiler transforms your function into a state machine (for yield/continuation points) and only saves the state of local variables into an object (like a closure). Compare that to a stackful coroutine, which is almost as expensive as an OS thread because, while manually scheduled, it needs a full stack which is typically at least 1MB.
If I’m not mistaken it’s based on the async/await model.
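Roughly what that transformation amounts to, written out by hand. This is only a sketch of the idea, not what any particular compiler emits, and `CountUpTo` is a made-up class standing in for a coroutine whose body would be `for (int i = 0; i < limit; ++i) yield i;`:

```cpp
#include <cassert>

class CountUpTo {
public:
    explicit CountUpTo(int limit) : limit_(limit), i_(0), state_(State::Start) {}

    // Resumes the "coroutine". Returns true and stores the yielded value in
    // out, or returns false once the body has run to completion.
    bool next(int& out) {
        switch (state_) {
        case State::Start:
            i_ = 0;                 // loop initialisation
            state_ = State::Looping;
            // falls through
        case State::Looping:
            if (i_ < limit_) {
                out = i_++;         // the yield point: hand back a value...
                return true;        // ...and return to the caller
            }
            state_ = State::Done;
            // falls through
        case State::Done:
            break;
        }
        return false;
    }

private:
    enum class State { Start, Looping, Done };
    int limit_;    // parameter, saved in the object
    int i_;        // local variable, saved in the object instead of on a stack
    State state_;  // which resume point to continue from
};

inline int sum_of_yields(int limit) {
    CountUpTo gen(limit);
    int total = 0;
    int value = 0;
    while (gen.next(value)) total += value;
    return total;
}
```

The whole "frame" is three ints' worth of object, which is the contrast with reserving a full stack.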

dkf

@topspin said in NetBeans vs UTF-8:

it needs a full stack which is typically at least 1MB.

That depends very much on what you can prove. If the call graph in the coroutine is known, it may well be possible to determine that a much smaller stack is needed. Making this sort of thing work well is one of the frontiers of modern optimizing compilers.

But then, another such frontier is producing error messages that actually help.

Gąska

@dkf I've read some whitepaper about coroutines recently. The conclusion was, threads on modern systems are so fast that stackful coroutines (aka. green threads) don't make much sense anymore, and the need to allocate stack is just one of the problems that don't apply to stackless coroutines (aka. async functions). And the only downside of a stackless coroutine is that you can't yield from an inner call, which is rarely a problem because a function calling an async function is almost always an async function itself, so you just chain awaits all the way down.

topspin

@dkf said in NetBeans vs UTF-8:

But then, another such frontier is producing error messages that actually help.

ETA: right after we've solved the Halting Problem.

dkf

@topspin Not exactly. The problem is turning the proofs of problems into something comprehensible by humans, and the issue there is that humans and computers understand programs very differently. Computers are quite good at path analysis and never lose track of detail (which can overwhelm them in some cases) whereas humans are much better at the intentions of (the author of) the code. The level of dissonance between these approaches is profound.

topspin

@dkf

Gąska

@dkf takeaway: computers would work so much better if they didn't have to cater to humans.

I'm pretty sure over 90% of computational power in the world is spent on producing text.