The abhorrent 🔥 rites of C


  • Winner of the 2016 Presidential Election

    Imagine something like Hadoop: A framework for crunching large amounts of data efficiently. If you write something like that in Java, the garbage collector will become a huge performance problem and you need to avoid GC as much as possible.

    @DogsB said:

    I imagine anything that necessitates those libraries is great :wtf: in the making.

    Trust me: In this case, you're wrong.

    @DogsB said:

    You really should share those stories.

    Don't worry: If something can be anonymized easily and is a :wtf:, I will share it.


  • Notification Spam Recipient

    @asdf said:

    @DogsB said:
    I imagine anything that necessitates those libraries is great :wtf: in the making.

    Trust me: In this case, you're wrong.

    @DogsB said:

    You really should share those stories.

    Don't worry: If something can be anonymized easily and is a :wtf:, I will share it.


    Ohhhh what will I do with you. You're such a tease. You always drop hints but never tell. I hope one day you do a thread a la @Arantor but with less misery and a happy ending. :)


  • Winner of the 2016 Presidential Election

    @DogsB said:

    I hope one day you do a thread a la @Arantor but with less misery and a happy ending.

    Maybe, who knows? ;)



  • @Captain said:

    C++ looks pretty easy. Why do so many people complain about it?

    Because we all experienced it in the '90s.

    I still say it doesn't look easy, but it looks infinitely better than what literally everybody my age thinks of when they hear the name "C++". You know, the Borland Builder disaster we all learned in college.


  • area_pol

    @blakeyrat said:

    You know, the Borland Builder disaster we all learned in college.

    That abomination was a failure of epic proportions. Fortunately, those dark ages are over. Now even Microsoft got their shit right and VS is a decent C++ compiler. Although I don't know if they fixed their broken template handling.



  • @LaoC said:

    Most C code I have to deal with scales way better than the Java stuff.

    Well, either you have pristine C code, or shitty Java code.

    Seriously...

    I have some code from before the STL. OMG, what they did with memory! They didn't like releasing memory, so they kept the pointers and the block sizes in a dump field, and if they ever needed memory and there was a block in the field that fit, they reused it.

    Of course, everything is public and static, so when they gave you a pointer, the method had hazard warnings insisting you were given free access to the vault and don't fuck it up.

    @LaoC said:

    It's so simple in C# that Microsoft had to say this about the upgrade to .NET 2.0:

    It took them a bit to improve, burn them with fire.

    Yeah, I get why you are a socialist. C# has problems, so let's use C...


  • Winner of the 2016 Presidential Election

    @NeighborhoodButcher said:

    Although I don't know if they fixed their broken template handling.

    I think so, they fixed a lot of old bugs in VS2013 and VS2015. For example, value initialization now finally works as defined in the standard.
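
    For anyone unsure what value initialization is supposed to guarantee, here is a minimal sketch. The `Plain` type and the function names are made up for illustration, and the thread does not say which specific cases older Visual Studio versions got wrong, so treat this as a reminder of the rule rather than a reproduction of the bug.

```cpp
#include <cassert>

// Hypothetical class type with no user-provided constructor.
struct Plain {
    int x;
    double y;
};

// "T()" value-initializes: for a type like Plain every member is zeroed.
int value_initialized_member() {
    Plain p = Plain();
    return p.x;
}

// The same rule applies to scalars created with "new T()".
int value_initialized_heap_int() {
    int* p = new int();  // value-initialized, so *p is 0
    int v = *p;
    delete p;
    return v;
}

// Small self-check that can be called from main().
void check_value_initialization() {
    assert(value_initialized_member() == 0);
    assert(value_initialized_heap_int() == 0);
}
```

    By contrast, a plain `Plain p;` at block scope leaves the members indeterminate, which is why the distinction matters.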


  • area_pol

    Nope - still broken:

    #include <vector>
    using namespace std;
    
    template<class T>
    struct A : vector<T>
    {
        using smth = value_type;
    };
    
    int main()
    {
        // your code goes here
        return 0;
    }
    

    Compiles just fine on VS2015.



  • I'd still never use C++ over C# for literally anything.

    They really should have at some point, like C++11, renamed the language. Call it CPopYo or something. Because the other problem is right now, if you just search for C++ examples, you get all the shitty ones from 10 years ago.


  • Winner of the 2016 Presidential Election

    You can compile that with clang as well (-fms-compatibility), which will produce the following warning:

    warning: use of identifier 'value_type' found via unqualified lookup into dependent bases of class templates is a Microsoft extension [-Wmicrosoft]
    

  • area_pol

    Yeah, they really messed up two-phase name lookup in VS. They treat templates not much differently from macros.
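
    For comparison, a sketch of how the snippet above has to be written for a compiler that implements two-phase lookup. `value_type` lives in a dependent base class, so an unqualified use is not found when the template is defined; it has to be qualified and marked with `typename`. The `check_two_phase_lookup` helper is only there for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Standard-conforming version of the example: the name from the dependent
// base std::vector<T> is qualified, and "typename" says it names a type.
template <class T>
struct A : std::vector<T> {
    using smth = typename std::vector<T>::value_type;
};

// Checked at compile time: smth is the element type of the base vector.
static_assert(std::is_same<A<int>::smth, int>::value,
              "smth should name the element type");

// Run-time variant of the same check, callable from main().
void check_two_phase_lookup() {
    assert((std::is_same<A<double>::smth, double>::value));
}
```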



  • @asdf said:

    shared_ptr already makes memory management pretty easy, and the memory is freed immediately when the last reference goes out of scope

    Reference counting is fairly primitive and tricky to get right in non-trivial software (destruction becomes more or less non-deterministic, any reference loop will create a leak). Other forms of GC are significantly easier to use. My advice for shared_ptr is always "don't, unless you really really know that you need it". You should be fine with unique_ptr most of the time.

    @NeighborhoodButcher said:

    Now even Microsoft got their shit right and VS is a decent C++ compiler.

    Haha, no.


  • Winner of the 2016 Presidential Election

    @CatPlusPlus said:

    My advice for shared_ptr is always "don't, unless you really really know that you need it". You should be fine with unique_ptr most of the time.

    You're preaching to the choir.


  • Discourse touched me in a no-no place

    @CatPlusPlus said:

    Reference counting is fairly primitive and tricky to get right in non-trivial software (destruction becomes more or less non-deterministic, any reference loop will create a leak).

    Refcounting works fine if you can guarantee that there are no recursive structures (or at least not of reference ownership; weak references/pointers don't matter). If you've got recursive structures, you're in a world of pain.
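
    A minimal sketch of the reference-loop problem and of why weak pointers don't count. `StrongNode`, `WeakNode` and the helper function are invented for this example; the helper breaks the loop by hand afterwards only so that the demo itself doesn't leak.

```cpp
#include <cassert>
#include <memory>

struct StrongNode {
    std::shared_ptr<StrongNode> other;  // owning reference: can form a loop
};

struct WeakNode {
    std::weak_ptr<WeakNode> other;      // non-owning reference: cannot
};

// Builds a two-node loop, drops the only outside owners, and reports whether
// the nodes are still alive afterwards (i.e. whether they would have leaked).
template <class Node>
bool cycle_outlives_its_owners() {
    std::weak_ptr<Node> probe;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->other = b;
        b->other = a;
        probe = a;
    }  // a and b go out of scope here
    bool still_alive = !probe.expired();
    if (auto p = probe.lock()) {
        p->other.reset();  // break the loop manually so the demo cleans up
    }
    return still_alive;
}

// Callable from main(): the owning loop survives, the weak one does not.
void check_reference_loops() {
    assert(cycle_outlives_its_owners<StrongNode>());
    assert(!cycle_outlives_its_owners<WeakNode>());
}
```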

    @CatPlusPlus said:

    Other forms of GC are significantly easier to use.

    But usually have the downsides that they make memory consumption much higher and the time of release more uncertain.


  • area_pol

    @CatPlusPlus said:

    Haha, no.

    They support C++14 and have full (I think) standard library support done. I say that's quite decent, despite their template fiasco. The only compiler I can think of with better support is clang. gcc has been shit for quite some time now and libstdc++ is a joke.


  • Discourse touched me in a no-no place

    @NeighborhoodButcher said:

    The only compiler I can think of with better support is clang.

    I think I ran across something which clang didn't support a few months back. I was like “wow, they've not done that yet?” but I don't remember exactly what it was. Had an easy workaround for what I was actually doing so I didn't put effort into remembering the details. 😉


  • area_pol

    @dkf said:

    I think I ran across something which clang didn't support a few months back.

    I remember using their exception_ptr implementation way back. It built just fine; then I ran it and got "abort: exception_ptr is not yet implemented".



  • There's a Minecraft map renderer application floating around (I forget the name, can look it up later if you insist) that would almost always throw some out-of-memory error if you ran it over anything other than a tiny chunk of a Minecraft world, up until some version of the JVM where presumably the GC was improved. Also Minecraft generally runs like shit on everything, partially because it triggers the GC several times a frame. Also, some C# games have very noticeable GC-related stuttering. Magicka is the most obvious example, but like the Minecraft devs, they're not exactly the most competent programmers ever.

    @dkf said:

    The performance problems have been largely sorted out for the past 10 years or so, but in the early days Java used a stop-the-world GC algorithm and that really did have performance issues in practice.

    Isn't it still a mark-and-sweep generational thing, which has to stop the world? I'm having trouble coming up with a GC algorithm that doesn't have to stop the world and isn't reference counting + deterministic destruction.


  • Discourse touched me in a no-no place

    @jmp said:

    Isn't it still a mark-and-sweep generational thing, which has to stop the world?

    There's a multi-stage generational collector. When the youngest generation gets full, only that is collected (which is done in parallel I think) and anything that survives transfers to the older generations. That works pretty well I believe, as most objects in Java have a rather short lifespan. I'm not sure how many layers of GC there are these days (since this has changed a bit over the years) but the full stop-the-world GC algorithm is only run when the alternative is entirely running out of memory.

    I'm definitely not an expert on GC. I've seen some of the stuff that compilers and JIT engines do to support GC, and it's both scary and really complicated. I'd love to find explanations of how it all works in LLVM. Also the exception handling. That's just as bad. (It's also similarly badly documented; the authors of both parts assume that you already know exactly what they're doing, so the system docs don't provide much helpful info if “copy our C++ implementation” and “copy our Objective-C implementation” are distasteful…)


  • Winner of the 2016 Presidential Election

    @dkf said:

    I'm not sure how many layers of GC there are these days (since this has changed a bit over the years)

    The current garbage collector in the HotSpot VM has 2 generations.

    Some more information: https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/toc.html

    @dkf said:

    but the full stop-the-world GC algorithm is only run when the alternative is entirely running out of memory

    Well, you'll still have short pauses when the GC runs.


  • Discourse touched me in a no-no place

    @asdf said:

    Well, you'll still have short pauses when the GC runs.

    Not necessarily. This stuff gets really complicated.



  • @dkf said:

    (which is done in parallel I think)

    It has [url=https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/collectors.html#sthref27]several GC algorithms[/url], but they all have to pause the application at some point, regardless of generation being collected.

    Edit: "parallel" there refers to collection process being multi-threaded, not pauselessness.



  • If you're talking about Java, Java 7 supposedly had five generations (3 normal, 2 specialty) and Java 8 dropped the permanent generation, so presumably four (3 normal, 1 specialty) these days.



  • @jmp said:

    I'm having trouble coming up with a GC algorithm that doesn't have to stop the world and isn't reference counting + deterministic destruction.

    Baker's Treadmill fits that. http://www.pipeline.com/~hbaker1/NoMotionGC.html



  • I'll have to go over that at home when I've got the time to figure out what the hell they're saying.

    Thus, it would appear that if the allocation problem of large immobile objects of different sizes could be solved, then our in-place real-time variant would be an attractive way to collect garbage.

    is a pretty amusing way for it to end, though. Of course, they're talking about real-time systems; for things that don't have to be real-time it's much less of a concern, and allocators have gotten better since 1991.



  • @jmp said:

    I'm having trouble coming up with a GC algorithm that doesn't have to stop the world and isn't reference counting + deterministic destruction.

    Well I guess static code analysis could possibly reduce the number of "world" collections if coupled with a "thread-local" heap & GC that would be used for objects that the analysis has determined never to leave the thread that created them. It would allow for thread-specific collections that do not stop the other threads...



  • @jmp said:

    There's a Minecraft map renderer application floating around (I forget the name, can look it up later if you insist) that would almost always throw some out-of-memory error if you ran it over anything other than a tiny chunk of a Minecraft world, up until some version of the JVM where presumably the GC was improved.

    Was it fixed by Java 7? If so, it wouldn't surprise me if it was a perm-gen error caused by interning too many strings.

    @jmp said:

    some C# games have very noticeable GC-related stuttering. Magicka is the most obvious example, but like the Minecraft devs, they're not exactly the most competent programmers ever.

    I don't remember magicka having any GC-related stuttering, but I do remember it having the shittiest netcode that ever existed, so it probably got massively overshadowed by that.
    It would try and send huge amounts of data across the network (From memory it was hundreds of kb/s), resulting in players teleporting about like crazy and falling through floors.


  • Discourse touched me in a no-no place

    @powerlord said:

    Java 8 dropped the permanent generation

    That was a good thing. It made unloading code work much better, as previously everything that hit the JIT went to the PermGen. Which really sucked in application servers…


  • Discourse touched me in a no-no place

    @Salamander said:

    interning

    Just Say No.

    It turns out to be a poor technique (with experience across many systems) and there are other ways to accelerate comparisons that don't come with a built-in hard-to-handle memory leak. I guess I can live with interning compile-time constants a bit. But that ought to be the compiler in charge of that, not the programmer. Programmers shouldn't intern anything.



  • Supposedly they fixed the memory leak issue in 7 by moving the intern cache into the heap instead of perm gen, but I've never really seen a reason for manually calling it.
    I either already know what string I am dealing with because I compared it to a literal earlier, or I don't know what it is and storing unknown values in application-wide cache just sounds like a terrible idea.


  • Discourse touched me in a no-no place

    @Salamander said:

    I've never really seen a reason for manually calling it.

    Every time I see someone using it, I see someone who is deliberately adding a memory leak. Really. Just Don't.
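
    To make the "built-in memory leak" concrete, here is a rough sketch of an intern pool. It is written in C++ like the other examples in this thread and is only loosely modelled on the idea behind Java's `String.intern()`; it is not how any JVM implements it. `InternPool` and `pool_size_after` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical process-wide intern pool.
class InternPool {
public:
    // Returns a canonical pointer for the text. Equal strings yield the same
    // pointer, so later equality checks can be plain pointer comparisons.
    const std::string* intern(const std::string& text) {
        return &*pool_.insert(text).first;
    }

    std::size_t size() const { return pool_.size(); }

private:
    // Nothing is ever erased: every distinct string ever interned stays here.
    std::unordered_set<std::string> pool_;
};

// Interns the given number of distinct strings and reports the pool size.
std::size_t pool_size_after(int distinct_strings) {
    InternPool pool;
    for (int i = 0; i < distinct_strings; ++i) {
        pool.intern("request-" + std::to_string(i));
    }
    return pool.size();
}

// Callable from main().
void check_intern_pool() {
    InternPool pool;
    assert(pool.intern("a") == pool.intern(std::string("a")));
    assert(pool.intern("a") != pool.intern("b"));
    assert(pool_size_after(1000) == 1000);
}
```

    The comparison speed-up is real, but the pool grows with every distinct input, which is the problem with interning values you don't control.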



  • @dkf said:

    @powerlord said:
    Java 8 dropped the permanent generation

    That was a good thing. It made unloading code work much better, as previously everything that hit the JIT went to the PermGen. Which really sucked in application servers…

    I'm aware of the problems with permgen. Heck, we had to up the permgen space on our app servers after running into PermGen space issues with it. Hell, Oracle App Server 10 would hit PermGen space issues just with redeploying the application on our development server.


  • Discourse touched me in a no-no place

    Yep. I hit that sort of thing quite a few times. As far as I can tell, the major cause of the trouble is some sort of interaction between the application server's thread pools and the use of thread-local storage internally in a number of standard libraries. Unfortunately, I've not yet had a chance to confirm (or deny) that that's actually the case. There were a lot of interacting pieces (at least one of which was made of :wtf: and Fail) which made upgrading that particular application rather difficult…



  • Okay so having thought about it, ignore the treadmill optimization, that's not how it avoids stopping-the-world - the key trick is hooking all reads to pointers so that the marking thread can mark the pointer before it's read. This is constant-time, because you only ever need to mark one thing and you know exactly where it is. Unfortunately, 'stop the world' is not exactly well-defined, but it doesn't have to pause all threads and it doesn't have to pause them for time proportional to the amount of garbage, so it'd have to count under any sensible definition.

    I'm not sure making all reads to a reference require going through a lock is a good tradeoff, though. 😛

    @Salamander said:

    Was it fixed by Java 7? If so, it wouldn't surprise me if it was a perm-gen error caused by interning too many strings.

    Program is Tectonicus, IIRC it was resolved by Java 7, here's the github issue. I think there was a second memory issue as well. I do vaguely recall something about permgen being relevant.

    @Salamander said:

    I don't remember magicka having any GC-related stuttering, but I do remember it having the shittiest netcode that ever existed, so it probably got massively overshadowed by that. It would try and send huge amounts of data across the network (From memory it was hundreds of kb/s), resulting in players teleporting about like crazy and falling through floors.

    Also crashes. Definitely stuttered, though.


  • Discourse touched me in a no-no place

    @jmp said:

    Okay so having thought about it, ignore the treadmill optimization, that's not how it avoids stopping-the-world - the key trick is hooking all reads to pointers so that the marking thread can mark the pointer before it's read.

    I think this is approximately how compilers handle the youngest generation of objects, as they can exactly define when they are holding onto references to objects and when they release them. JIT engines can do particularly well at this, as they can see the entire concrete class graph. I've observed the support in LLVM for all this sort of thing, but it's rather less well documented than I want (the docs tell me what the relevant intrinsics do, but not why this is significant, and I've yet to find a comprehensible paper on the topic; I haven't been looking very hard to be fair).

    The upshot is that the majority of collections are of the youngest generation, and that's very quick. (I can't remember if it is per-thread; if it is, it will be particularly easy to find a time to do it.) I think the cost is actually linear in the amount of non-garbage remaining that has to transfer to the older generation pools; those don't need collecting nearly so often in most programs.



  • @jmp said:

    Program is Tectonicus, IIRC it was resolved by Java 7, here's the github issue. I think there was a second memory issue as well. I do vaguely recall something about permgen being relevant.

    Following that exception message around, it turns out there's a buffer implementation (ByteBuffer.allocateDirect) in the standard Java library that uses native memory rather than the heap, and that memory can only be freed once the garbage collector reclaims the buffer object.
    AKA non-deterministic cleanup of native resources.
    WTF, Java?



  • @dkf said:

    I think the cost is actually linear in the amount of non-garbage remaining that has to transfer to the older generation pools; those don't need collecting nearly so often in most programs.

    Something far too many people ignore...and something that is so easy to prove by empirical measurement.



  • @jmp said:

    Magicka is the most obvious example, but like the Minecraft devs, they're not exactly the most competent programmers ever.
    SPACE



  • @powerlord said:

    @Mason_Wheeler said:
    Counterexample: The for loop. As everyone knows, a for loop is a reduced-boilerplate special-case of a while loop that is common enough to have its own keyword and its own semantics.

    For that matter, while is a special-case of if and goto.

    Or if you want to go to a lower level, a CMP followed by a JE.

    I once (1990-1992) worked with a C compiler that had a somewhat feeble optimiser.

    The target was an embedded platform without a file system, running on an 8088. The system included a fixed set of what we'd today call "threads", and each one featured, reasonably enough, an infinite loop at its heart.
    [code] while ( 1 )
    {
    function_call();
    call_function();
    maybe_just_gosub_something();
    }[/code]

    With the compiler set to its "most aggressive" optimisation settings, the assembler output looked loosely like this:
    [code] jmp bottom
    top:
    call _function_call
    call _call_function
    call _maybe_just_gosub_something
    bottom:
    mov ax,1
    cmp ax,0
    jne top[/code]

    Question: How do you tell Dicksucks to make part of a CODE block bold?

    EDIT: Forgot to mention: The version of the compiler that we had was more than five years old. A telephone call to the company that produced it revealed that (a) they were still selling it and (b) it was the latest version. We switched to a different compiler.
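
    The claim quoted at the top of this post (for is a special case of while, which is a special case of if and goto) can be sketched with three versions of the same loop. The function names are made up; a decent compiler would be expected to produce much the same code for all three.

```cpp
#include <cassert>

// Sum of 0..n-1 written with a for loop.
int sum_for(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// The same loop with the for header spelled out as a while loop.
int sum_while(int n) {
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i;
        ++i;
    }
    return total;
}

// The while loop spelled out as a test plus a backwards jump.
int sum_goto(int n) {
    int total = 0;
    int i = 0;
top:
    if (i < n) {
        total += i;
        ++i;
        goto top;
    }
    return total;
}

// Callable from main(): all three forms agree.
void check_loop_forms() {
    assert(sum_for(10) == 45);
    assert(sum_while(10) == 45);
    assert(sum_goto(10) == 45);
}
```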


  • Considered Harmful

    @Steve_The_Cynic said:

    I once (1990-1992) worked with a C compiler that had a somewhat feeble optimiser.
    [...]

    [code] jmp bottom
    mov ax,1
    cmp ax,0
    jne top[/code]
    SAS/C on the Amiga definitely optimized shit like that away at the time. It must have been around that time that I understood the difference between a global and a peephole optimizer and how the latter is basically just stupid pattern matching but good enough for this kind of inefficiency.

    Actually I got bitten by it in my very first steps in C, trying to do what I'd been doing in assembler before, namely busy-looping on a mouse button register. I wrote
    [code]while((1<<6) & *((short *)0xbfe001)) /* wait */ ;[/code]
    And the compiler turned it into
    [code] move.w $00bfe001,d0
    foo:
    btst #6,d0
    bne.s foo[/code]



  • Ah, forgot volatile? Had the same problem too back as a student, trying to busy-wait on a variable assigned in an interrupt handler.


  • Considered Harmful

    @Medinoc said:

    Ah, forgot volatile? Had the same problem too back as a student, trying to busy-wait on a variable assigned in an interrupt handler.

    Yup, volatile fixed the code and my opinion of the intelligence built into compilers.



  • The compiler was plenty intelligent - it observed that you were repeatedly reading from the same pointer and that nothing was writing to that pointer, and therefore the value would never change.

    How was it to know it was a hardware interrupt register that could change under it? Telling the compiler that is your job.
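
    A portable stand-in for the situation, as a sketch: a flag written by a signal handler takes the place of the hardware register or interrupt handler, since a real memory-mapped register can't be demonstrated portably. All names here are invented. Without `volatile`, the compiler would be allowed to read the flag once and spin forever.

```cpp
#include <cassert>
#include <csignal>

// Written from the signal handler, read from the main flow. "volatile" forces
// a fresh read on every loop iteration.
volatile std::sig_atomic_t g_button_pressed = 0;

extern "C" void on_signal(int) {
    g_button_pressed = 1;
}

// Busy-waits until the handler has set the flag; returns the number of polls.
long wait_for_button() {
    long polls = 0;
    while (!g_button_pressed) {
        ++polls;
    }
    return polls;
}

// Installs the handler, simulates the "interrupt" with std::raise, waits for
// the flag, and returns its final value.
int run_volatile_demo() {
    g_button_pressed = 0;
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);  // the handler runs before raise() returns
    wait_for_button();
    std::signal(SIGINT, SIG_DFL);
    return g_button_pressed;
}

// Callable from main().
void check_volatile_demo() {
    assert(run_volatile_demo() == 1);
}
```

    Note that `volatile` only tells the compiler not to cache the read; for communication between threads you would want `std::atomic` instead.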


  • Considered Harmful

    @jmp said:

    The compiler was plenty intelligent

    Sure, I meant it changed my opinion in a positive way.



  • @Medinoc said:

    @NeighborhoodButcher said:
    What’s worse than C is the mentality in many (most?) C programmers. They think code should be clever, they think the more bizarre way of doing things, the better.

    I admit I used to be like that, back in my learning years. I know the old obsession with cleverness, matched only by the obsession with speed, in that "penny-wise, pound-foolish" way that believes concise is fast and makes code unreadable for a negligible performance increase, if not worse performance.

    A simple like doesn't suffice. You are me and I believe you are also a lot of people in the field.



  • @tufty said:

    if you can think of an optimisation, STALIN can do it

    Like collectivisation!

    No, I have nothing else to add to this topic and TBH I'm not really sure why I bothered catching up on it.



  • @NeighborhoodButcher said:

    Tizen

    after you guys chose enlightenment, and reading that great topic of yours, I don't feel like taking Tizen's choices on tooling as a good idea



  • Another thing I find weird about C is how people always think arrays are pointers... and in the case of function parameters, they're right:

    /*
    Array's actual type test (Visual C++ version)
    Test result: The same in both compilers:
        1) The parameter array is of type "pointer to int",
        and attempting to assign it to an array pointer yields a warning.
        2) The local array is of type "array of 20 ints",
        and attempting to assign it to an int** yields a warning.
    */
    void TestArrayC(int arrParam[20])
    {
        int localArr[20];
    
        int **pp;
        int (*pArr)[20];
    
        pp = &arrParam;
        /*pArr = &arrParam;*/ /* warning C4047: '=' : 'int (*)[20]' differs in levels of indirection from 'int **' */
    
        /*pp = &localArr;*/ /* warning C4047: '=' : 'int **' differs in levels of indirection from 'int (*)[20]' */
        pArr = &localArr;
    
        (void)pp;
        (void)pArr;
        (void)arrParam;
    }
    

    PS: How do I discospecify a language for my code samples? Found it, only had to use C++ highlight because there's no C highlight.

    Edit: Wow, discoindent... It somehow fixed itself.
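
    The same test can be phrased as type checks instead of compiler warnings. This sketch is C++ rather than C (the parameter adjustment is the same in both languages) and the function names are invented. It also shows one C++-only way to keep the array extent: take the array by reference.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Despite the [20], the parameter's declared type is adjusted to "int*".
bool param_is_pointer(int arrParam[20]) {
    (void)arrParam;
    return std::is_same<decltype(arrParam), int*>::value;
}

// A local array really has type "array of 20 ints".
bool local_is_array() {
    int localArr[20] = {};
    (void)localArr;
    return std::is_same<decltype(localArr), int[20]>::value;
}

// A reference-to-array parameter is not adjusted, so the extent survives.
std::size_t elements_via_reference(int (&arr)[20]) {
    return sizeof(arr) / sizeof(arr[0]);
}

// Callable from main().
void check_array_types() {
    int data[20] = {};
    assert(param_is_pointer(data));
    assert(local_is_array());
    assert(elements_via_reference(data) == 20);
}
```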


  • Discourse touched me in a no-no place

    @Medinoc said:

    It somehow fixed itself.

    Discofixed: the new definition!


  • Discourse touched me in a no-no place

    @dkf said in The abhorrent 🔥 rites of C:

    The major problems with C++ as an ecosystem are:

    • The language has changed quite a bit recently, which can lead to problems with the language runtime (which is quite a bit thicker than for C) not matching up with the version of the language understood. This shouldn't happen, but it does and it is an entirely mysterious failure mode when it occurs.
    • The language is less encouraging of a stable ABI, since use of inlined functions/member-methods and templates really binds the particular version of the library into the build of the consuming code. Exceptions might also have an impact here; that's not something I've studied in great depth.

    Also, C++ is very very good at setting out hidden pitfalls. Especially when bool is invited to the party (or just allowed to gatecrash it like a boorish senior politician in a sorority house).
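
    One well-known bool pitfall, as a sketch; it is not necessarily the one meant above. With the hypothetical overload set below, passing a string literal picks the `bool` overload, because pointer-to-bool is a standard conversion and beats the user-defined conversion to `std::string`.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set: one overload takes a flag, the other takes text.
const char* describe(bool) { return "bool"; }
const char* describe(const std::string&) { return "string"; }

// "hello" is a const char[6]; it decays to a pointer, which converts to bool.
const char* literal_picks() {
    return describe("hello");
}

// Callable from main().
void check_bool_overload() {
    assert(std::string(literal_picks()) == "bool");
    assert(std::string(describe(std::string("hello"))) == "string");
}
```

    Adding a `const char*` overload, or passing an explicit `std::string`, avoids the surprise.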

