Being smart about pointers - the Korean way


  • area_pol

    If you are a C++ programmer (and are not terrible), you know owning raw pointers are bad. You shouldn't use them for a variety of reasons. Allocating memory and assigning it to a raw pointer is just bad, period. At some point that knowledge reached a certain Korean corporation, and some wise programmer decided to test this rule in action. If only he knew why raw pointers are bad. Behold the never-use-a-raw-pointer implementation:

    typedef HashMap<String, String> Alpha2ToAlpha3Map;
    
    static const Alpha2ToAlpha3Map& createAlpha2ToAlpha3Map()
    {
        Alpha2ToAlpha3Map& map = *new Alpha2ToAlpha3Map;
    
        // fill the map
    
        return map;
    }
    
    static String alpha3ForAlpha2(String aplha2)
    {
        static const Alpha2ToAlpha3Map& map = createAlpha2ToAlpha3Map();
        return aplha2.isEmpty() ? String() : map.get(aplha2);
    }
    

    For those non-C++ people:

    Alpha2ToAlpha3Map& map = *new Alpha2ToAlpha3Map;
    

    This allocates memory and returns a raw pointer, which is immediately dereferenced and assigned to a reference. That clever way the programmer avoided a raw pointer, along with the means of actually freeing the memory. All hail modern programming techniques - coming to TVs and mobile devices around you!



  • It's been a while since I C++-ed. What are raw pointers and what's bad about them?

    Was he supposed to do this:

    Alpha2ToAlpha3Map *map = new Alpha2ToAlpha3Map;
    //...
    return map; // let someone else worry about freeing this memory
    

    Is this bad?


  • FoxDev

    @cartman82 Using raw pointers is just asking for memory leaks



  • It's been awhile for me too. Just curious from all the C++ folks...

    What happens if the parameter (String alpha2) is null? When it hits the alpha2.isEmpty() method.


  • ♿ (Parody)

    @cartman82 said in Being smart about pointers - the Korean way:

    Was he supposed to do this:

    Lucky people with modern C++ compilers can use stuff like unique_ptr or whatever.



  • @NeighborhoodButcher

    While I'm not going to defend that, I think there are a few things that make this not completely terrible.

    The function createAlpha2ToAlpha3Map() is static, so it can only be called from within this translation unit. It's not visible to anyone else. So the damage is limited.

    The one callsite you show places the return into a local static const reference. This means the map is going to live from the point of call to the end of the program, and the cleanup will be done by the OS when it tears down the process. For a map, that's just cleaning memory, so a call to the destructor is redundant.

    It's still awful because if the program throws when adding stuff to the map, the map's memory is leaked (since it wasn't yet captured by the static reference in the caller). Which is why you should capture the result of new in a smart pointer in the first place.

    @cartman82 said in Being smart about pointers - the Korean way:

    What are raw pointers and what's bad about them?

    Raw pointers are just normal pointers. They're called raw to distinguish them from smart pointers, which are a RAII wrapper around a raw pointer that will delete the pointer if it's not null when you exit its scope. Which means that you can write exception safe code without adding try-catch clauses all over the place.

    @cartman82 said in Being smart about pointers - the Korean way:

    Is this bad?

    Yes. What you write and what he did are equivalent. He simply dereferenced the pointer returned by new, which means he can use the dot operator instead of the arrow to access members. What he should have done is:

    static std::unique_ptr<const Alpha2ToAlpha3Map> createAlpha2ToAlpha3Map()
    {
        auto map = std::make_unique<Alpha2ToAlpha3Map>();
    
        // fill the map
    
        return map;
    }
    

    @c62 said in Being smart about pointers - the Korean way:

    What happens if the parameter (String alpha2) is null? When it hits the alpha2.isEmpty() method.

    alpha2 itself can't be null in that sense. The parameter is being passed by copy, so alpha2 is a new object local to the function. C++ lets you pass by copy, by reference or by pointer. Of these, only passing by pointer allows a null argument.


  • area_pol

    @cartman82 If you want a long answer and don't mind my terrible accent, I can link you to my "modern memory management" tutorial on YT. Short story: raw pointers are just asking for leaks and double-free's.



  • Wait, is createAlpha2ToAlpha3Map called from anywhere else in the program? Because if it isn't, this shouldn't cause any leaks unless the DLL gets unloaded and reloaded.



  • @ben_lubar said in Being smart about pointers - the Korean way:

    Because if it isn't, this shouldn't cause any leaks unless the DLL gets unloaded and reloaded.

    Notice the //fill the map comment. That is going to allocate memory. If any of those fail, you leak the map.


  • area_pol

    @Kian Damag isn't exactly limited, because it leaks on every invocation. It doesn't matter that the function is in a single translation unit, since it can be called a crapton of times from wherever by any externally visible API. Relying on the system to clean up the memory is really a bad practice (although a certain programmer here once tried to convince me otherwise; he moved to Python). You'll just leak like crazy. Not calling the destructor is just icing on the cake - if this involved some external resources which should be freed, we'd have hell. Also note that the example call is using a static reference, but other callers might not, so it may leak until you drown.


  • area_pol

    @ben_lubar it will leak simply because the call to new, thus allocating memory. Not to mention inserting elements into it, which take even more memory.



  • @NeighborhoodButcher What do you mean "it can be called a crapton of times"? I agree that an owning raw pointer is no different from an owning reference. Actually, the reference is even worse, since no one would ever think of trying to delete a reference. So the pattern is just asking for leaks.

    But in this particular example, a hundred calls to alpha3ForAlpha2 will only ever call createAlpha2ToAlpha3Map once. The initialization of function static variables is only done once per run, on the first call to the function. So the code you've shown will only create the map once, and won't leak it as long as it doesn't throw. And now I'm wondering what would happen if the initialization of the function static variable fails.


  • area_pol

    @Kian Note that the creation function creates a new map on every invocation. So every caller will just add more memory leaks. It's the second function, which actually holds a static reference and calling it will result in only one leak. Which is one leak too many. I've included it just for reference (pun not intended, because it's bad).



  • @NeighborhoodButcher said in Being smart about pointers - the Korean way:

    @cartman82 If you want a long answer and don't mind my terrible accent, I can link you to my "modern memory management" tutorial on YT. Short story: raw pointers are just asking for leaks and double-free's.

    This?

    Modern Memory Management - Learn Modern C++ – 26:18
    — Code Blacksmith



  • @cartman82 What's that thing in the lower right? Some kind of Dungeons and Dragons monster?



  • @Kian said in Being smart about pointers - the Korean way:

    What he should have done is

    What I (probably) would have done:

    class Alpha2ToAlpha3Map : public HashMap<String, String>
    {
    public:
        Alpha2ToAlpha3Map() { /* Fill the map */ }
    }:
    
    static String alpha3ForAlpha2(String aplha2)
    {
        static const Alpha2ToAlpha3Map map;
        return aplha2.isEmpty() ? String() : map.get(aplha2);
    }
    


  • @dcon There's a small issue with non-POD static objects, related to destruction and threads.

    Basically, suppose you have a detached thread running when main exits. This will trigger the destruction of all static objects, but the detached threads will still be running while this teardown takes place. The detached threads may attempt to access the static object, and if this happens after it has been destroyed, you hit UB.

    There are two solutions to that problem. The zeroth solution is don't detach threads. That way they're always joined before main exits and everyone is happy.

    The first solution is for the static to not be the object itself, but instead a pointer to the object. Then the object is not destroyed when the program is being torn down, and the detached threads can access it until the OS kills them. No UB, but purists complain that that's not a clean shutdown because you didn't call the destructor for the object. You have a "leak".

    The second solution is to have a shared_ptr for any static object with a destructor, and return copies of the shared_ptr. That way your threads will hold a reference, and when main exits it will decrease the count by one but only destroy the object if no threads have a hold on them. No UB, but performance freaks will object that you are adding the cost of incrementing the count every time you make a thread to make purists happy.



  • @Kian Real life is multithreaded. Forum examples aren't. :)



  • @cartman82 said in Being smart about pointers - the Korean way:

    @cartman82 If you want a long answer and don't mind my terrible accent, I can link you to my "modern memory management" tutorial on YT. Short story: raw pointers are just asking for leaks and double-free's.

    Ok, I watched the video. The accent wasn't too bad.

    Very interesting. Last time I did C++ was back in school, and we only did "raw" pointers, which were at the time called just "pointers".

    I had no idea that beards like that existed.

    RAII! I meant, I had no idea that RAII existed.


  • Notification Spam Recipient

    @cartman82 said in Being smart about pointers - the Korean way:

    Last time I did C++ was back in school

    As I'm looking to get into HoloLens development, looks like I'm going to be dusting off my internal archive on it "soon".
    Apparently, lots of game and simulation programming uses C++ (or so I'm told).

    Maybe things just aren't progressing as quickly in the gaming world? 🚎



  • @Tsaukpaetra said in Being smart about pointers - the Korean way:

    Apparently, lots of game and simulation programming uses C++ (or so I'm told).
    Maybe things just aren't progressing as quickly in the gaming world?

    It's not so much that as that game programming is very timing and performance intensive, and so people only want to do it in C++. But Hololens apparently works with Unity as well, if you'd rather not go into C++land. Monogame 'works' in that it can be in a window successfully, but I don't know about setting up an immersive environment with it. I really need to look around more.


  • Winner of the 2016 Presidential Election

    @Kian said in Being smart about pointers - the Korean way:

    Which is why you should capture the result of new in a smart pointer

    Let me stop you right there: If you write new, you're already Doing It Wrong™.



  • @asdf Which is why when I wrote it, I called std::make_unique.



  • @asdf said in Being smart about pointers - the Korean way:

    @Kian said in Being smart about pointers - the Korean way:

    Which is why you should capture the result of new in a smart pointer

    Let me stop you right there: If you write new, you're already Doing It Wrong™.

    This is what I was thinking.

    Is there any reason this isn't feasible (I mean, a reason besides incompetence)?

    typedef HashMap<String, String> Alpha2ToAlpha3Map;
    
    static const Alpha2ToAlpha3Map createAlpha2ToAlpha3Map()
    {
        Alpha2ToAlpha3Map map;
    
        // fill the map
    
        return map;
    }
    
    static String alpha3ForAlpha2(String aplha2)
    {
        static const Alpha2ToAlpha3Map map = createAlpha2ToAlpha3Map();
        return aplha2.isEmpty() ? String() : map.get(aplha2);
    }
    

  • Winner of the 2016 Presidential Election

    @MZH Not really, apart from what @Kian mentioned above (if applicable).


  • Discourse touched me in a no-no place

    @NeighborhoodButcher said in Being smart about pointers - the Korean way:

    Relying on the system to clean up the memory is really a bad practice (although a certain programmer here once tried to convince me otherwise; he moved to Python).

    It depends on whether the resource is really just memory, or backed with a handle that the OS can deal with on its own, or if it is actually something expensive and nasty like an Oracle database connection (:vomit:). This is because everything that the OS can dispose of for you when the process terminates is sometimes better cleaned that way, since that can be done without potentially paging the memory back in. It doesn't work for everything, of course, but it works very well for a substantial fraction.

    Destroying everything bit by bit can be surprisingly expensive. BTDT, fielded user bug reports over it…



  • @dkf I remember Raymond Chen's rant about that. Firefox used to take forever to close, so I used to nuke the process instead of closing it cleanly.

    It's a pretty hard problem to code in C++, though. "I want my destructor to always be called when my object goes out of scope. Uh, except for when the program is exiting and the only thing my object owns is memory."


  • Discourse touched me in a no-no place

    @NedFodder said in Being smart about pointers - the Korean way:

    It's a pretty hard problem to code in C++, though. "I want my destructor to always be called when my object goes out of scope. Uh, except for when the program is exiting and the only thing my object owns is memory."

    It's not so much the objects that are bound to small scopes that are the problem; it's usually rare for them to be shuffled off this mortal coil to disk. It's large global objects that need to be around but which are mostly not being used very much at exit time that can be costly (as they can often end up partially paged out). Windows is especially aggressive at doing this (and I'm not criticising it at all for that decision). In the end, in the application that I was talking about we split the recover critical resources and the recover all memory into separate systems (since there are some use cases for it, such as where the code is being unloaded from a larger binary). Critical resource cleanup was handled by independent exit handlers that would always fire on any kind of controlled shutdown, whereas full resource recovery was kept for when specifically requested. Which isn't perfect, oh no, but seems about the best compromise available.


  • ♿ (Parody)

    @cartman82 said in Being smart about pointers - the Korean way:

    Ok, I watched the video. The accent wasn't too bad.

    I was cool with the accent but the brace style destroyed all credibility.



  • @boomzilla said in Being smart about pointers - the Korean way:

    I was cool with the accent but the brace style destroyed all credibility.

    That was beard, not braces.


  • area_pol

    @blakeyrat said in Being smart about pointers - the Korean way:

    @cartman82 What's that thing in the lower right? Some kind of Dungeons and Dragons monster?

    I found Viking appearance makes all your code review arguments valid.

    @boomzilla said in Being smart about pointers - the Korean way:

    @cartman82 said in Being smart about pointers - the Korean way:

    Ok, I watched the video. The accent wasn't too bad.

    I was cool with the accent but the brace style destroyed all credibility.

    It's the only valid one 🚎

    @cartman82 said in Being smart about pointers - the Korean way:

    That was beard, not braces.

    Going full Gandalf with this one.


  • Discourse touched me in a no-no place

    @NeighborhoodButcher said in Being smart about pointers - the Korean way:

    I found Viking appearance makes all your code review arguments valid.

    QFT

    The effect is redoubled when the reviewer has a throwing axe with them.


Log in to reply