That can't be good...



  • I ran my Markov chain program today and it crashed with an "out of memory" error almost instantly. So I did the logical thing and debugged it. It got about halfway through reading the (preprocessed) input data and my computer slowed to a halt as it started swapping. I killed the process with ^\ (SIGQUIT), which prints a stack trace before exiting. The stack trace wasn't particularly exciting, but the line it coughed out just before that is, well, special.

    scvg0: inuse: 14757, idle: 512, sys: 15270, released: 0, consumed: 15270 (MB)

    The program had apparently allocated 15 gigabytes of memory before my computer started swapping. My computer has 3.2GB of RAM and 5GB of swap. This program was using all 15 of them.



  • @Ben L. said:

    How many fruit do I have left?

    Just you.


    Your program needs a bit of work, I guess.



  • The default behavior that ships with most Linux* distros is to enable VM overcommit, meaning the kernel will allocate as much memory as your user space will allow, which can be significantly more than the sum of the physical RAM and swap. The kernel doesn't actually try to find a place for the memory until you go to read/write from a page. So you can malloc() to your heart's content, just don't try to actually use those addresses or you will cause a shitstorm.
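
    (If anyone wants to poke at it, here's a minimal C sketch of what I mean. The 64 GB figure is made up and assumes a 64-bit build; whether the malloc "succeeds" at all depends on your overcommit setting.)

        /* Rough demo of VM overcommit: with the default heuristic overcommit,
         * a malloc() of ~64 GB will usually "succeed" on a machine with far
         * less RAM + swap, because no pages are actually backed yet.
         * Touching the pages (the commented-out memset) is what gets you in trouble. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        int main(void)
        {
            size_t huge = (size_t)64 * 1024 * 1024 * 1024;  /* 64 GB, way more than I have */

            char *p = malloc(huge);
            if (p == NULL) {
                printf("malloc failed up front (overcommit is probably off)\n");
                return 1;
            }
            printf("malloc of %zu bytes \"succeeded\"\n", huge);

            /* Don't actually do this unless you enjoy swapping and the OOM killer:
             * memset(p, 0xAA, huge);
             */

            free(p);
            return 0;
        }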



  • @Ben L. said:

    My computer has 3.2GB of RAM and 5GB of swap.

    Yeah, I remember 2005 as well.



  • @morbiuswilters said:

    The default behavior that ships with most Linux* distros is to enable VM overcommit, meaning the kernel will allocate as much memory as your user space will allow, which can be significantly more than the sum of the physical RAM and swap. The kernel doesn't actually try to find a place for the memory *until you go to read/write from a page*. So you can malloc() to your heart's content, just don't try to actually use those addresses or you will cause a shitstorm.

    I'm not sure that's totally correct. In managed code I could see that, but not in C or C++.



  • @skotl said:

    I'm not sure that's totally correct. In managed code I could see that, but not in C or C++.

    That's how Windows works. It wouldn't surprise me if Linux was the same way. Memory allocation isn't really anything except bookkeeping-- the memory manager just needs to make sure its mouth ain't writing checks its total available swap can't cash.

    I don't know why you think the language would make a difference... the memory manager doesn't know what language the program's in, nor does it care.

    Memory pages are like Heisenberg cats: until a page is actually accessed, it doesn't really exist in any meaningful sense.
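
    A quick way to watch it happen, for the curious: a rough C sketch, assuming Linux (where getrusage() reports ru_maxrss in kilobytes; other platforms differ).

        /* Sketch: resident set size barely moves after malloc(); it only jumps
         * once the pages are actually touched. Assumes Linux (ru_maxrss in KB). */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <sys/resource.h>

        static long max_rss_kb(void)
        {
            struct rusage ru;
            getrusage(RUSAGE_SELF, &ru);
            return ru.ru_maxrss;
        }

        int main(void)
        {
            size_t sz = (size_t)1024 * 1024 * 1024;   /* 1 GB */

            printf("before malloc: %ld KB resident\n", max_rss_kb());

            char *p = malloc(sz);
            if (!p) return 1;
            printf("after  malloc: %ld KB resident\n", max_rss_kb());

            memset(p, 1, sz);                         /* now the pages have to exist */
            printf("after  memset: %ld KB resident\n", max_rss_kb());

            free(p);
            return 0;
        }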



  • Windows has two ways of allocating memory, a 'real' alloc, and a 'virtual' alloc.

    With a real allocation, you get a bit of address space and it's got little bits of memory which you can put stuff in. The program knows it's got the memory it wants.

    With a virtual allocation, you get a bit of address space, and hope you get some actual memory to put stuff in later. The program knows it may not get the memory it wants.

    AIUI, the default behaviour of Linux is a mixture of these two ways - the program THINKS it's got the memory it wants, but it may not have.

    As if this wasn't bad enough, if you start using the memory that you've been promised when there isn't enough left, Linux doesn't necessarily kill you; the "OOM killer" will 'randomly' kill processes to free up sufficient memory for you.

    So, a common tweak is to turn off the 'overcommit' feature. (look for vm.overcommit_memory)
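
    If you want to know which mode a box is in, the knob is just a file under /proc. A minimal C sketch that reads it (0 = heuristic, 1 = always overcommit, 2 = don't overcommit):

        /* Minimal sketch: read the current overcommit policy from /proc.
         * 0 = heuristic overcommit (the usual default)
         * 1 = always overcommit
         * 2 = don't overcommit (commit limit based on swap + overcommit_ratio) */
        #include <stdio.h>

        int main(void)
        {
            FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
            if (!f) {
                perror("fopen");
                return 1;
            }

            int mode = -1;
            if (fscanf(f, "%d", &mode) != 1)
                mode = -1;
            fclose(f);

            printf("vm.overcommit_memory = %d\n", mode);
            return 0;
        }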



  • In short, the major difference is that Windows won't lie to your process (a rough sketch of the reserve/commit dance follows the list):

    • If a Windows process asks to reserve address space, it gets it now or fails.
    • If a Windows process asks for virtual memory pages, it gets them now or fails.
    • If a Linux process asks for virtual memory pages, it gets them now or thinks it got them now.
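
    Something like this, roughly (C against the Win32 API; sizes are arbitrary and error handling is minimal):

        /* Sketch of reserve-then-commit on Windows. Reserving only grabs address
         * space; committing is what charges against the system commit limit. */
        #include <windows.h>
        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            SIZE_T size = 64 * 1024 * 1024;   /* 64 MB */

            /* Step 1: reserve address space only. No memory is committed yet. */
            void *base = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
            if (base == NULL) {
                printf("reserve failed\n");
                return 1;
            }

            /* Step 2: commit it. This can fail if the commit limit is hit,
             * and the failure happens here, not later when a page is touched. */
            void *mem = VirtualAlloc(base, size, MEM_COMMIT, PAGE_READWRITE);
            if (mem == NULL) {
                printf("commit failed\n");
                VirtualFree(base, 0, MEM_RELEASE);
                return 1;
            }

            memset(mem, 0, size);             /* safe: the pages are really backed */

            VirtualFree(base, 0, MEM_RELEASE);
            return 0;
        }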


  • @blakeyrat said:

    That's how Windows works. It wouldn't surprise me if Linux was the same way. Memory allocation isn't really anything except bookkeeping-- the memory manager just needs to make sure its mouth ain't writing checks its total available swap can't cash.

    I don't know why you think the language would make a difference... the memory manager doesn't know what language the program's in, nor does it care.

    Memory pages are like Heisenberg cats: until a page is actually accessed, it doesn't really exist in any meaningful sense.

    Windows charges all virtual commits against the global commit count, which cannot exceed the total VM size. Any malloc equivalent (HeapAlloc) commits memory. You can, however, call VirtualAlloc to reserve address space and commit it later, but only if you actually bother to do that.

    A committed page is pre-mapped to a global zero page. When you modify it for the first time, you get an actual physical page from the zeroed-pages list. No matter what, there will always be either a physical page or space in the page file backing it; in the latter case, the physical page is obtained by paging out some other page.
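
    You can watch the commit charge move from user mode, too. A small sketch using GlobalMemoryStatusEx (the 256 MB figure is arbitrary; malloc here ends up committing via the process heap):

        /* Sketch: on Windows, malloc/HeapAlloc commits memory immediately, so the
         * available commit (ullAvailPageFile) drops as soon as the allocation
         * succeeds, long before any page is touched. */
        #include <windows.h>
        #include <stdio.h>
        #include <stdlib.h>

        static void show_commit(const char *label)
        {
            MEMORYSTATUSEX ms;
            ms.dwLength = sizeof(ms);
            GlobalMemoryStatusEx(&ms);
            printf("%s: commit limit %llu MB, available %llu MB\n",
                   label,
                   (unsigned long long)(ms.ullTotalPageFile / (1024 * 1024)),
                   (unsigned long long)(ms.ullAvailPageFile / (1024 * 1024)));
        }

        int main(void)
        {
            show_commit("before malloc");

            void *p = malloc(256 * 1024 * 1024);   /* 256 MB, committed right away */
            if (!p) return 1;

            show_commit("after  malloc");

            free(p);
            return 0;
        }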



  • @skotl said:

    @morbiuswilters said:
    The default behavior that ships with most Linux* distros is to enable VM overcommit, meaning the kernel will allocate as much memory as your user space will allow, which can be significantly more than the sum of the physical RAM and swap. The kernel doesn't actually try to find a place for the memory until you go to read/write from a page. So you can malloc() to your heart's content, just don't try to actually use those addresses or you will cause a shitstorm.

    I'm not sure that's totally correct. In managed code I could see that, but not in C or C++.

    It has nothing to do with managed/unmanaged code. It has to do with any VM pages allocated. I gave you enough information in the post that you could Google it and find out more; just look for "linux vm overcommit" (sans quotes).



  • @pscs said:

    As if this wasn't bad enough, if you start using the memory that you've been promised when there isn't enough left, Linux doesn't necessarily kill you; the "OOM killer" will 'randomly' kill processes to free up sufficient memory for you.

    So, a common tweak is to turn off the 'overcommit' feature. (look for vm.overcommit_memory)


    The OOM killer is much 'smarter' than you suggest. It checks several things before it kills: the amount of memory a process uses, how long it has been running, whether it is privileged, etc. In practice I have never seen it kill the wrong thing when my process(es) were the only one(s) using a lot of memory.
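
    For what it's worth, you can nudge those heuristics yourself for a process you care about. A rough sketch of the /proc interface (the -500 value is arbitrary; negative adjustments generally need root or CAP_SYS_RESOURCE):

        /* Sketch: adjust this process's attractiveness to the OOM killer by
         * writing to /proc/self/oom_score_adj (range -1000 .. 1000, where
         * -1000 means "never pick me"). Negative values need privileges. */
        #include <stdio.h>

        int main(void)
        {
            FILE *f = fopen("/proc/self/oom_score_adj", "w");
            if (!f) {
                perror("fopen");
                return 1;
            }

            /* Make this process a much less appealing victim. */
            if (fprintf(f, "-500\n") < 0)
                perror("fprintf");
            fclose(f);

            /* Read back the current badness score the killer would use. */
            f = fopen("/proc/self/oom_score", "r");
            if (f) {
                int score;
                if (fscanf(f, "%d", &score) == 1)
                    printf("current oom_score: %d\n", score);
                fclose(f);
            }
            return 0;
        }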



  • @Quincy5 said:

    @pscs said:

    As if this wasn't bad enough, if you start using the memory that you've been promised when there isn't enough left, Linux doesn't necessarily kill you; the "OOM killer" will 'randomly' kill processes to free up sufficient memory for you.

    So, a common tweak is to turn off the 'overcommit' feature. (look for vm.overcommit_memory)


    The OOM killer is much 'smarter' than you suggest. It checks several things before it kills: the amount of memory a process uses, how long it has been running, whether it is privileged, etc. In practice I have never seen it kill the wrong thing when my process(es) were the only one(s) using a lot of memory.

    I have. Many, many times. I've even seen it kill sshd and effectively lock out access to the machine. In fact, I think I've seen it kill the "wrong" process more often than I've seen it kill the process that was clearly in need of killing.

    However, I still don't think it's that big of a deal. The only times I've had it trigger were when something catastrophic was already going wrong (severe memory leak, out-of-control process, failing hardware..) and I've never thought "stupid OOM killer, how dare you make my life more difficult!" So, eh..


  • Discourse touched me in a no-no place

    @morbiuswilters said:

    I've even seen [the OOM killer] kill sshd and effectively lock out access to the machine.
    Why was sshd using so much memory in the first place?



  • @dkf said:

    @morbiuswilters said:
    I've even seen [the OOM killer] kill sshd and effectively lock out access to the machine.
    Why was sshd using so much memory in the first place?

    It wasn't. You don't understand how the OOM killer works. It basically throws a dart at a wall with every process represented on it, then snaps the neck of the first process it sees while everyone is distracted trying to see where the dart landed.



  • @morbiuswilters said:

    @dkf said:
    @morbiuswilters said:
    I've even seen [the OOM killer] kill sshd and effectively lock out access to the machine.
    Why was sshd using so much memory in the first place?

    It wasn't. You don't understand how the OOM killer works. It basically throws a dart at a wall with every process represented on it, then snaps the neck of the first process it sees while everyone is distracted trying to see where the dart landed.

    I don't usually LOL...



  • @morbiuswilters said:

    It basically throws a dart at a wall with every process represented on it, then snaps the neck of the first process it sees while everyone is distracted trying to see where the dart landed.

    So it's more "Out Of Plain Sight Killer"...?

