The 44MB getter



  • I left a copy of my Minecraft server implementation running on a public server for a few days. Today, I downloaded a heap dump and a cpu profile and started to look through it. Here's an unedited copy of the memory graph. (Scroll down to zoom out, drag to pan, etc.)

    Wait, that can't be right.

    ben@loads stuzzd$ go tool pprof stuzzd mem-201207302014.prof
    Adjusting heap profiles for 1-in-524288 sampling rate
    Welcome to pprof!  For help, type 'help'.
    (pprof) top5
    Total: 60.0 MB
        44.5  74.2%  74.2%     44.5  74.2% github.com/Nightgunner5/stuzzd/chunk.(*Chunk).GetBiome
         6.0  10.0%  84.2%      6.0  10.0% runtime/pprof.printStackRecord
         4.0   6.7%  90.8%      4.0   6.7% compress/flate.(*compressor).findMatch
         1.5   2.5%  93.3%      1.5   2.5% github.com/Nightgunner5/stuzzd/storage.SaveAllPlayers
         1.0   1.7%  95.0%      1.0   1.7% net/http.transportRequest.FormFile
    

    Apparently my GetBiome method allocated 74.2% of what the server's heap has in it at the time of this post. 44.5 megabytes allocated by a getter. Let's see what part of the getter did that...

    (pprof) list GetBiome
    Total: 60.0 MB
    ROUTINE ====================== github.com/Nightgunner5/stuzzd/chunk.(*Chunk).GetBiome in /home/ben/go/src/github.com/Nightgunner5/stuzzd/chunk/chunk.go
    44.5 44.5 Total MB (flat / cumulative)
    . . 157: section.Data.Set(x, y, z, data)
    . . 158:
    . . 159: c.dirty()
    . . 160: }
    . . 161:

     .      .  162: func (c *Chunk) GetBiome(x, z int32) protocol.Biome {
    

    44.5 44.5 163: if x>>4 != c.X || z>>4 != c.Z {
    . . 164: panic(fmt.Sprintf("GetBiome() called on chunk %d, %d but should have been called on chunk %d, %d!", c.X, c.Z, x>>4, z>>4))
    . . 165: }
    . . 166:
    . . 167: c.lock.RLock()
    . . 168: defer c.lock.RUnlock()
    . . 169:
    . . 170: return c.Biomes[(z&0xF)<<4|(x&0xF)]
    . . 171: }

     .      .  172: 
     .      .  173: func (c *Chunk) SetBiome(x, z int32, biome protocol.Biome) {
     .      .  174:     if x&gt;&gt;4 != c.X || z&gt;&gt;4 != c.Z {
     .      .  175:             panic(fmt.Sprintf(&quot;SetBiome() called on chunk %d, %d but should have been called on chunk %d, %d!&quot;, c.X, c.Z, x&gt;&gt;4, z&gt;&gt;4))
     .      .  176:     }
    

    Yes, that's right. The line with two bit shifts, two "not equals", and one binary "or" somehow allocated 44.5 megabytes of RAM.

    wat



  • The profiler must be mistaken.
    @Ben L. said:

    return c.Biomes[(z&0xF)<<4|(x&0xF)]

    Does that allocate a new array..? I'm not very familiar with Go syntax.



  • @Xyro said:

    The profiler must be mistaken.
    @Ben L. said:
    return c.Biomes[(z&0xF)<<4|(x&0xF)]

    Does that allocate a new array..? I'm not very familiar with Go syntax.

    Nope, that gets the element at index x + z * 16.



  • I like the panic function.  Some reason it just appeals to me.



  • @Ben L. said:

    Nope, that gets the element at index x + z * 16.

    Oh, duh, I see it now. Boy I've been Java'ing for so long now that my brain misfires when it sees a field with a capital letter.

    Yeah, that's pretty weird. There's no way the profiler could be correct. What happens when you split up the line into multiple statements? And that's heap size, you say? Nothing is being allocated... This is a very interesting bug. Suppose the Go folks would like a gander?



  • That's the first line of the function. Most likely, the profiler just does it by function. So it could be somewhere else in the function that is leaking.



  • @GoatRider said:

    That's the first line of the function. Most likely, the profiler just does it by function. So it could be somewhere else in the function that is leaking.

    And where exactly is the part that leaks? The array access? The mutex that has no allocation code? The -- oh wait that's the only thing in the function.



  • @Ben L. said:

    @GoatRider said:
    That's the first line of the function. Most likely, the profiler just does it by function. So it could be somewhere else in the function that is leaking.

    And where exactly is the part that leaks? The array access? The mutex that has no allocation code? The -- oh wait that's the only thing in the function.

    What about: c.lock.RLock()? I have no idea what this does, but it seems reasonable that whatever locking is going on is allocating some memory. It's been running for a few days, so a few bytes here and there could add up over time.



  • In c++ I could imagine a nightmare where the array[stuff] operator is overloaded with some code which allocates memory to ensure the array-like access never fails ... and nothing ever releases it again ... 

     



  • If it was the c.lock.RLock() statement, wouldn't the profile have flagged that as the issue instead of the function that calls it? And surely that's not the only function that does (otherwise the lock wouldn't be much use).

    Could it maybe possibly be that the panic() is called more often than you think, and it's creating gigantic stack traces or something?

    More likely that the profile is busted.



  • @boomzilla said:

    @Ben L. said:
    @GoatRider said:
    That's the first line of the function. Most likely, the profiler just does it by function. So it could be somewhere else in the function that is leaking.

    And where exactly is the part that leaks? The array access? The mutex that has no allocation code? The -- oh wait that's the only thing in the function.

    What about: c.lock.RLock()? I have no idea what this does, but it seems reasonable that whatever locking is going on is allocating some memory. It's been running for a few days, so a few bytes here and there could add up over time.

    http://golang.org/src/pkg/sync/rwmutex.go?s=847:873#L16



  • @Xyro said:

    Could it maybe possibly be that the panic() is called more often than you think, and it's creating gigantic stack traces or something?

    So the program crashes multiple times during its execution?



  • @Ben L. said:

    So the program crashes multiple times during its execution?

    Oh, so that doesn't get trapped at all? Okay, maybe I should stop posting about a bug in a profile I've never used for a language I've only read about. :X



  • @Xyro said:

    @Ben L. said:
    So the program crashes multiple times during its execution?

    Oh, so that doesn't get trapped at all? Okay, maybe I should stop posting about a bug in a profile I've never used for a language I've only read about. :X

    You can use panic() and recover() internally to ease error handling (only need to have one error check in a group of many nested function calls), but I use it in this file whenever there's a condition that should never happen without serious bugs or memory corruption.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.