Valgrind



  • Today I was dealing with memory corruption in one of our data processing binaries. A look at the coredump quickly revealed a likely source of the problem: something in the code was performing a use-after-release on our homebrewed memory pool.

    Now I've been working in a different section of the code recently, and on Linux this type of problem has a very straightforward approach: release valgrind on it.

    However, I'd forgotten that this bit connects to an Oracle database. Not to write any data to it (we do that further on in the pipeline). No, just to log some statistics.

    Several thousand lines of valgrind errors later, I ended up having to do my debugging in another fashion...



  • I wish I knew enough Linux to make a proper less is more joke.



  • alias less=more



  • Oh, I always redirect valgrind output to file. Because you don't know how much you're going to get and whether the bottom will be more interesting than the top. When checking for errors, usually the top is most interesting anyway.

    Little works between thousands of errors from a library though. Ended up going for the tried-and-trusted method of 'which code got touched recently' and found out I added a utility function regarding doubly linked lists where I forgot to clear a prev pointer.



  • @PleegWat said:

    Oh, I always redirect valgrind output to file.

    Yes, and then use less to find interesting stuff.



  • Now...I wonder how much it'd cost to send Larry Ellison some coal for Christmas...lots of coal ;)



  • @boomzilla said:

    Yes, and then use less to find interesting stuff.

    Or grep?



  • That's why his mansion has a boiler in the basement. BRING IT ON!



  • Or find out how to run this thing without getting the database involved. There's support for if the DB craps out halfway through but trying to trigger that on startup leads to failed startup.

    I don't have the spare time to investigate that kinda stuff thoroughly. I've got plenty of assign-to-self bugs marking desired cleanup projects already that I'm not getting round to.



  • @chubertdev said:

    I wish I knew enough Linux to make a proper less is more joke.

    On typical Linux box, /bin/sh is symlinked to /bin/bash. But bash detects if you called it via bash or sh command and adjusts its behavior accordingly - ie. even though you ran the same script via the same interpreter, you might get different results. I wonder what happens if you symlink /bin/less to /bin/more.



  • @Gaska said:

    On typical Linux box, /bin/sh is symlinked to /bin/bash. But bash detects if you called it via bash or sh command and adjusts its behavior accordingly - ie. even though you ran the same script via the same interpreter, you might get different results. I wonder what happens if you symlink /bin/less to /bin/more.

    That wasn't funny at all.



  • @tar said:

    Or grep?

    Nah. Easier to just keep searching with less, IME.

    @PleegWat said:

    Or find out how to run this thing without getting the database involved.

    Fortunately, it's been a while, but ISTR that you can tell valgrind to ignore parts of your program so you wouldn't be bothered about Oracle.

    Looking at recent code is almost always a winning solution, however, when something is newly broken.



  • @blakeyrat said:

    That's why his mansion has a boiler in the basement. BRING IT ON!

    laughs Nothing some shoofly and a unit train couldn't fix...

    Filed under: demurrage charges



  • @chubertdev said:

    That wasn't funny at all.

    It is if you consider bash behavior a giant WTF.



  • @boomzilla said:

    Nah. Easier to just keep searching with less, IME.

    The / command in less might just be your friend :smile:

    @boomzilla said:

    Fortunately, it's been a while, but ISTR that you can tell valgrind to ignore parts of your program so you wouldn't be bothered about Oracle.

    Yep, suppressions are the OP's friend in this case -- it sounds like he'll need plenty of them, though, with the way the OCI libraries are spewing errors.

    Filed under: fix yo shizzle, Orrible!
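
For anyone who hasn't written one, a suppression file for this situation might look roughly like the sketch below. The suppression names, the `libclntsh.so` object pattern (the usual name of the Oracle client library) and the `./data_processor` binary are assumptions for illustration, not details of the OP's setup. Matching on `obj:` with a leading `*/` wildcard instead of a full path is meant to keep the rules working if the install directory moves.

```shell
#!/bin/sh
# Sketch: write a Memcheck suppression file that hides uninitialised-value
# reports whose call stack passes through the Oracle client library.
# The file location, suppression names and library pattern are assumptions.
supp_file="${TMPDIR:-/tmp}/oracle-oci.supp"

# Each entry is: a free-form name, the tool and error kind, then the call
# chain. A line with just "..." matches any number of stack frames.
cat > "$supp_file" <<'EOF'
{
   oci_uninitialised_cond
   Memcheck:Cond
   ...
   obj:*/libclntsh.so*
}
{
   oci_uninitialised_value8
   Memcheck:Value8
   ...
   obj:*/libclntsh.so*
}
EOF

# Hypothetical invocation (left commented out; ./data_processor is a stand-in):
#   valgrind --suppressions="$supp_file" --log-file=valgrind.log ./data_processor
# Running once with --gen-suppressions=all prints ready-made entries to adapt.
echo "wrote $supp_file"
```

Suppressing by object file is deliberately broad: it also hides any report where the library sits somewhere in the stack above your own code, so it is worth keeping the list of suppressed error kinds short.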



  • @tarunik said:

    The / command in less might just be your friend

    Exactly. ?, too.



  • I had a working set of suppressions a couple years back.

    Then we switched to the centralized dev servers and corporate homegrown VCS and the library paths are changing daily and the suppressions don't work.

    It's a conspiracy I'm telling you.



  • Well, as they say in Linux, it's easier to port a shell than a shell script.



  • @PleegWat said:

    centralized dev servers

    :rolleyes: Do you mean 'everyone has their own box, just they're all sitting next to each other in the datacenter', or do you mean 'sharing development servers'? Because the latter is :wtf:

    @PleegWat said:

    corporate homegrown VCS

    :headdesk:

    @PleegWat said:

    the library paths are changing daily

    :headdesk: :headdesk: :headdesk: :headdesk:
    :headdesk: :headdesk: :headdesk: :headdesk:

    :headdesk: :headdesk: :headdesk: :headdesk: | :headdesk:
    :headdesk: :headdesk: :headdesk: :headdesk: | :headdesk:

    :headdesk: :headdesk: :headdesk: :headdesk:
    :headdesk: :headdesk: :headdesk: :headdesk:

    Even the pilots are :headdesk: at this level of :wtf:



  • @tarunik said:

    Do you mean 'everyone has their own box, just they're all sitting next to each other in the datacenter'

    That, all virtualized. We used to have that in the local server room, but this is centrally managed. Does simplify some things and these are more powerful.

    @tarunik said:

    @PleegWat said:
    corporate homegrown VCS

    :headdesk:

    Well, it dates from the mid-nineties at the latest. It's kinda overdesigned for our use case: it's an integrated VCS/build infrastructure intended for products that take hours to build; I've heard rumours one product would take 26 hours at some point in the past.

    We came from CVS. In some aspects this is a step forwards. In others it's a step back. This is better at merging. With CVS we had a web history browser, which was faster and easier to use than the CLI commands we have to make do with now.



  • As a reminder, valgrind isn't perfect. Don't fuck up working code just based on what it says.

    ...then again, this was OpenSSL they were talking about, so "working code" should be taken with a grain of salt.



  • @Gaska said:

    On typical Linux box, /bin/sh is symlinked to /bin/bash. But bash detects if you called it via bash or sh command and adjusts its behavior accordingly - ie. even though you ran the same script via the same interpreter, you might get different results. I wonder what happens if you symlink /bin/less to /bin/more.

    Worst. Linux. Joke. Ever. ;P



  • @powerlord said:

    As a reminder, valgrind isn't perfect. Don't fuck up working code just based on what it says.

    How does that demonstrate a fault in valgrind?



  • @boomzilla said:

    Nah. Easier to just keep searching with less, IME.

    Depends what you're doing, I suppose. grep is great for reporting on megabytes of logs, e.g. how many times did this particular error message occur? less is fine for finding interesting-looking errors in the first place...
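
As a rough illustration of the counting use case, assuming the valgrind output was redirected to a file: the sample log below is made up (the PID, addresses, library path and the `list_unlink` frame are all placeholders), and only standard `grep`, `sed`, `sort` and `uniq` are used.

```shell
#!/bin/sh
# Made-up sample of Memcheck output; PIDs, addresses and frames are placeholders.
log="${TMPDIR:-/tmp}/valgrind-count-sample.log"
cat > "$log" <<'EOF'
==4242== Conditional jump or move depends on uninitialised value(s)
==4242==    at 0x4E8A1B2: ??? (in /opt/oracle/lib/libclntsh.so.11.1)
==4242==
==4242== Conditional jump or move depends on uninitialised value(s)
==4242==    at 0x4E8A2C4: ??? (in /opt/oracle/lib/libclntsh.so.11.1)
==4242==
==4242== Invalid read of size 8
==4242==    at 0x400544: list_unlink (list.c:42)
==4242==    by 0x400600: main (main.c:10)
==4242==
EOF

# How many times did one particular message occur?
cond_count=$(grep -c 'Conditional jump or move' "$log")
echo "conditional-jump reports: $cond_count"

# Tally every report heading. In this sample, heading lines have exactly one
# space after the ==PID== prefix, while stack frames are indented further.
tally=$(grep '^==[0-9]*== [^ ]' "$log" | sed 's/^==[0-9]*== //' | sort | uniq -c | sort -rn)
echo "$tally"
```

The tally puts the noisiest message first, which makes it easier to spot the one report that is not coming from the library.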



  • grep -v can be very useful as well.

    And if you know what you're looking for, grep -A or -B, and sed with a pattern range.
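
A small sketch of those tools on a valgrind log, with the caveat that the log is an invented sample (PID, addresses, paths and function names are placeholders). `-A` and `-B` are supported by GNU and BSD grep but are not part of POSIX.

```shell
#!/bin/sh
# Invented sample of Memcheck output, used only to exercise the commands.
log="${TMPDIR:-/tmp}/valgrind-filter-sample.log"
cat > "$log" <<'EOF'
==4242== Conditional jump or move depends on uninitialised value(s)
==4242==    at 0x4E8A1B2: ??? (in /opt/oracle/lib/libclntsh.so.11.1)
==4242==
==4242== Invalid read of size 8
==4242==    at 0x400544: list_unlink (list.c:42)
==4242==    by 0x400600: main (main.c:10)
==4242==
==4242== Conditional jump or move depends on uninitialised value(s)
==4242==    at 0x4E8A2C4: ??? (in /opt/oracle/lib/libclntsh.so.11.1)
==4242==
EOF

# grep -v: drop every line that mentions the noisy library. This is purely
# line-based, so the headings of those reports still remain.
without_lib=$(grep -v 'libclntsh' "$log")

# grep -A: show the matching heading plus the two lines after it.
with_context=$(grep -A 2 'Invalid read' "$log")

# sed pattern range: print from the heading through the next bare ==PID== line.
whole_report=$(sed -n '/Invalid read/,/^==[0-9]*==$/p' "$log")

printf '%s\n' "$whole_report"
```

The sed range is handy when the stack depth is unknown, since it stops at the separator line instead of after a fixed number of lines.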



  • @boomzilla said:

    How does that demonstrate a fault in valgrind?

    In this case, valgrind was reporting uninitialized memory and "fixing" that caused the PRNG to be seeded without a source of entropy and only using the process ID.



  • Oh, and don't forget to consider diff against the trace output of a reference run, if you're trying to figure out where a functional behaviour change came from.
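
Something along these lines, assuming each run can write a plain-text trace. The file names and trace lines are invented for the example; a real trace would likely need timestamps or PIDs stripped first so they do not drown out the meaningful differences.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Invented trace output from a known-good run and from the current run.
printf '%s\n' 'open input.dat' 'parsed 120 records' 'flush stats' > "$workdir/reference.trace"
printf '%s\n' 'open input.dat' 'parsed 118 records' 'flush stats' > "$workdir/current.trace"

# diff exits with status 1 when the files differ, so guard it with || true
# in case the script runs under set -e.
trace_diff=$(diff "$workdir/reference.trace" "$workdir/current.trace" || true)
printf '%s\n' "$trace_diff"
```

Lines prefixed with `<` come from the reference run and lines prefixed with `>` from the current one, so the first differing line is usually the place to start reading code.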



  • Don't forget you can instrument your binary for valgrind as well, though there's some slight performance impact. That's been specifically useful for me with custom memory allocators. A pool with individual allocations and a bulk free, for example, cannot simply be switched to normal malloc/free.

    EDIT: if interested, check out http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs and the chapter after that on how to tell valgrind how your program works.



  • @powerlord said:

    In this case, valgrind was reporting uninitialized memory and "fixing" that caused the PRNG to be seeded without a source of entropy and only using the process ID.

    And valgrind correctly reported the uninitialized memory. The "fixing" wasn't done by valgrind. It just found that the code was using a dangerous way of generating entropy.



  • Is anyone else being reminded of vaginas in this thread?



  • I am now.


  • @loopback0 said:

    I am now.

    And then there's the section it's under in my IDE...

    I'll never be able to look at this application the same again... Thanks muchly. Bastards.



  • @tar said:

    Worst. Linux. Joke. Ever. ;P

    Assuming it's a joke. Because I haven't tested it, and knowing FOSStards, everything can happen.


    Filed under: FOSStard sounds like mustard



  • Colonel FOSStard, in the library, with the kill -9 -1.

    Filed under: Cluedo, or Cluedon't?


  • @Gaska said:

    Assuming it's a joke. Because I haven't tested it, and knowing FOSStards, everything can happen.

    onyx:~ $ /bin/sh
    $ echo $SHELL 
    /bin/bash
    $ readlink /bin/sh
    dash
    
    

    What... the... duck?
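
For what it's worth, this is probably less mysterious than it looks: `$SHELL` is an ordinary environment variable, normally set at login to the user's login shell, and a shell started by hand just inherits it. A minimal sketch of the difference between `$SHELL` and the shell actually running (the `/bin/bash` value is only an example, and the `ps` call is skipped or ignored where `ps` is missing or does not support these options):

```shell
#!/bin/sh
# A child sh simply reports whatever SHELL it inherited from its parent.
inherited=$(SHELL=/bin/bash sh -c 'printf "%s" "$SHELL"')
echo "child sh sees SHELL=$inherited"

# To see which program is really interpreting the commands, ask ps about
# the current process ($$) instead of trusting $SHELL.
if command -v ps >/dev/null 2>&1; then
    running=$(ps -p "$$" -o comm= 2>/dev/null || true)
    echo "this script is running under: $running"
fi
```

So a `/bin/sh` that points at dash can happily print `/bin/bash` for `$SHELL`; the variable says which shell the session logged in with, not which one is answering.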



  • I wonder how many obscene anagrams can be made from VALGRIND FUNCTION?



  • Paging @algorythmics



  • Unclad Font Virgin :O



  • @boomzilla said:

    The "fixing" wasn't done by valgrind. It just found that the code was using a dangerous way of generating entropy.

    And the actual problem wasn't removing the seeding based on uninitialized memory (if OpenSSL was relying on that, that would be entirely on OpenSSL's plate IMO); it was that the person removing the accesses of the uninitialized memory was overeager and also removed the seeding based on everything else.

    In fact, according to comments on the page originally linked, the line that originally triggered the overreaction was removed anyway in Debian even after that whole fiasco was discovered.

    And while I can see where the OpenSSL people are coming from, I also disagree and think that removing it was the right thing to do. Just because a warning is false doesn't mean you shouldn't change the code to remove it, because writing code for analyzability by tools means that, well, your code is analyzable by tools[1] and that's a good thing, especially for something that should be hit by a battery of them like OpenSSL.

    [1] Disclaimer: I work on kind of similar tools; that I think they are a good idea is unsurprising. :-)



  • —————————————————————————————————————————————————————————————————————— 15:23:13
    kane@kane-laptop [~/projects/cpe225/3-lab]
    $ dash
    —————————————————————————————————————————————————————————————————————— \t\n\u@\
    h [\w]\n$ echo $SHELL
    /bin/bash
    —————————————————————————————————————————————————————————————————————— \t\n\u@\
    h [\w]\n$ 
    

  • Well, that's a new one.

    I mean, I'm assuming that wasn't the plan when you hit paste...



  • Nope, the post is correct. That's what my terminal is showing, albeit with the wrong colors.

    It wasn't the plan when I ran dash to get that broken prompt, though.


  • I will now pretend to understand why you have escape sequences strewn around your terminal...



  • It seems that dash is reading my .bashrc.


  • Ah. Ok, that makes sense. I thought it was on purpose :P



  • Or $PS1 is carrying over.

    Dash isn't intended as interactive, so it's not surprising it's not accepting the escapes.

    • \t for timestamp
    • \n for newline
    • \u for logged-in username
    • \h for hostname
    • \w for working directory

    Shouldn't that $ be escaped so it turns into a # in a root shell? Or was that the other way round?
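
On the last question: in bash, `\$` in `PS1` expands to `#` when the effective UID is 0 and to `$` otherwise, so it is the escaped form that changes for root; a bare `$` stays a `$`. A sketch of what the prompt string might look like with that change, reconstructed from the escapes listed above (the exact original `PS1` is not shown in the thread, so this is a guess, and it leaves out the dashed rule):

```shell
# Single quotes keep the backslashes intact so that bash, not the assignment,
# interprets them each time the prompt is drawn.
#   \t time   \n newline   \u user   \h host   \w working directory
#   \$ prints # for root and $ for everyone else
PS1='\t\n\u@\h [\w]\n\$ '
```

Shells that do not implement these escapes, such as dash, would print the string literally, which matches the broken prompt in the paste.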



  • @riking said:

    It seems that dash is reading my .bashrc.

    Bad Idea: A new shell combining the worst features of sh, csh, bash, etc. You could make a real hash of it.



  • I think you're right, but it won't normally show up in a root shell, I have to try:

    ——————————————————————————————————————————————————————————————————————— 15:35:13
    kane@kane-laptop [~]
    $ sudo su
    root@kane-laptop:/home/kane# exit
    ——————————————————————————————————————————————————————————————————————— 15:35:18
    kane@kane-laptop [~]
    $ sudo -E bash
    ——————————————————————————————————————————————————————————————————————— 15:35:21
    root@kane-laptop [~]
    $ exit
    


  • Depends whether root's init scripts override the prompt. Not at a Linux box right now.

    I've actually got a somewhat similar prompt at work, without the timestamp (nice idea, need to add that) but with a newline before the final $ because I relatively frequently have paths that extend for more than half the width of the terminal.



  • @HardwareGeek said:

    A new shell combining the worst features of sh, csh, bash

    Is that not ksh :laughing:

