Subversion comments



  • In ascending chronological order, over seven months:

    + snoofle: All code from this point forward must be aware that it is running in a threaded system - be extremely cautious when using singleton, static or non-thread-safe variables

    + rookie: add (static) map to retain data across transactions

    + rookie: problem with map data being incorrect - unable to reproduce: add debugs

    + rookie: add more debugs to find out why map data is sometimes incorrect

    + rookie: map data for a single transaction appears to be from two different customer transactions

    + snoofle: changed HashMap to HashTable: problem with non-thread-safe HashMap in threaded environment solved!

    ...sigh

     



  • But threading is hard!



  • TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.



  • @Planar said:

    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.
     

     

    Funny, I get exactly the reverse... our experienced devs with 20+ years don't understand threading...

    We now have a system with about 18 permanent threads (idling most of the time) that pass "events" from one to the next. In the end you get sequential execution... but with the added benefits of locks, queues, starvation and deadlocks.

    Extra: the new HW platform now has an SoC with a dual-core processor... *sigh*

     

    TRWTF is people. That and thinking in terms of implementation tools instead of functionality. (I can have parallel behaviour with one thread only...)



  • Yo, I got a thread for my thread about threads.



  • Funny, I get exactly the reverse... our experienced devs with 20+ years don't understand threading...

    Programmers fall into two categories: those who don't understand threading, and those who know they don't understand threading.



  • @Planar said:

    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.

    I must be some sort of miracle worker then.

    Seriously, threading isn't hard. You were taught what a critical section is. You know what shared data is. Where's the confusion?



  •  I can't tell which is the wtf... the fact that rookie didn't know that he needed to write thread-safe code or the fact that you thought the solution was to use an extremely outdated Java construct to fix it instead of a synchronized HashMap or ConcurrentHashMap.
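
    A minimal Java sketch of the two fixes mentioned above; both are standard JDK facilities, and the class name and map contents here are made up for illustration:

      import java.util.Collections;
      import java.util.HashMap;
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      public class ThreadSafeMaps {
          public static void main(String[] args) {
              // Option 1: wrap a plain HashMap so every call goes through one lock
              Map<String, String> wrapped =
                  Collections.synchronizedMap(new HashMap<String, String>());

              // Option 2: a map designed for concurrent access (usually the better choice)
              Map<String, String> concurrent = new ConcurrentHashMap<String, String>();

              wrapped.put("customer", "A");
              concurrent.put("customer", "B");
          }
      }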



  • A programmer had a problem. He thought to himself, "I know, I'll solve it with threads!". has Now problems. two he



  • Life is multi-threaded.

    There are some languages where you can be quite a successful programmer and not deal with threads. This is not the case for Java.

    But I would have used a synchronized HashMap, so I could access it through the Map interface, and so it would have the word "synchronized" glaring out at the offending programmer.



  • @Vanders said:

    @Planar said:
    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.
    I must be some sort of miracle worker then.

    Seriously, threading isn't hard. You were taught what a critical section is. You know what shared data is. Where's the confusion?

     

    This.  Is there some special broken thing in Java that makes multithreading extra hard?  Because I never have problems with it in Delphi.  You just have to keep the basic principles in mind.



  • One word - nondeterminism.

    At least, I think that is a word.

    Anyway, cue someone bringing up Why Threads are a Bad Idea (for most purposes)


  • ♿ (Parody)

    @Mason Wheeler said:

    This.  Is there some special broken thing in Java that makes multithreading extra hard?  Because I never have problems with it in Delphi.  You just have to keep the basic principles in mind.

    I suspect a lot of it stems from the frameworks used. Even though they're usually massively multi-threaded, most of the code you write probably doesn't have to worry about it because it takes care of most things for you. So when you get into a situation where you need to do something like that...you don't keep the basic principles in mind.



  • @Planar said:

    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.

    C# makes threads easy. So does Ruby, ironically since it can't fucking run them on actual hardware threads.

    How easy/hard threads are is 100% dependent on the language involved.



  • @pure said:

    One word - nondeterminism.

    At least, I think that is a word.

    Anyway, cue someone bringing up Why Threads are a Bad Idea (for most purposes)

     

    That was written in 1996 when using anything but a single processor was unusual. It is 17 years later. Move on.



  • @Rick said:

    @pure said:

    One word - nondeterminism.

    At least, I think that is a word.

    Anyway, cue someone bringing up Why Threads are a Bad Idea (for most purposes)

     

    That was written in 1996 when using anything but a single processor was unusual. It is 17 years later. Move on.

    No doubt. I particularly like the way it refers to anyone who knows a bit of C++ as a "Wizard". And, more ludicrously, that people who know C++ are somehow greater than people who "only" know C.

    It's right about VB programmers, though, right?!

    Seriously, though, in my experience most developers even now don't know how to write reentrant code or avoid deadlocks (or, worse, go so far out of their way to remove deadlocks that the app spends 90% of its time fiddling with locks or waiting for them). And this certainly doesn't just apply to youngsters.

    There's a fairly famous attempt to speed up a part of our application with threads. The release note claimed something ludicrous like 1000x better performance. Lots of work was done over lots of man-months. The result? Stuff like this:


    void GotSomeData(const char * someData)
    {
        if (Logger.IsLoggingLevelEnabled(1))
        {
            Logger.write("did some stuff with: %s", someData);
        }

        // pass the data on to the "main" thread
        messageQueueForData.Put(someData);
    }


    Naturally, this was executed in a whole bunch of threads which sat waiting for "someData" to pass around. Anyway, you've probably guessed that the Logger.IsLoggingLevelEnabled method is synchronised (i.e. implemented as GetLock(); DoStuff(); ReleaseLock(); or whatever), so the performance of the system was: worse.
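
    The same anti-pattern is easy to reproduce in Java: if every worker grabs one shared lock per message, the "parallel" workers just queue up behind each other. A minimal sketch (the class, lock and message names are made up):

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;

      public class LockOnHotPath {
          private static final Object LOG_LOCK = new Object();

          // Every worker takes the same lock for every message, so most of the
          // time is spent waiting for the lock rather than doing work.
          static void gotSomeData(String someData) {
              synchronized (LOG_LOCK) {
                  // stand-in for the "is logging enabled?" check and the write
              }
              // ...hand the data on to the next queue...
          }

          public static void main(String[] args) throws InterruptedException {
              ExecutorService pool = Executors.newFixedThreadPool(8);
              for (int i = 0; i < 1_000_000; i++) {
                  final String data = "msg-" + i;
                  pool.submit(() -> gotSomeData(data));
              }
              pool.shutdown();
              pool.awaitTermination(1, TimeUnit.MINUTES);
          }
      }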



  • @powerlord said:

     I can't tell which is the wtf... the fact that rookie didn't know that he needed to write thread-safe code or the fact that you thought the solution was to use an extremely outdated Java construct to fix it instead of a synchronized HashMap or ConcurrentHashMap.

    I know to use synchronized constructs for Maps or an actual thread-safe map (e.g. Hashtable) - it's these people who don't get it...


  • @Mason Wheeler said:

    @Vanders said:

    @Planar said:
    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.

    I must be some sort of miracle worker then.

    Seriously, threading isn't hard. You were taught what a critical section is. You know what shared data is. Where's the confusion?

     

    This.  Is there some special broken thing in Java that makes multithreading extra hard?  Because I never have problems with it in Delphi.  You just have to keep the basic principles in mind.


    You know those principles. Apparently the whole concept of "shared data" has been so overloaded with fear by now that cutting-edge software stacks like node.js would rather clone half their runtime for each thread and require developers to pass objects (and functions) around as strings than allow a single bit of shared state. See (the second half of) this.



  • @Mason Wheeler said:

    @Vanders said:

    @Planar said:
    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.
    I must be some sort of miracle worker then.

    Seriously, threading isn't hard. You were taught what a critical section is. You know what shared data is. Where's the confusion?

     

    This.  Is there some special broken thing in Java that makes multithreading extra hard?  Because I never have problems with it in Delphi.  You just have to keep the basic principles in mind.

     

     

      public class MyClass extends Thread {
        public void run() {
           // do unsynchronized stuff in thread
           synchronized (MyClass.class) {
             // do stuff in critical section
           }
           // back to unsynchronized code
        }
      }
    

    How hard is it to do that and choose the right data structures?

    Java even lets you turn a map that is not inherently thread-safe into a thread-safe map with a wrapper: Collections.synchronizedMap(new HashMap<Integer, String>()).

     



  •  The real WTF is that you don't do code reviews.



  •  I admit I have no idea how to write threaded code.

    You were taught what a critical section is. You know what shared data is.

    I can imagine what shared data might be, but I don't know what a critical section is. Are these threading terms?



  • @dhromed said:

    I can imagine what shared data might be, but I don't know what a critical section is. Are these threading terms?

    Code that accesses shared data.

     



  • @dhromed said:

    I can imagine what shared data might be, but I don't know what a critical section is. Are these threading terms?

    C# abstracts it away entirely with the "lock()" pseudo-function thing.



  • @dbomb123 said:

     The real WTF is that you don't do code reviews.

    Actually, that's mostly all I do, but there are almost 200 programmers here and there are only two of us who have more than three years of experience.

    I can only look at so much wtf-code before I need to vent about it (here) - where do you think most of my posts come from ;)



  • @snoofle said:

    Actually, that's mostly all I do, but there are almost 200 programmers here and there are only two of us who have more than three years of experience.

    What.
    The.
    Fuck.



    A code base produced by 200 neophyte programmers? What are you working on, so I know never to interact with it. Actually, that explains the commit comments - Rookie probably doesn't even know what "threaded" means.



  • @snoofle said:

    @dbomb123 said:

     The real WTF is that you don't do code reviews.

    Actually, that's mostly all I do, but there are almost 200 programmers here and there are only two of us who have more than three years of experience.

    I can only look at so much wtf-code before I need to vent about it (here) - where do you think most of my posts come from ;)

     

    I always thought, based on what you write, that they mostly come from issues excavated during firefighting sessions.

     



  • TRWTF is pushing code with debugs... Oh wait, TRWTF is Subversion.



  • @snoofle said:

    Java even lets you make a map that is not inherently thread safe into a thread safe map with a wrapper (Collections.synchronizedMap(new HashMap()))

     

     

    But that is not always enough. If you have one thread modifying the HashMap while another thread iterates over the map, you will still get into trouble, even if you use Collections.synchronizedMap(new HashMap()). The iteration itself has to be synchronized by hand (see the sketch below).
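
    A minimal Java sketch of that caveat, following the pattern documented for Collections.synchronizedMap (the map contents here are made up):

      import java.util.Collections;
      import java.util.HashMap;
      import java.util.Map;

      public class SynchronizedIteration {
          public static void main(String[] args) {
              Map<Integer, String> map =
                  Collections.synchronizedMap(new HashMap<Integer, String>());
              map.put(1, "one");
              map.put(2, "two");

              // Individual calls (put/get/remove) are synchronized by the wrapper,
              // but an iteration is many calls, so it must hold the lock itself.
              synchronized (map) {
                  for (Map.Entry<Integer, String> e : map.entrySet()) {
                      System.out.println(e.getKey() + " -> " + e.getValue());
                  }
              }
              // Without the synchronized block, a concurrent put() from another
              // thread can make this loop throw ConcurrentModificationException.
          }
      }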


     



  • @Planar said:

    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.
    http://www.amazon.com/Multithreading-Applications-Win32-Complete-Threads/dp/0201442345
    I read it 15 years ago, and I still know where my copy of it is.



  • @SuperJames74 said:

    Yo, I got a thread for my thread about threads.

    Hah, memes: comedy of the scrupulously unfunny. The best part about parrot comedy is not having to know if it's funny or not before regurgitating it.



  •  @Planar said:

    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.

    That sounds more like a programmer WTF than a thread WTF. But then, isn't everything?

    @mikeTheLiar said:

    @snoofle said:
    Actually, that's mostly all I do, but there are almost 200 programmers here and there are only two of us who have more than three years of experience.

    What.
    The.
    Fuck.

    I had to re-read that also. Has anyone looked at reasons behind the fast turnover? Or does WTFCorp purposely employ inexperienced coders because they're cheaper? Or is WTFCorp used purely to gain some experience (and fluff out a CV) before it becomes nothing more than a stepping stone to better positions?



  • @Cassidy said:

    I had to re-read that also. Has anyone looked at reasons behind the fast turnover? Or does WTFCorp purposely employ inexperienced coders because they're cheaper? Or is WTFCorp used purely to gain some experience (and fluff out a CV) before it becomes nothing more than a stepping stone to better positions?

    You've read the previous stories, right? It's not like the experienced coders at this company were doing much better.

    In fact, this whole case is a perfect demonstration of "A employees hire A employees, B employees hire C employees."



  • @blakeyrat said:

    You've read the previous stories, right?
     

    Yeah... they gave me the impression WTFCorp was populated with incompetent untouchable empire-building DBAs, idiot over-committing salesmen, fuckwitted project managers and doped-up decision-makers.

    For some reason I hadn't thought too deeply about the coders.  Shoulda guessed.



  • @dhromed said:

    I can imagine what shared data might be, but I don't know what a critical section is. Are these threading terms?

    Yes.

    Shared data is stuff that multiple threads can see. If the shared stuff is read-only, there's no problem with it. But if multiple threads can write to shared data, it often happens that the order in which they access it becomes something that needs careful consideration.

    The standard example is as simple as balance = balance + payment. If balance is shared data, it's possible (and if the code runs for long enough, inevitable) that two threads will execute the code implementing that statement concurrently. Both will read the initial value of balance, both will independently add some payment to that value, and both will write it back - but the first thread to do so will have its work overwritten by the second, and one of the payments is lost (see the sketch after this post).

    To stop that kind of thing from happening, code modifying the shared data needs to be topped by code that acquires a lock, and tailed by code that releases it (a lock being something that only allows itself to be acquired by one thread at a time). This guarantees that the code between lock acquisition and lock release can only ever be run by one thread at a time, which makes it a critical section.

    Failure to identify critical sections is far and away the most common way for multi-threaded designs to go wrong. I've actually seen a colleague write this in C code for an embedded device with no filesystem:

    count++; count++; /* multiple threads, so can't use count += 2 */

    He seemed quite distressed when I proved to him with the debugger that the compiler emitted identical (non thread-safe) code for both. Apparently he'd relied on this antipattern quite extensively in prior projects elsewhere, having at some point noticed that the first C compiler he'd ever used implemented ++ on char variables as a single (and therefore non-interruptible) INC instruction for the targeted 6502 processor.

    Critical sections come with some non-obvious gotchas. If code inside a critical section attempts to acquire a second lock, then it may well be that some other thread has (a) already acquired the second lock in the process of entering some other critical section and (b) is itself attempting to enter this critical section. If that happens, both threads will wait forever: deadlock.

    Another fairly subtle thread gotcha, which can happen in multi-priority threading schemes where lower priority threads are given CPU time only when no higher-priority thread is ready to run, is priority inversion.
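
    A minimal Java sketch of two of the hazards described above: the lost-update race on balance, fixed with a lock, and the non-atomic counter increment, fixed with an atomic. The class and field names are made up:

      import java.util.concurrent.atomic.AtomicInteger;

      public class Account {
          private long balance;                        // shared data
          private final Object lock = new Object();
          private final AtomicInteger count = new AtomicInteger();

          // Unsafe: a read-modify-write with no lock; two threads can both read
          // the old balance and one of the payments is silently lost.
          public void addPaymentRacy(long payment) {
              balance = balance + payment;
          }

          // Safe: the read-modify-write is a critical section guarded by a lock.
          public void addPayment(long payment) {
              synchronized (lock) {
                  balance = balance + payment;
              }
          }

          // count++ is itself a read, an add and a write, so "count++; count++;"
          // is just as racy as "count += 2". An atomic increment really is atomic.
          public void bumpTwice() {
              count.addAndGet(2);
          }
      }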



  • @Soviut said:

    The best part about parrot comedy is not having to know if it's funny or not before regurgitating it.

    Beautiful plumage!


  • @flabdablet said:

    Shared data is stuff that multiple threads can see. If the shared stuff is read-only, there's no problem with it. But if multiple threads can write to shared data, it often happens that the order in which they access it becomes something that needs careful consideration.
    You're slightly off. Shared data is tricky when it can be seen from multiple threads, even when only one of those threads does modifications. It used to be that only concurrent modifications were tricky, but modern memory management hardware doesn't make that promise any more unless you insert a memory barrier, which is part of what a critical section does (there's a small sketch of this visibility problem after this post). This is a leaky abstraction — when dealing with threads, you can't pretend that what the hardware really does is unimportant — and is a large part of why doing threading right requires an exceptionally careful approach. That's why threads are HARD for most programmers.

    An example of why threads can be horribly non-obvious was the Python GIL, which is one of the few things I've actually said "WTF" out loud about in my office. Everything is compounded by the fact that the people who wrote it absolutely should have known better. Multi-process programming is much easier, because there you're (almost always) restricted to passing messages back and forth, which tames most of the thread craziness.

    @Ben L. said:

    A programmer had a problem. He thought to himself, "I know, I'll solve it with threads!". has Now problems. two he
    This is painfully true, and hardly anyone wants to admit it. (Except two of those words ought to be written on top of each other for the true awfulness of threading to be clear.)
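
    A minimal Java illustration of the visibility point above: even with a single writer, the reader may never see the update unless something (volatile here, or a lock) provides the memory barrier. The class and field names are made up:

      public class StopFlag {
          // Without volatile, the worker thread is allowed to keep reading a stale
          // cached value of 'running' and spin forever, even though only one
          // thread ever writes to the field.
          private volatile boolean running = true;

          public void startWorker() {
              Thread worker = new Thread(() -> {
                  while (running) {
                      // do work
                  }
              });
              worker.start();
          }

          public void stop() {
              running = false;   // visible to the worker because the field is volatile
          }
      }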



  • It is the year 2013 and CPU speeds have not significantly increased in almost a decade. Whether threads are difficult or not is irrelevant. If we want our programs to perform better than they did ten years ago we need programmers with competency in this "new" technology. The 'C' language was quite a bit harder than FORTRAN. Move on.



  • @Rick said:

    It is the year 2013 and CPU speeds have not significantly increased in almost a decade.
     

    There's a Pentium III from 2002 clocking at 1.4GHz. My current one is an AMD 3.6GHz.

    So, by "not significant" you mean "more than doubled" or...?

    @Rick said:

    Whether threads are difficult or not is irrelevant. If we want our programs to perform better than they did ten years ago we need programmers with competency in this "new" technology.

    That I can agree with!



  • @flabdablet said:

    Yes.
     

    I am now slightly smarter than before.



  • @dhromed said:

    @Rick said:

    It is the year 2013 and CPU speeds have not significantly increased in almost a decade.
     

    There's a Pentium III from 2002 clocking at 1.4GHz. My current one is an AMD 3.6GHz.

    So, by "not significant" you mean "more than doubled" or...?

     

    In 2003, we saw the first 3 GHz CPUs.  So by "not significant," he means "not significant."

     



  • @Rick said:

    It is the year 2013 and CPU speeds have not significantly increased in almost a decade.

    You realize hertz isn't the only measure of CPU speed right? My laptop's Core i5 1.9 ghz is a fuckload faster than the Pentium 4 you had a decade ago, even if you restrict it to a single core, and even though the P4 runs at 3.05 ghz.

    As a Mac user, we had that lesson hammered in by the PPC advertising which attempted to convince consumers (mostly unsuccessfully) that a PPC G5 at 1.5 ghz was faster than a Pentium 4 at 3 ghz. It was true but sadly most people just didn't get it. Like you apparently. Although being put in the Xbox 360 helped I guess...



  • @Mason Wheeler said:

    In 2003, we saw the first 3 GHz CPUs. 
     

    Which one do you mean? The first Celeron to break 3GHz was from late 2004, and CPUs like the Celeron really don't count here because they were ssssssssslllllllllllooooowwwwwwwww. If we're discussing a general statement like "CPU speeds" and need a quick measure, we should only count CPUs whose clock speed actually means something, i.e. the likes of PPC, Pentium, Core, Athlon and Phenom.



  • @blakeyrat said:

    @Rick said:
    It is the year 2013 and CPU speeds have not significantly increased in almost a decade.

    You realize hertz isn't the only measure of CPU speed right? My laptop's Core i5 1.9 ghz is a fuckload faster than the Pentium 4 you had a decade ago, even if you restrict it to a single core, and even though the P4 runs at 3.05 ghz.

    As a Mac user, we had that lesson hammered in by the PPC advertising which attempted to convince consumers (mostly unsuccessfully) that a PPC G5 at 1.5 ghz was faster than a Pentium 4 at 3 ghz. It was true but sadly most people just didn't get it. Like you apparently. Although being put in the Xbox 360 helped I guess...

    In 1994, Pentiums ran at 100 Mhz. And less than 10 years later they ran 30 times faster. Are you saying that 2013 individual CPU cores run anywhere near 30 times faster than 2003 CPU cores? I think 'people like me' do understand this math.


  • @Rick said:

    In 1994, Pentiums ran at 100 Mhz. And less than 10 years later they ran 30 times faster. Are you saying that 2013 individual CPU cores run anywhere near 30 times faster than 2003 CPU cores? I think 'people like me' do understand this math.

    Hertz is just the speed of the clock cycle. Modern chips can do more than one operation per cycle-- for example those G5s were doing up to (IIRC) 4 floating-point and 2 fixed-point operations in every CPU cycle. So if your code was 100% optimized, and you compared it to a (say) 68040 of the same clock speed, the G5 would be up to 6 times faster-- EVEN AT THE SAME HERTZ. (Actually IIRC, even the later 68k series had multiple parallel execution units... this is hardly breaking news.)

    Now of course things aren't that simple because often a lot of those execution units go empty. (Although the nice thing about the PPC series is that the floating-point execution units could also double as fixed-point, since you used fixed-point more often.) But the point is, clock cycles are a terrible way of measuring CPU performance if that's all you look at.

    On a more practical level, do you actually think that when Intel moved from the hyperthreaded Pentium IV at 3.06 ghz to the dual-core Core 2 Duo at, what, 1.8 ghz, they were actually releasing a newer chip with *less* performance than the chip it obsoleted? Do you think that is a sane thing for a chip maker to do? Think about it. Engage the rusty gears in your brain, douse them in oil poured into your ear canal, and let them chug around for a few minutes. Maybe... maybe what I'm saying is *fucking blatantly obvious to anybody even slightly conscious of CPU development in the last decade?* Maybe?


  • @blakeyrat said:

    On a more practical level, do you actually think that when Intel moved from the hyperthreaded Pentium IV at 3.06 ghz to the dual-core Core 2 Duo at, what, 1.8 ghz, they were actually releasing a newer chip with less performance than the chip it obsoleted? Do you think that is a sane thing for a chip maker to do? Think about it. Engage the rusty gears in your brain, douse them in oil poured into your ear canal, and let them chug around for a few minutes. Maybe... maybe what I'm saying is fucking blatantly obvious to anybody even slightly conscious of CPU development in the last decade? Maybe?
    There are quite a few things that chip makers optimize for, and speed is only one of them. Limiting power input is another; yes, for mobile computing, but also for servers where keeping the power down lets you pack more processing power in per rack. A third thing that is optimized for is the amount of electrical noise generated by all those switching transistors on the chip; I vaguely remember hearing about a decade ago that dropping the noise ceiling by 5dB was worth millions. (Or, well, I think it was something like that. I've not worked in the field since 2002.) I think there's a problem somewhere a bit above 3GHz where some of the engineering problems become rapidly much harder.

    It is, however, fascinating that computers have definitely become substantially faster over the past 10 years. Their headline clock frequency might not have changed much, but the programs you run on them now go much faster. That's definitely true even for single-threaded code; I've measured it (and no, the test wasn't I/O-bound or paged out).



  • @dkf said:

    Their headline clock frequency might not have changed much, but the programs you run on them now go much faster. That's definitely true even for single-threaded code; I've measured it (and no, the test wasn't I/O-bound or paged out).


    That's because processors are executing instructions before the program hits them. Something like a − b + c × d compiles to "Load a. Load b. Subtract b from a. Load c. Load d. Multiply c by d. Add c' to a'." A modern processor will look ahead and realize all four loads can be done at once, and that the subtraction and multiplication don't depend on each other. That takes 7 sequential instructions and turns them into 3 sets of instructions that can be executed in less than half the time. The processor isn't faster - although the programs will finish their work more quickly - it just uses more of its capabilities at the same time.



  • @snoofle said:

      public class MyClass extends Thread {
        public void run() {
           // do unsynchronized stuff in thread
           synchronized (MyClass.class) {
             // do stuff in critical section
           }
           // back to unsynchronized code
        }
      }

    How hard is it to do that and choose the right data structures?

    Of course, when introducing people to the "synchronized" keyword in Java, one common result seems to be that the application returns to completely serial execution because everything gets synchronized. On the bright side, this does eliminate the race conditions in the program... *sigh*



  • @edgsousa said:

    @Planar said:
    TRWTF is threads. If you have any programmer on your team with less than 20 years of experience, you have no hope of making a multi-threaded program work correctly.
    Funny, I get exactly the reverse... our experienced devs with 20+ years don't understand threading...
    Um, those two statements are not mutually exclusive, which might explain why you don't understand threads. ;)

    Threads are not very difficult, but I find very little use for them. Some time ago, I had to write a daemon that regularly checks a database and sends different kinds of requests to middleware as SOAP requests. Those go into a thread pool, for which Java has some nifty classes. Create a thread pool of a certain size, chuck all tasks into it, and wait for them to finish. The tasks are self-contained in terms of data, so no problems there (see the sketch after this post).

    But if you write web services, or applications that run in active/active mode, in a sense you get the same issues as with threading, except that the locking takes place in, for example, a database rather than via a Java construct.

    We have had the issue that one quite important application couldn't keep up because it ran single-threaded. Luckily, though, the developer had provided a configurable setting to increase the number of threads. What we didn't know, though, is that he also liked to use static structures. In Java, that means they're part of the class, not of the object instance, and therefore shared between all instances of that class. Not a lot of fun when that happens to live data.
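
    A minimal sketch of the thread-pool pattern described in this post, using the standard java.util.concurrent classes; the pool size, request count and sendSoapRequest placeholder are made up:

      import java.util.ArrayList;
      import java.util.List;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.Future;

      public class RequestDaemon {
          public static void main(String[] args) throws Exception {
              // Create a thread pool of a certain size...
              ExecutorService pool = Executors.newFixedThreadPool(10);

              // ...chuck all the tasks into it...
              List<Future<?>> results = new ArrayList<Future<?>>();
              for (int i = 0; i < 100; i++) {
                  final int requestId = i;
                  results.add(pool.submit(() -> sendSoapRequest(requestId)));
              }

              // ...and wait for them to finish.
              for (Future<?> f : results) {
                  f.get();
              }
              pool.shutdown();
          }

          // Placeholder for the real middleware call; each task owns its own data.
          static void sendSoapRequest(int requestId) {
              System.out.println("sending request " + requestId);
          }
      }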

     



  • @mikeTheLiar said:

    But threading is hard!
    Writing threaded code isn't hard. Debugging it is.



  • @dkf said:

    @flabdablet said:
    Shared data is stuff that multiple threads can see. If the shared stuff is read-only, there's no problem with it. But if multiple threads can write to shared data, it often happens that the order in which they access it becomes something that needs careful consideration.
    You're slightly off. Shared data is tricky when it can be seen from multiple threads, even when only one of those threads does modifications.

    If some thread's modifying the shared data, it's clearly not read-only.

    But I take the point that I should not have said "multiple threads".

