X * f(x) semantics



  • @gwowen said:

    @gwowen said:
    sizeof is a compile-time operator, and has nothing to do with the runtime values of the variables or types whose size it yields.

    ... but everything to do with their types. So if sizeof(x) != sizeof(an address) then x is not an address. It is an array type. The value of an array type is not its address.

    In C, the definition
    [code]int x[2] = {1,2};[/code]
    defines an object of type array-of-int with value {1,2}.
    It does not declare a pointer, it does not declare an address.

    It declares an array object (whose token decomposes to a pointer-to-its-first-element in certain contexts, in a puff of bad design worthy of PHP).

    You're quite right inasmuch as that definition does define the object you describe. It does not, however, associate the value {1,2} with the name x. It associates the address of the value 1 with the name x, and that address is of type constant pointer to int.

    At runtime, C code has no way to refer to any property of any named array other than the address of its first element, which is what the name always resolves to. The value of a C array variable is the address of the allocated array's first element; the array itself always remains anonymous. It's not even an object, as the compiler makes no attempt to stop you indexing beyond its bounds. If you want object-like entities in C, use structs. Bury arrays inside them if you like.

    Again, you might not like that design decision (clearly you don't) but it is what it is, it's in the spec, it is applied consistently, and the fact that it makes certain cases of call-by-value look a bit like call-by-reference if you squint and look at them in a dim light does not alter that.



  • @HardwareGeek said:

    I have on occasion spent weeks debugging such things

    As a result of that experience, I would expect you to be fairly reliable when it comes to writing multi-threaded code correctly.



  • @ben_lubar said:

    in C, is undefined behavior allowed to act at compile time or only when the program is run?

    As far as I know, the nasal demons are never invoked until you attempt to execute the compiler's output, assuming of course that the compiler itself does not rely on UB internally.



  • @gleemonk said:

    i=1; i = i++ * i++;

    @gleemonk said:

    unspecified

    TIL i++ isn't necessarily atomic in C but can be distributed over adjacent other operations.

    (I thought - and am still inclined to think - it is equivalent to
    [code]
    push i
    incr i
    [/code]
    )

    I would have expected ambiguity explained with i = 1; i = i++ * i--;, i = 1; i = i++ * ++i; or similar.



  • @flabdablet said:

    As a result of that experience, I would expect you to be fairly reliable when it comes to writing multi-threaded code correctly.

    Hmm, I know enough to know there's a lot I don't know, but not enough to know how much I don't know.

    HDLs are inherently massively multi-threaded; every block is a thread. But it's all implicit; all the scheduling and synchronization is handled by the runtime. EDIT: That's not entirely true; there are explicit fork-join blocks, but the need to use them is fairly uncommon.

    Other than that, my experience with multi-threaded code is extremely limited. One time in a Java personal project, I needed to do something — I don't remember what — that couldn't be done from the GUI thread, so I had to explicitly create a thread to do it. Another time, an interviewer asked me to write some multi-threaded code. "I don't have any experience with that." "Would you like to give it a try? I'll coach you through it." I think I did ok, given the low initial expectations, but I didn't get the job for other reasons.


  • Discourse touched me in a no-no place

    @HardwareGeek said:

    Hmm, I know enough to know there's a lot I don't know, but not enough to know how much I don't know.

    Hah! Yes, you'd do just fine with a little bit of reading up on what the various patterns for working with things are. You know not to trust the machine and so not to take shortcuts.


  • Discourse touched me in a no-no place

    @ben_lubar said:

    I want to have a compiler that always does the wrong thing when it can.

    Assume that mathematical identities hold when doing arithmetic. That's a classic. Also make it so that integers are truncated when they overflow or underflow. Either of those will fill the compiler's output with subtle landmines for anyone trying to work with large amounts of data…



  • @PWolff said:

    TIL i++ isn't necessarily atomic in C

    An awful lot of broken multi-threaded code has rested on the assumption that i++; or ++i; or even i += 1; generate code that's in some way different from that generated by i = i + 1;.

    But even in single-threaded code, order of evaluation between sequence points has always been unspecified in the C standard - quite deliberately so, giving the compiler as much freedom as possible to generate performant code - and there have never been sequence points defined around evaluation of auto-incrementing terms within expressions.

    Also worth bearing in mind that in C, assignment is an expression operator, not a statement class: i += ++i has undefined behavior too.



  • @PWolff said:

    (I thought - and am still inclined to think - it is equivalent to

    push i
    incr i

    )

    ++i is just shorthand for (i += 1).

    It's tempting to think of i++ as shorthand for (i_ = i, i += 1, i_) where i_ is a temporary anonymous variable, but it really isn't; the expanded form includes sequence points that don't exist for i++. The only thing you can guarantee about postfix increment or decrement operators is that their effect will be applied after the associated value is read and before the next sequence point.

    If your push and incr above are intended to represent generated code, you need to be aware that in-place increment and decrement instructions are typically atomic only for values held in machine registers, which you can't in general guarantee that any C variable will be. There are some architectures that implement bus locking around in-memory read/modify/write instructions (I seem to recall that the 68000 family does this). There are other architectures where such locking is available explicitly.

    C compilers are not obliged to use any of that to implement ++ and -- operators. If you need an atomic increment or decrement for a variable shared across threads, you need to use an appropriate library function. As a bonus, function call semantics (even if actually implemented using a single inline instruction) put a sequence point after function return.

    Edit: This is good.



  • std::atomic<T*> for some type T

    How legal is that actually, though? I mean, obviously it's legal to pass the pointer itself, but if your goal here is to pass the data that is being pointed to across threads, how do you ensure that the receiver is going to be getting valid data when they try to dereference it?



  • If any thread contains code that does naive read/modify/write operations on the shared item, there's no guarantee at all.

    If every thread uses the atomic locked primitives for that, correct update behavior is guaranteed - by the hardware if possible, or by surrounding critical sections with software mutexes if not.

    Edit: this is good.


  • Java Dev

    @flabdablet said:

    It associates the address of the value 1 with the name x, and that address is of type constant pointer to int.

    This might be what the compiled code ends up doing, but it's still false, as evidenced by the behaviour of the sizeof and & operators in the scope where the variable is declared.



    So every member, and every member's member, of T would need to be a std::atomic? Fair enough, though I still think it's irresponsible for an article so boldly proclaiming that “You Can Do Any Kind of Atomic Read-Modify-Write Operation” not to mention that caveat. I mean, that article is obviously not targeted at people who would already know about subtle gotchas like that, and std::atomic&lt;struct{ int x; int y; }&gt; is clearly shown as working. So why then suggest passing a pointer atomically without pointing out that there's no guarantee that you'll be able to dereference it on the other side?

    Like, I'm picturing a lock-free queue. The producer never writes to an object after placing it in the queue, and the consumer never reads from an object before retrieving it. So someone might think it's safe to pass those messages by reference, because there appears to be a transfer of ownership. But what are the odds that the next message is going to be created on the same cache line as the current one, and therefore not read correctly by the receiving thread at all if you're not careful?

    In fact, forget about cache invalidation: even if you are accessing each individual field atomically, the order of the individual atomic operations isn't defined, is it? So the object can be placed into the queue before it has been created. Lock free programming is snake oil.



  • Ah, OK. You've gone a step beyond.

    Here's the contentious section:

    This brings up an interesting point: In general, the C++11 standard does not guarantee that atomic operations will be lock-free. There are simply too many CPU architectures to support and too many ways to specialize the std::atomic<> template. You need to check with your compiler to make absolutely sure. In practice, though, it’s pretty safe to assume that atomic operations are lock-free when all of the following conditions are true:
    1. The compiler is a recent version of MSVC, GCC or Clang.
    2. The target processor is x86, x64 or ARMv7 (and possibly others).
    3. The atomic type is std::atomic<uint32_t>, std::atomic<uint64_t> or std::atomic<T*> for some type T.

    All the author is really saying here is that you're unlikely to see lock-free (i.e. hardware) support for atomicity if the types you're trying to make atomic are bigger than 64 bits.

    I didn't personally read that as encouragement to go forth and conquer using lock-free pointer updating, but then I might not be in the target audience you're assuming.

    @Buddy said:

    Lock free programming is snake oil.

    Snake oil is something with purported but no actual benefits. Lock-free programming has measurable and significant performance benefits especially given high thread counts.

    I will agree that lock free programming probably does count as a nest of vipers, though. Very easy to get bitten.



  • @PWolff said:

    TIL i++ isn't necessarily atomic in C but can be distributed over adjacent other operations.

    i++ isn't atomic in anything, is it? Pretty sure it always represents at least 2 atomic operations.

    It also can't possibly be atomic in any language with operator overloading, like C++ for example, for reasons that would be obvious if you spent a few milliseconds thinking about it.



    You're right - my experience with C is not very deep, and is from very long ago. I'm not sure whether I remember having heard about this or not. But I remember there were many more caveats than built-in functions.


  • Discourse touched me in a no-no place

    @flabdablet said:

    C compilers are not obliged to use any of that to implement ++ and -- operators.

    @blakeyrat said:

    i++ isn't atomic in anything, is it?

    Precisely correct. At the low level, you have to do load, modify and store (with pre- and post-increment being distinguished by which value you keep in the result register/on the stack afterwards). While with a single-core processor such sequences would usually run just fine without anything going wrong because the load makes sure that the store doesn't trip a minor memory fault, all that goes straight out of the window as soon as you have a multi-core system; if you didn't lock the bus, the other cores really can shit all over your memory in unexpected ways. If you're lucky, the hardware will notice and patch up most of the mess for you. But hardware that can do that (especially if the cores are located on different pieces of silicon) is really expensive, so what actually happens is that the shit hits the fan in ways that are hard to explain ahead of time. (I think you can usually explain it by thinking about cache lines and so on, but I could be way wrong.)

    In multi-threaded programming, it's safest to assume that the computer hates you and will maliciously choose the most evil way it can to obey your instructions while not doing what you want. Use the threading library correctly and don't try to get “clever”: you'll probably just make a weird bug that will take you forever to hunt down.


  • Discourse touched me in a no-no place

    @flabdablet said:

    Lock-free programming has measurable and significant performance benefits especially given high thread counts.

    Fully lock-free programming is hard, but you can do a lot with partitioned resource spaces so that you usually only need a lock when messaging other threads (and then you use a queue with copying so that you don't need to use fancy locks). With that, you can build something that looks like CSP and that's a pretty tractable parallelism model.

    But it is pretty coarse-grained in most implementations, and doesn't let you do some of the sophisticated things that have been developed with parallel algorithms for mathematical operations. Those are useful too.



  • OK, I have a challenge for you all. We can tell immediately that the code that follows is UB:
    [code]
    unsigned char *bp = /* parameter of a function */;
    unsigned n = /* another parameter */;

    unsigned char v[16] = { 0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE, 0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF };

    while( n-- != 0 )
        *bp++ = v[(*bp) >> 4] | (v[*bp & 0x0F] << 4);
    [/code]
    But when I encountered it, every compiler in the company I worked for at the time did the same thing, and the code had survived because that one thing matched the naive UB-ignoring interpretation: calculate the RHS, store the value, increment the pointer. Because they all did the naive interpretation and that was what everyone wanted it to do, nobody noticed that it was UB.

    When I say every compiler, of course, I mean every compiler except one, that is. And that one was the one my group was using for their real hardware, using gcc2.5.8 or so to cross-compile to a MIPS R4600.

    That compiler did what it was allowed to do, and produced a completely wacked-out interpretation: note the current value of bp, increment bp, evaluate the RHS against the modified value of bp, and then store the value through the version of bp that we noted at the beginning.

    For ... reasons ... that code was called twice against the same memory, thus shifting the data down by two bytes and causing symptoms that resembled something else that I was expecting to see. It took three days of inserting printf()s in various places before I put one between the two calls and saw the bit-reversed version one byte down from where it should have been.

    That's why I have an intense dislike of UB, and a spitting contempt for people who rely on what some compiler does when presented with UB-bearing code on the grounds that it works for them.

    EDIT: no guarantee that the values in the v array are correct, nor that they are sufficient to do the job.



  • @flabdablet said:

    @gleemonk said:
    The expectation that reality takes a certain path and no other is very deeply set with programmers.

    In my experience, this is more true of programmers who have not ever really got their hands dirty with hardware. Watching a machine operation fail because a D flip-flop entered a metastable state because an input edge just happened to violate a setup or hold time wrt its clock edge is a powerful reminder that digital logic is, ultimately, just another leaky abstraction.

    Oh yes, the abstraction can fail. But in this case it was people expecting a specific outcome they could predict from their basic knowledge of C, while I tried to tell them that the abstraction does not cover this case. They couldn't accept that, inventing ad-hoc rationalizations instead. Funnily enough, they couldn't even settle on a common expectation of what should happen, yet they insisted that there was one true solution.


  • Discourse touched me in a no-no place

    @PleegWat said:

    the behaviour of the sizeof and & operators

    Both sizeof and & are best thought of as compile-time operators.



  • @dkf said:

    @PleegWat said:
    the behaviour of the sizeof and & operators

    Both sizeof and & are best thought of as compile-time operators.


    sizeof is a compile-time operator. It can only be evaluated at compile time.

    Monadic &, because of the many and varied ways it can be (ab)used, must be a run-time operator except in certain very, very specific cases.

    Dyadic & is inevitably a run-time operator.

    Well, except when the compiler can evaluate the subexpression using only compile-time information, and maybe not even then. I once, many moons ago, used a C compiler (for a 68HC11 target) that would not compile-time evaluate a constant-value expression if there were any floating point values in it.


  • Java Dev

    Yes? At the code level you're looking at what you feed into the compiler, not at the compiled code that ends up running. int a[2] and int * a are different variable declarations, which act differently particularly towards the sizeof and unary & operators, even though they are likely compiled into the same thing.

    The fact they actually are the same thing in parameter declarations just further confuses the matter but is not what I'm talking about.


  • Discourse touched me in a no-no place

    @PleegWat said:

    At the code level you're looking at what you feed into the compiler, not at the compiled code that ends up running.

    There's the operations you do using the type, there's the operations you do using the data itself, and there's the operations you do using a pointer to the data. C keeps all type information strictly compile-time; at run-time, you're going to be working with machine words and possibly bytes (depends on the architecture).

    C really is just a higher-level form of assembler. There's lots of things it just doesn't give you; if you want them, you need to write them yourself (or use a different language that does more, of course).



  • @flabdablet said:

    Lock-free programming has measurable and significant performance benefits

    Sure, but only for 64-bit values. So “You Can Atomically Modify Absolutely Anything” is a misleading title.

    And given that pointers have no meaning except in the thread they were created, what was the point of even mentioning that they can be shared across threads efficiently? The compiler can turn a no-op into a single instruction, and then what?



  • @Buddy said:

    And given that pointers have no meaning except in the thread they were created

    Your right to write code has been revoked. You've demonstrated clear and flagrant incompetence. Turn in your badge.

    Pointers have meaning everywhere their address space reaches. Always across threads in a process, sometimes even across multiple processes (e.g. if you never exec() and you're not on Windows or an OS that uses CPU task gates). It's one of the first things you learn about pointers in any CS class that teaches about them, and certainly by Intro to Operating Systems.



  • @Buddy said:

    pointers have no meaning except in the thread they were created,

    Care to expand on that? Is this a point about cache consistency or what?



  • @Buddy said:

    Sure, but only for 64-bit values. So “You Can Atomically Modify Absolutely Anything” is a misleading title.

    Additionally, the technique that they use (the CAS-loop) is pretty much a spinlock, so calling it lock-free/wait-free is a bit ... suspect.



    Basically. The point is that you're here worrying about torn reads from individual memory cells, while any use of a shared pointer is necessarily non-atomic, because you've got to load the pointer and then dereference it. And that's just pointer to int, where you have to wonder why anyone would bother with reference semantics to share a single word. More likely the pointer refers to an entire region of memory, and that's where the problem arises, because the spec neither guarantees that the sending thread will be done with the object before its reference arrives in the shared location, nor that the receiving thread won't access it before then. What's the point of ensuring that each individual access is consistent when the entire memory region is not in a consistent state to begin with?

    Bottom line is: if you want to access an object from multiple threads, you're going to have to synchronize them.


  • Discourse touched me in a no-no place

    @Buddy said:

    Bottom line is: if you want to access an object from multiple threads, you're going to have to synchronize them.

    The problem is if you want to write to the object. Reading is trivial. Only with writing do you get trouble (though “writing” covers many higher-level concepts). The area has been extensively studied in the concurrency literature.



  • You could probably also avoid trouble by only writing to the shared object and never reading from it :P



  • @Buddy said:

    the spec neither guarantees that the sending thread will be done with the object before its reference arrives in the shared location, nor that the receiving thread won't access it before then.

    You can guarantee that, though, with correct coding. Any processor for which this is an issue is going to provide memory fence instructions.

    Edit: and there's high-level support as well.



  • @ben_lubar said:

    I want to have a compiler that always does the wrong thing when it can.

    Implement the DeathStation 9000 compiler? That shouldn't be too hard for you: you've implemented BIT and Cool compilers too ;)



  • @TwelveBaud said:

    Pointers have meaning everywhere their address space reaches. Always across threads in a process, sometimes even across multiple processes (e.g. if you never exec() and you're not on Windows or an OS that uses CPU task gates).

    Whaaa?

    Even in OSes with address space layout randomization? I thought preventing that was the entire point of implementing ASLR.



    ASLR means all your relocatable libraries get relocated randomly when they're loaded (meaning you can't use their preferred base address in a 'sploit and have to go hunting instead), but they're only loaded into a process once, and on non-Windows OSes child processes inherit their parent's address space (and thus library layout) until they exec(). When running a different program (thus calling exec()), or always on Windows, no shared memory or loaded libraries are preserved across the parent/child boundary, and ASLR can kick in.



  • I guess I skipped past the word "sometimes" in your first explanation.

    But whatever. If you're dealing with pointers, you're using a shitty low-level language that sucks and I hate.



  • @dkf said:

    C really is just a higher-level form of assembler.

    Except that the order in which ops are executed can be shifted around at the compiler's whims.

    @dkf said:

    Reading is trivial.

    I thought reading wasn't trivial if you couldn't be sure the object wasn't written to at the same time. Otherwise you might get inconsistent data. (Or did you mean a read-only object, or rather an object that is read-only for all threads?)


  • Discourse touched me in a no-no place

    @PWolff said:

    Except where the order ops are executed can be shifted around at the compiler's whims.

    Assembler doesn't really have expressions in any notion where that makes sense in the first place, so it's not something I'd fairly label as an “except”…


  • Discourse touched me in a no-no place

    @PWolff said:

    I thought reading wasn't trivial if you couldn't be sure the object wasn't written to at the same time. Otherwise you might get inconsistent data. (Or did you mean a read-only object, or rather an object that is read-only for all threads?)

    If you can guarantee that nobody's writing to it (often pretty easy in practice, whether through the memory location being entirely read-only, or by there being some sort of natural sequence point between the data being written and being read, such as thread-creation or thread-termination-and-join) then you can share it between threads trivially. It's only with writes that stuff gets complicated, and that's mainly because you don't want inconsistent states to be read.

    Caches make all this much more tricky. Especially when you go to using off-die coherency because you've got more than one CPU on the board. (Getting things right for multi-core CPUs is much simpler.) That is one of the main areas where supercomputers really are better than desktop machines.



  • @PWolff said:

    I thought reading wasn't trivial if you couldn't be sure the object wasn't written to at the same time.

    If you have some object that a bunch of different threads have access to, and all they ever do is read it, then reading is trivial in the sense that you don't need to do anything clever to prevent torn reads. That is, if the shared object is not ever going to change, tear the reads up all you like and everything will be fine. Config information shared across threads is an example of this kind of use case.

    If you have a bunch of threads sharing an object and at least one can write to it, that's when the vipers can start to bite. For objects smaller than 64 bits, you generally get direct hardware support for avoiding torn reads and writes: for example, on x86 you'd use LOCK CMPXCHG or LOCK CMPXCHG8B for both. For reads, x86 doesn't allow the use of the LOCK prefix with a straight MOV instruction, so the usual idiom is to use a degenerate compare-and-swap: make sure the comparison and destination register values are equal, execute LOCK CMPXCHG, then ignore the resulting condition code.

    For bigger objects, you need a correctly written access protocol in the software. This is conventionally done using mutex or spinlock concurrency primitives, both of which need to be built on top of exactly the kind of hardware-assisted small-object sharing outlined above. Of all the error-prone ways to do object sharing across threads, this is the least error-prone because most of the vipers are safely caged inside the implementation of the concurrency primitives. But those primitives are certainly not the only thing that can be built on top of hardware-assisted small-object sharing.

    Buddy mentioned one pattern above: build some enormous object in one thread, and then once it's fully built, atomically update a shared pointer so that other threads can start reading it. This can be made to work, but as he says it has plenty of potential gotchas. You need to use a fence between building the object and updating the shared pointer, to make sure that the shared pointer's readers will indeed be looking at valid data once they start to dereference the pointer, but that's not enough; you also need some way to prevent and/or detect reads torn between old and new versions of the shared object, and you need to make sure that the old version can never get deallocated until after all its potential consumers have stopped trying to read it.

    It's all quite hairy, and deliberately creating as many race conditions as this means that test driven debugging is just not going to work: you have to prove correctness, and that's developer-time expensive. But sometimes the potential performance gains compared to mutex-based approaches do make up for that.



  • @dkf said:

    Assembler doesn't really have expressions in any notion where that makes sense in the first place, so it's not something I'd fairly label as an “except”…

    Then again, modern CPUs might reorder your stuff too, so you still need memory fences in assembler (depending on your platform; x86 is kind of nice here, where it doesn't do too much weird stuff behind the scenes).



  • @blakeyrat said:

    I thought preventing that was the entire point of implementing ASLR.

    As I understand it, the point of ASLR is to give each instance of a process a unique load-time arrangement of library modules, which makes the library entry points unpredictable without direct involvement in the module loading process. That means it's no longer possible to write buffer-overflow exploit code that relies on being able to find CreateFile() at location 0x00127344 when exploiting reader.exe. Doesn't mean the sploit won't find something if it executes a function via a pointer to that location, just that it won't in general find what it was expecting.



  • @blakeyrat said:

    If you're dealing with pointers, you're using a shitty low-level language that sucks and I hate.

    ...probably to implement a library for one of those shitty high-level languages that sucks and I hate :-)



  • @cvi said:

    modern CPUs might reorder your stuff too

    Yeah, even explicit instruction-by-instruction coding is more of a guideline in this day and age.



  • @flabdablet said:

    Any processor for which this is an issue is going to provide memory fence instructions

    And those processors can implement synchronized blocks efficiently.

    @flabdablet said:

    high-level support

    That's not high-level, that's spaghetti-level.



  • @Buddy said:

    those processors can implement synchronized blocks efficiently.

    The thing about synchronized blocks, though, is that by their very nature only one thread can be executing one at any given time, which makes them a bottleneck once the thread counts get large. To avoid that bottleneck causing performance issues, you have the choice of putting only very small amounts of work inside each such block and/or redesigning your concurrency model to make the synchronization more fine-grained.

    Lock-free design is in fact still all about the synchronized blocks; it's just that those blocks have been ground fine enough to be implemented in hardware.

    Edit: This is still good



  • @dkf said:

    Caches make all this much more tricky. Especially when you go to using off-die coherency because you've got more than one CPU on the board.

    Hey, put a trigger warning or something on that.

    I had to deal with that stuff when I worked at Intel. Multiple CPUs, plus peripherals bus-mastering data into memory, from PCIe and legacy peripheral buses, each of which mapped its view of the address space to the memory controller's view of the address space differently. Of course, testing the memory controller chip focused on breaking cache coherency, with everybody and their dogs reading and writing overlapping address ranges, and the tests succeeded very well. (Presumably, they were passing by the time the chip was released; I was no longer there by then.) The only diagnostic was that some read returned data that didn't match what the test expected.


  • Considered Harmful

    It's a pity the engineers were placed in such a position, if only someone could have made them some better diagnostics...


  • Discourse touched me in a no-no place

    @HardwareGeek said:

    I had to deal with that stuff when I worked at Intel. Multiple CPUs, plus peripherals bus-mastering data into memory, from PCIe and legacy peripheral buses, each of which mapped its view of the address space to the memory controller's view of the address space differently.

    And all for a feature that really only the supercomputer people will use. (I used to be one of them, though to be fair my focus was much more on how to consume the entire memory of a supercomputer for one single-threaded task… 😉)



  • @Gribnit said:

    if only someone could have made them some better diagnostics

    The diagnostic did say which read and which data didn't match, but the point is that gazillions of writes and reads are all beating the heck out of a given cache line, and somewhere in there the cache controller has a bug, or maybe the algorithm that predicts the correct value has a bug. And the number of people on the project who actually understand it can be counted on the fingers of one hand. That count did not include me.

