How can we increase adoption of C/C++ alternatives?



  • Okay, so we have hardware that natively executes JVM bytecode. Not surprisingly, it executes it faster than a generic CPU. How does that help everyone with decades of software compiled for x86/x64/ARM/PowerPC/etc.?

    @Buddy said:

    any part of your program could bump something else out of cache at any time

    That's going to be a problem on any system with finite cache space.


  • Grade A Premium Asshole

    @Groaner said:

    That's going to be a problem on any system with finite cache space.

    You just need more cache space.


  • kills Dumbledore

    @Groaner said:

    Okay, so we have hardware that natively executes JVM bytecode. Not surprisingly, it executes it faster than a generic CPU. How does that help everyone with decades of software compiled for x86/x64/ARM/PowerPC/etc.?

    Write an x86 JITter



  • @Jaloopa said:

    Write an x86 JITter

    Could you use that x86 JITer to run an x86 JVM that's running yet another x86 JITer?


  • Grade A Premium Asshole

    @Groaner said:

    Could you use that x86 JITer to run an x86 JVM that's running yet another x86 JITer?

    Yes, but like Inception each step down increases the perception of time 10 times.


  • FoxDev

    @Polygeekery said:

    Yes, but like Inception each step down increases the perception of time 10 times.

and with each step down, reality becomes less and less stable.

    :-D



  • I don't get the luxury of using a real TDD framework.
    So I always build a lightweight simulator for every function that I add in.



  • @tarunik said:

    raw speed doesn't matter

    That doesn't make any sense. You always want a processor that can more efficiently run the specified program, because you can clock it slower, and use less power. Reading from cache is so much more efficient than from main memory; the only reason we've never seen caches on embedded systems before is that this is the first one where the benefits could be quantified.

    @Groaner said:

    How does that help everyone with decades of software compiled for x86

    They cannot be helped.



  • Microsoft created C++/CLI (a.k.a. C++.NET) as a bridge from C++ to C#.

    That's how you do it. Create easy-to-employ bridges.

    COM failed miserably; it only made things harder to authenticate and implement, and turned into a general-life demotivational poster child.

    Want backwards compatibility and a desire to commit suicide? Introducing COM.

    JSON serialization is a good step forward, but you still have to write the interfaces.



  • @Buddy said:

    You always want a processor that can more efficiently run the specified program, because you can clock it slower, and use less power. Reading from cache is so much more efficient than from main memory; the only reason we've never seen caches on embedded systems before is that this is the first one where the benefits could be quantified.

    First: my rant wasn't speaking to incremental improvements in clock speed (say through a process shrink) here -- I'm talking about the microarchitectural gulf between a microcontroller core (say an ARM Cortex-M3) running an RTOS that's little more than a canned scheduler and an applications processor (say an ARM Cortex-A8) that, as a practical necessity, will be running embedded Linux or Windows.

    In the case of the microcontroller -- you'll have enough fast memory that you'll barely need any caching infrastructure at all because you aren't dealing with the overhead introduced by having to drag around a stripped-down desktop OS and all its trappings when you really don't need such a thing in a specialized device. However, this minimalist approach is less comfortable for today's coddled developers, and as a result, hardware gets piled on to fix what's really a combination software and developer wetware problem.

    Filed under: how much CPU power does an elevator controller really need?


  • Java Dev

    @tarunik said:

    how much CPU power does an elevator controller really need?

    Would today's youngsters believe that once upon a time they made those without any semiconductors at all?

    Will the next generation?



  • @PleegWat said:

    Would today's youngsters believe that once upon a time they made those without any semiconductors at all?

    Yep -- we're just now trying to eradicate relay interlocking from the railroad due to the PTC mandate. I'm sure there are relay elevator controllers and Ward Leonard drives hanging around old buildings still...and relay logic has its place today, especially when a fancy interface isn't needed, and the control logic requirements are rather simplistic.

    Filed under: PLCs are complicated beasts


  • ♿ (Parody)

    @tarunik said:

    fix what's really a combination software and developer wetware problem.

    At first I read that as "welfare problem", and I thought: :wtf:


  • Discourse touched me in a no-no place

    @tarunik said:

    Filed under: how much CPU power does an elevator controller really need?

    Going by just how crappy some elevator systems are at servicing all floors in hotels I've stayed in in the last year, more than they're currently using…



  • @tarunik said:

    First: my rant wasn't speaking to incremental improvements in clock speed (say through a process shrink) here -- I'm talking about the microarchitectural gulf between a microcontroller core (say an ARM Cortex-M3) running an RTOS that's little more than a canned scheduler and an applications processor (say an ARM Cortex-A8) that, as a practical necessity, will be running embedded Linux or Windows.

    JOP is the former. I do believe that with enough work JOP could make a better applications processor than any currently available, but that's pure speculation.

    In the case of the microcontroller -- you'll have enough fast memory that you'll barely need *any caching infrastructure at all* because you aren't dealing with the overhead introduced by having to drag around a stripped-down desktop OS and all its trappings when you really don't need such a thing in a specialized device. However, this minimalist approach is less comfortable for today's coddled developers, and as a result, hardware gets piled on to fix what's really a combination software and developer wetware problem.

    Ok, but that's not the way I've ever heard it: what I've been told is that caches have never been useful on embedded systems because they were too unpredictable, and to make up for the lack of a cache, embedded designers have had to use smaller, lower latency memories.



  • I'd love to make a processor that could directly execute CIL bytecode, but I think it might be a little out of my league.


  • Discourse touched me in a no-no place

    @Buddy said:

    Ok, but that's not the way I've ever heard it: what I've been told is that caches have never been useful on embedded systems because they were too unpredictable, and to make up for the lack of a cache, embedded designers have had to use smaller, lower latency memories.

    If your processor is slow enough, you don't need a cache because you can fetch stuff from memory fast enough anyway, especially if you're putting the memory on the same chip.



  • @Groaner said:

    Are any of these alternatives compatible with the vast amount of libraries that are written in C/C++? If you give people some way to interop with them without having to write a DLL wrapper with C functions, it reduces the incentive. Of course, that would require C++ to offer a stable ABI, which while an awesome goal, will likely happen when the pits of Hell open to spit out flying frozen pigs.

    Go has cgo, where you can basically import "C" and do shit with it.

    // #include <stdio.h>
    // #include <stdlib.h>
    import "C"
    import "unsafe"
    
    func Print(s string) {
        cs := C.CString(s)
        defer C.free(unsafe.Pointer(cs))
        C.fputs(cs, (*C.FILE)(C.stdout))
    }
    

    @dkf said:

    That's one of the main reasons people still prefer plain C; it does have a stable ABI. (Well, it's not too hard to use it to make one.)

    Meanwhile, I just saw some C++ people try to create a new ABI (register conventions) so that their assembly code could be smaller.

    MOV(R(EAX), IMM(instr.foo));
    ABI_CallFunction(&DoUsefulThing);
    
    BindGPR(gpr3);
    

    They're also feeling the SSSE3 love. Except for its name.

    @boomzilla said:

    They had been pressing me on technical details of an algorithm or something so eventually I talked to them like someone who knew what they were doing. I don't recall getting similar levels of questioning since.

    Hah.



  • Small, on-chip memories with single-cycle latencies. I'll take “what is a cache?” for the win please, Monty.

    Anyway, if you take a step back and think about what the pro-C argument is here, it's basically “what we already have is good enough, why would we want something better?” which I'm taking to be an implicit confirmation of my claim that JOP is something better. And the fact of the matter, disregarding pedantry about when and where caching is appropriate and why, is that JOP's memory model is at least as much of an improvement over the Harvard (or modified Harvard) architecture as Harvard is over the Von Neumann.


  • Discourse touched me in a no-no place

    @Buddy said:

    Small, on-chip memories with single-cycle latencies. I'll take “what is a cache?” for the win please, Monty.

    Precisely. The only reason people use caches at all is because larger storage tends to be either quite a bit slower than the processor or hellishly expensive. Caches only work because most algorithms have reasonable instruction- and data-locality.

    @Buddy said:

    my claim that JOP is something better

    I've no idea if it is or isn't. The Wikipedia page on it is very thin on details, hardly more than a stub…



  • It is.



  • @Buddy said:

    Small, on-chip memories with single-cycle latencies.

    What is "One of the few valid reasons to use a linked list in your high-level application"?

    Filed under: I am become Linked List, destroyer of dcache



  • @Buddy said:

    Small, on-chip memories with single-cycle latencies. I'll take “what is a cache?” for the win please, Monty.

    No -- what I'm saying is that if you're doing it right from an embedded perspective, you don't need to have a cache, because your entire main memory is fast enough to keep up with the CPU.

    @Buddy said:

    Honestly, the sooner we can move away from the free-form C way of doing things, where any part of your program could bump something else out of cache at any time, to something with well-defined memory access semantics, the better.

    This is the least of your memory semantics worries -- I'm all for static memory semantics analysis a la Rust especially considering that malloc() is generally a non-thing in the microcontroller world (just not enough RAM to make it something you'd do).

    The broader point you're missing about speed, though, is that embedded applications are (as a general rule) not CPU-bound -- faster might be nicer in that you can spend more time in sleep, but if that speed comes at the cost of static power consumption (modern low-power microcontrollers achieve sub-microampere static power draw, which is something neither FPGAs nor APs, and not even many ASICs, can touch), you're rendering your efforts at spending more time in sleep irrelevant.

    Furthermore, the JOP folks talk on and on about predicting worst-case execution time without ever taking interrupts into account -- and that's bad news.



  • @tarunik said:

    No -- what I'm saying is that if you're doing it right from an embedded perspective, you don't need to have a cache, because your entire main memory is fast enough to keep up with the CPU.

    Yes, and what I'm saying is that that just makes it more desirable to be able to split that memory up into multiple smaller blocks, each one tailored to the type of data it will contain.

    @tarunik said:

    modern low-power microcontrollers achieve sub-microampere static power draw, which is something neither FPGAs nor APs, and not even many ASICs, can touch

    JOP probably can't compete with the very lowest tier of microcontrollers; projects that are necessarily small enough that you can afford to develop them in C. What JOP can do is extend the upper range of what can be done on a simple embedded system, removing the need for people to choose a full-blooded applications processor with linux and god knows what else, just to get a friendly development environment.

    @tarunik said:

    Furthermore, the JOP folks talk on and on about predicting worst-case execution time without ever taking interrupts into account -- and that's bad news.

    Is it, though? Are there no alternatives to event-driven programming?


  • Discourse touched me in a no-no place

    @Buddy said:

    Are there no alternatives to event-driven programming?

    Lots of threads? (Hey, it's a sucky alternative, but you weren't specific.)



  • @Buddy said:

    Are there no alternatives to event-driven programming?

    Fundamentally, you either have to sit and spin waiting on devices, or you have to handle interrupts.

    Polling might work for your old PC's floppy or a timer in a bitbanged UART Tx, but it's a bad idea in a low-power system.

    @Buddy said:

    What JOP can do is extend the upper range of what can be done on a simple embedded system, removing the need for people to choose a full-blooded applications processor with linux and god knows what else, just to get a friendly development environment.

    I'd much rather have a language where I don't have a half-CPU half-VM roaming around below me; besides, the Java model doesn't deal well with having to roll your own hardware access, or the needs of bitbanging and bitfiddling.
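
    Just to make the contrast concrete, here's a rough sketch of what that kind of register bit-fiddling looks like in a systems language such as Rust -- the register address is made up purely for illustration:

    // Hypothetical memory-mapped GPIO output register; the address is made up.
    // (core::ptr also works under no_std, so the same code fits firmware builds.)
    unsafe fn set_pin(bit: u32) {
        let gpio_out = 0x4002_0014 as *mut u32;
        // Volatile read-modify-write so the compiler can't reorder or elide the access.
        let v = core::ptr::read_volatile(gpio_out);
        core::ptr::write_volatile(gpio_out, v | (1 << bit));
    }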



  • @Buddy said:

    Are there no alternatives to event-driven programming?

    You don't need to do all the events in a single thread. For example, you could have one thread waiting for mouse movement while another thread waits for keyboard input. (That way, you get the worst of both worlds!)



  • @Buddy said:

    Are there no alternatives to event-driven programming?

    Message passing? I'll let you decide whether that's the same as event-driven, or whether it's just treating threads like processes. (I don't know whether anyone suggested threads yet...)


  • Discourse touched me in a no-no place

    @tarunik said:

    Fundamentally, you either have to sit and spin waiting on devices, or you have to handle interrupts.

    Polling might work for your old PC's floppy or a timer in a bitbanged UART Tx, but it's a bad idea in a low-power system.

    Sit-and-spin and interrupt-handling are the techniques that tend to be used in most operating systems for this sort of thing. While the user process/thread has been suspended, the kernel keeps a queue (hmm, probably a different datastructure come to think of it) of what to wake up and for what reason. The aim is to be able to idle the processor if there's nothing to do; that interrupt will wake things up again.



  • @Bort said:

    You'd have to somehow compel C++ programmers to use another language for a while. They'd never consider it unless they had no choice. Once they've done it, they'll never want to go back.

    Almost any language could be a C++ killer, so long as you first kill the undying loyalty of Cplusplusers.

    The undying loyalty of Cplusplusers comes from the fact that C++ is relatively high on the power continuum, a thing users of languages lower on it can't see, on principle. Any language that is to have a chance of adoption by current C++ programmers must have most of the features C++ programmers have learned to love: value semantics with explicit references, const references, RAII, associated types, templates with explicit and partial specializations and value parameters, the ability to define “zero-cost abstractions”, a preprocessor and a bunch of other useful features.

    And then it needs some additional advantage, and let me tell you, a garbage collector is not one. Good C++ programmers have learned to use RAII and smart pointers, generally do most things without actually involving raw pointers, and don't have many problems with invalid pointers any more.
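
    To put the RAII/smart-pointer point in concrete terms, here's a rough sketch -- written in Rust (the alternative this post comes back to below) rather than C++, with a made-up resource type -- of the same idiom: ownership plus deterministic destruction, no GC involved:

    struct Connection;                      // hypothetical resource type
    
    impl Drop for Connection {
        fn drop(&mut self) {
            // runs deterministically when the owner goes out of scope -- RAII, not GC
            println!("connection closed");
        }
    }
    
    fn main() {
        let _conn = Box::new(Connection);   // Box ~ std::unique_ptr: a single owner
    }                                       // _conn dropped exactly here, no collector needed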

    @antiquarian said:

    …alternatives to c/c++ that I can think of: D, Rust, Go, Ada

    So far only D and Rust look like they actually provide all of those. D repeated the C# mistake of baking value vs. reference semantics into types (struct vs. class), and then it lost a lot of momentum in that silly Phobos-Tango war. Rust is brand new and just released 1.0.0-Alpha two weeks ago. Somebody has already successfully written a bootloader and an operating system in it, though, so it at least proves that it works in a non-hosted environment.

    Ada has some nice features too, but it's too verbose (and C++ is not exactly terse).

    @Arantor said:

    Go always struck me as a straight-up NIH language which should never just exist in the first place.

    QFT.

    @dkf said:

    The stuff they do with channels is quite interesting. Otherwise, meh.

    Actually they are part of what is wrong with Go. Channels are not a primitive concept of the underlying hardware but require a significant amount of code in the runtime. A good language provides enough tools so that that runtime can be a library written in the language itself. Go instead bakes it into the core language, and the tools (templates or at least generics) are not there, which essentially makes the language rely on C for these bits. And then it obviously can't replace C.

    @Buddy said:

    Actually, as Martin Schoeberl has proven over and over again for the past 15 years, the JVM architecture is infinitely better for embedded systems design than the plain Von Neumann or Harvard architectures we've been using, for one gigantic reason: time predictable caching.

    The principles may have advantages, but anything that increases average memory consumption threefold is not going to be better in practice for devices produced in large quantities (where device cost dominates, unlike servers, where development cost does, so an inefficient runtime that is easier to program for can win there). Anything that introduces semi-random pauses for garbage collection isn't going to make hard real-time deadlines (it is possible to make a parallel collector that does not cause them, but it has not been done in practice), and any language that has exceptions isn't going to make hard real-time deadlines either (Rust does not have exceptions).
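
    (To illustrate the exceptions point, a rough sketch of how Rust surfaces failures as ordinary return values via the standard library's Result type, so the error path is just a branch and a return rather than stack unwinding:)

    use std::num::ParseIntError;
    
    // Errors are plain values; the caller is forced to deal with the Err case.
    fn parse_port(s: &str) -> Result<u16, ParseIntError> {
        s.parse::<u16>()
    }
    
    fn main() {
        match parse_port("8080") {
            Ok(port) => println!("listening on {}", port),
            Err(e) => eprintln!("bad port: {}", e),   // handled without unwinding
        }
    }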

    @Buddy said:

    Honestly, the sooner we can move away from the free-form C way of doing things, where any part of your program could bump something else out of cache at any time, to something with well-defined memory access semantics, the better.

    Rust apparently could do a lot in that direction without adding the problematic memory overhead, garbage collector and exceptions.

    So that's where we are: the first language that ever covered the feature list of C++ and added some useful new properties has made an alpha release and hopefully will make its first stable release in a few months.

    And then there remains the issue of portability. Writing portable code in C++ is a pain in the arse, but it is possible, which is more than can be said about any of the other languages. Even when developing for mobile, each language is a problem on some of the platforms. In this case Rust will initially be lacking too, but at least its essential glue to the underlying platform is much smaller than Java's or C#'s.


  • Discourse touched me in a no-no place

    @Bulb said:

    And then it needs some additional advantage, and let me tell you, a garbage collector is not one. Good C++ programmers have learned to use RAII and smart pointers, generally do most things without actually involving raw pointers, and don't have many problems with invalid pointers any more.

    Either that, or you're hitting the BlubParadox from the other side. 😉

    @Bulb said:

    Channels are not a primitive concept of the underlying hardware but require a significant amount of code in the runtime.

    I believe that they would say that's the whole point.

    If you can't have abstractions, you're left with an extremely low-level language, which makes doing high-level things much more difficult. When the language provides abstractions over that low level (and yes, that requires a runtime to implement them), it makes doing high-level things in that general area easier, so long as one is willing to be bound by the restrictions of the abstraction. What abstractions are actually good ones is a separate debate.

    To my eyes, the channel (and goroutine) abstraction should make doing parallel code quite a bit simpler, since it reduces the surface of shared state that needs to be reasoned about. I think it will probably also encourage the use of algorithms that have fewer locks, so better performance with larger numbers of CPU cores. (The penalty? Probably that there are more copies of data. The increased size of runtime is peanuts by comparison with that.)


  • Discourse touched me in a no-no place

    @Bulb said:

    Writing portable code in C++ is pain in the arse, but it is possible, which is more than can be said about any of the other languages.

    C has it beat for one type of portability (I don't think anything excels C for its ability to support ABI compatibility and portability) and higher-level languages beat it for other kinds (with larger runtimes, they can do things like hiding more of the differences between operating systems). It's not that C++ couldn't support doing well on these things… but rather that that's not what happens for real.



  • @dkf said:

    If you can't have abstractions, you're left with an extremely low-level language, which makes doing high-level things much more difficult.

    I want to have abstractions. I just want to have them built on more basic stuff that is also available. So I am not saying it shouldn't have channels and goroutines. I am saying channels should be a generic class in the standard library and go should be a function instead of a keyword.

    And it would be possible. In fact, that's exactly what Rust does. It has similar concepts, but they have built a low-level framework for lifetime and ownership control, plus templates and functors, and the thread operations, which are similar to Go's, are built from those. In fact they are better, because Rust will ensure the object is either no longer accessed by the thread that sent it or is properly shared and synchronized. I don't think Go does that.
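
    A rough sketch of what I mean (toy example, but the check itself is real): once a value has been sent down a channel, the compiler rejects any further use of it in the sending thread.

    use std::sync::mpsc;
    use std::thread;
    
    fn main() {
        let (tx, rx) = mpsc::channel();
        let msg = String::from("hello");
        thread::spawn(move || {
            tx.send(msg).unwrap();       // ownership of `msg` moves into the channel
            // println!("{}", msg);      // would not compile: use after move
        });
        println!("received: {}", rx.recv().unwrap());
    }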

    @dkf said:

    C has it beat for one type of portability (I don't think anything excels C for its ability to support ABI compatibility and portability) and higher-level languages beat it for other kinds (with larger runtimes, they can do things like hiding more of the differences between operating systems). It's not that C++ couldn't support doing well on these things… but rather that that's not what happens for real.

    Portability and ABI compatibility are completely unrelated things. By portability I mean I can write code and get it running on different platforms. Which requires a lot of conditional compilation, wrapping things in common interfaces and generally a lot of work, but because C interfaces are the lowest common denominator and C++ can use them directly, you can always get things done and the effort is proportional to the size of the project. With higher-level languages if the runtime does not exist for your target platform, you are usually in for too much work to be practical.

    ABI compatibility on the other hand is being able to update shared library without recompiling the dependencies. And that is mostly a Unix zealots' fetish. Most business does not give a damn. I mean I am all for it because it does make security fixes in libraries practically doable and those are important, but commercial software avoids it for anything except system libraries. None of the mobile platforms even provides any way to package libraries as separate installable entities for reuse by multiple applications.


  • Banned

    @dkf said:

    If you can't have abstractions, you're left with an extremely low-level language, which makes doing high-level things much more difficult. When the language provides abstractions over that low level (and yes, that requires a runtime to implement them), it makes doing high-level things in that general area easier, so long as one is willing to be bound by the restrictions of the abstraction.

    Rust is a counterexample to this entire paragraph.

    Fun fact: Rust has channels too. They work very similarly (identically?) to Go channels, except it's all in libraries.
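
    A rough sketch with the standard library's mpsc channel -- note that both the channel and the spawned thread are ordinary library calls, not keywords:

    use std::sync::mpsc;
    use std::thread;
    
    fn main() {
        let (tx, rx) = mpsc::channel();              // a library type, not a built-in
        for i in 0..3 {
            let tx = tx.clone();
            thread::spawn(move || tx.send(i).unwrap());
        }
        drop(tx);                                    // close the last sender so the loop below terminates
        for n in rx {                                // Receiver is iterable until the channel closes
            println!("got {}", n);
        }
    }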



  • @tarunik said:

    I'd much rather have a language where I don't have a half-CPU half-VM roaming around below me; besides, the Java model doesn't deal well with having to roll your own hardware access, or the needs of bitbanging and bitfiddling.

    JOP has no VM; execution time of each bytecode instruction is precisely known.

    @Bulb said:

    The principles may have advantages, but anything that increases average memory consumption threefold is not going to be better in practice for devices produced in large quantities (where device cost dominates, unlike servers, where development cost does, so an inefficient runtime that is easier to program for can win there). Anything that introduces semi-random pauses for garbage collection isn't going to make hard real-time deadlines (it is possible to make a parallel collector that does not cause them, but it has not been done in practice), and any language that has exceptions isn't going to make hard real-time deadlines either (Rust does not have exceptions).

    I acknowledge that Java does require some overhead per object, and makes some low-level memory optimizations impossible, but I'm not convinced that threefold-on-average is accurate. And I believe that JOP's memory model would more than make up for that overhead. Furthermore, I suspect that there are embedded applications where development costs do exceed device costs, or why else are there full Linux-based application processors used for those purposes? Lastly, JOP does have a real-time GC, as I have stated multiple times, and are you saying that C++ cannot be used in a hard real-time system?



  • I was a bit disappointed that nobody mentioned synchronous programming, but whatever.


  • Discourse touched me in a no-no place

    @Buddy said:

    I'm not convinced that threefold-on-average is accurate.

    It's probably actually about one pointer's worth per object for the vtable pointer equivalent, since Java always uses dynamic dispatch (unless the JIT can prove otherwise). Modern Java implementations don't keep locks on a per-object basis, since most objects never need to be locked. It does make actually doing locking a bit more expensive, but that seems to be an acceptable trade-off.



  • The vtable pointer alone is not the main concern. The main concern is that in C++ most members are by value, but in Java a significant portion of them is by reference, and that means a pointer to the indirect member plus object overhead. That does include the vtable pointer in objects that wouldn't need it in C++, but there are at least another 8 bytes for the garbage collector and locking (which hopefully are still only 8 on 64-bit, since they are not pointers). That makes something like a date, which is just a 64-bit integer internally, several times larger. Add the overhead due to the delay between release and collection, and 3 times is absolutely realistic (and a value that seems to come up in benchmarks).


  • Discourse touched me in a no-no place

    @Bulb said:

    The main concern is that in C++ most members are by value, but in Java a significant portion of them is by reference, and that means a pointer to the indirect member plus object overhead.

    That depends very much on exactly what is being done. The details matter a lot. The default Java collection types are very wasteful, for example. However, if you can get the value sharing up, references can actually save space.

    You're spot on for most Java programs and their programmers. 😄

    @Bulb said:

    8 bytes for garbage collector and locking

    I don't know about the GC per-object overhead, but they moved the lock support out into an auxiliary system that doesn't require a pointer in the object. (I think they're using some sort of hash map with the object's identity hash code address as a key.)

    @Bulb said:

    Add the overhead due to delay between release and collection and 3 times is absolutely realistic

    The major issue is how the (typical) generational GC operates, which pushes object release quite a bit later and greatly increases memory overhead. Yet I don't think Java implementations are required to do it that way; it's just a fast method on a few major classes of systems. Where the JIT engine can figure out that an object's lifespan is sufficiently limited (which turns out to be true in a lot of cases), it is free to actually place the object on the stack. JITs, unlike conventional compilers, have the advantage of being able to see what's actually going on; they're post link-phase.

    Which is all rather airy pontificating. :) It'd be good to see some benchmarks…



  • @dkf said:

    It'd be good to see some benchmarks…

    I always used The Great Language Shootout, now called The Computer Language Benchmarks Game.

    It seems to have fewer benchmarks than it used to, and not all have a memory consumption value, but if you look at something like this Java vs. C comparison, it has 4 values listed as (approximately) 2×, 2×, 3× and 6×. Similar results come from comparing to other compiled languages like C++ and Rust, and even the comparison to compiled-but-garbage-collected Haskell has the 4 results as 2×, 2×, 4× and 4×.

    And then look at the Java vs. C# one: 2×, 2×, 3×, 4×. To me that looks like there is some particular issue with Java and its current implementation rather than with garbage-collected languages in general. And I suspect the lack of value types and the need to box primitive types for collections are the largest problem here. Which kind of means a different language compiling directly to JVM bytecode could dodge most of that.



  • @dkf said:

    JIT

    A JIT may be able to put objects on the stack (as we've seen here, a static compiler may as well), but it won't be able to change their internal layout.


  • Discourse touched me in a no-no place

    @Bulb said:

    A JIT may be able to put objects on the stack (as we've seen here, a static compiler may as well), but it won't be able to change their internal layout.

    But they're less committed to the internal layout in the first place. It's the JIT that creates the implementation memory layout, and it can do so at a point when it has all the cards. (I ought to look at how it actually chooses to do that sometime…)



  • Only partially. It can decide on stack frame layouts, but for objects the layout has semantic consequences that the optimizer can't change without complete information. And the JIT can't have it, because it only works locally. Even static optimizers often don't work with that complete information, because it can be spread across multiple modules.


  • Discourse touched me in a no-no place

    It can change the order of elements within the fields declared within a class, but not in the superclass, except in the case where it has decided to JIT both of them at the same time. (That is quite possible.) The JVM bytecode instructions do not bind a particular order of fields; that's a runtime decision. The only thing that's important is getting the superclass's fields first. (Java's only got single inheritance at this level, so it doesn't have any really complicated cases to deal with.)

    The JIT can also know whether there are any extant instances of the classes concerned; that's relatively simple to collect.

    JITs aren't like ahead-of-time compilers. They're able to be much more concrete (e.g., they know exactly what CPU revision they're targeting). On the other hand, they also tend to be under a lot more time pressure.

    I'm not entirely sure what I'm arguing here. Thinking about work is a barrier to Discourse.



  • @dkf said:

    The major issue is how the (typical) generational GC operates, which pushes object release quite a bit later and greatly increases memory overhead. Yet I don't think Java implementations are required to do it that way; it's just a fast method on a few major classes of systems.

    Yeah, but they do it that way because it's the fastest general-purpose GC out there. [Citation needed.] That's not to say it's the best in all cases -- e.g. real-time systems may need a GC that has more predictable performance, but those GCs will have much lower throughput.

    @dkf said:

    It'd be good to see some benchmarks…

    The best I know of is Quantifying the Performance of Garbage Collection vs. Explicit Memory Management by Hertz and Berger. (Some people may know Berger from DieHard.)

    The methodology is... not great, but it's really hard/expensive to do better, because you need to hire several really good programmers for a fair bit of time.

    We compare explicit memory management to both copying and non-copying garbage collectors across a range of benchmarks using the oracular memory manager, and present real (non-simulated) runs that lend further validity to our results. These results quantify the time-space tradeoff of garbage collection: with five times as much memory, an Appel-style generational collector with a noncopying mature space matches the performance of reachability-based explicit memory management. With only three times as much memory, the collector runs on average 17% slower than explicit memory management. However, with only twice as much memory, garbage collection degrades performance by nearly 70%. When physical memory is scarce, paging causes garbage collection to run an order of magnitude slower than explicit memory management.


  • @dkf said:

    It can change the order of elements

    That's totally irrelevant. To save memory it must be able to inline the elements, i.e. given

    final class Distance {
        private final double value;
        // and a lot of methods, of course
    }
    

    and

    final class Coordinates {
        private final double longitude;
        private final double latitude;
    }
    

    it must be able to store

    class Whatever {
        // …
        private Distance dist;
        private Coordinates pos;
        // …
    }
    

    as if it instead contained simply

        double dist_value;
        double pos_longitude;
        double pos_latitude;
    

    and I don't think it's doable with Java, because there is a semantic difference that is not completely hidden even though the objects are non-polymorphic and immutable. In referentially transparent languages like Haskell the optimizer can make this kind of decision, because it knows it does not change the semantics. But in other languages either by-value inclusion (like in C++) or value types (like in C#) are needed. And without this, the above abstractions are not really viable in the parts that matter performance-wise, i.e. if you do something with a map and have millions of such objects.
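
    For contrast, a rough Rust sketch (mirroring the Java classes above) where the nested structs are embedded by value, so the containing object really is just three doubles:

    struct Distance { value: f64 }
    
    struct Coordinates { longitude: f64, latitude: f64 }
    
    struct Whatever {
        dist: Distance,       // stored inline: no pointer, no per-object header
        pos: Coordinates,     // likewise inline
    }
    
    fn main() {
        // three f64 fields in total -- 24 bytes, the same as the flattened layout
        assert_eq!(std::mem::size_of::<Whatever>(), 24);
    }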



  • @Bulb said:

    Channels are not a primitive concept of the underlying hardware but require a significant amount of code in the runtime. A good language provides enough tools so that that runtime can be a library written in the language itself.

    A good language is Turing-complete. Go is Turing-complete. Any questions?


  • kills Dumbledore

    A penguin is black and white. My cat is black and white





  • The original statement was

    A good language provides enough tools so that that runtime can be a library written in the language itself.

    which I simplified to

    A good language is Turing-complete.

    As you can see, Go is Turing-complete and can therefore be used to write its own runtime. In fact, most of Go's runtime is written in Go.

