Mill CPU



  • @masonwheeler said in Mill CPU:

    @thecpuwizard said in Mill CPU:

    gold course

    🤑❓

    Typo - was supposed to be Golf....



  • @masonwheeler said in Mill CPU:

    Sane metaprogramming is done in the target language itself.

    Would that not simply be "programming"????


  • Impossible Mission - B

    @thecpuwizard "Simply programming" is code that acts on data. Metaprogramming is code whose data that it acts upon is other code, generally code currently under compilation.



  • @thecpuwizard said in Mill CPU:

    Just for those not familiar...

    rawbits←,⍉(8/2)⊤¯1+ASCII⍳msg
    bits←512{⍵↑⍨⍺×⊃0 ⍺⊤⊃⍴⍵}rawbits,512↑1
    (¯64↑bits)←,⊖8 8⍴,(64⍴2)⊤⍴rawbits

    I prefer to stay unfamiliar, thank you. Oh, and your modem is on the fritz.



  • @masonwheeler said in Mill CPU:

    @thecpuwizard "Simply programming" is code that acts on data. Metaprogramming is code whose data that it acts upon is other code, generally code currently under compilation.

    Be careful of the distinctions of GenerativeProgramming, CategoryReflection, and Metaprogramming..

    @dcon - Nope, it is working fine:



  • @thecpuwizard said in Mill CPU:

    Just for those not familiar...

    rawbits←,⍉(8/2)⊤¯1+ASCII⍳msg
    bits←512{⍵↑⍨⍺×⊃0 ⍺⊤⊃⍴⍵}rawbits,512↑1
    (¯64↑bits)←,⊖8 8⍴,(64⍴2)⊤⍴rawbits

    I wonder what the the code formatter would make of that (a paper hat, perhaps?)...

     rawbits←,⍉(8/2)⊤¯1+ASCII⍳msg   
     bits←512{⍵↑⍨⍺×⊃0 ⍺⊤⊃⍴⍵}rawbits,512↑1  
     (¯64↑bits)←,⊖8 8⍴,(64⍴2)⊤⍴rawbits
    

    Yeah, not surprised by that.



  • @sockpuppet7 said in Mill CPU:

    @thecpuwizard said in Mill CPU:

    @dkf said in Mill CPU:

    42 consecutive right angle brackets

    Yes, the limit should clearly be no more than 39 :)

    Seriously, if one starts doing heavy meta programming, this can happen. Take a look at Andrei Alexandrescu's work...

    One of my favorites [NOT related to Andrei] was a program to play all possible games of tic-tac-toe and print out the results in a simple format... It used C++ templates to provide the maximum amount of compile-time decision making.... The end result was a C++ executable that consisted of the output of a single literal string!

    Suddenly I lost all interest in d-lang or anything touched by this Alexandrescu guy

    Maybe you should fork it. You could call it k/D.

    Filed Under: get thee to a punnery!


  • ♿ (Parody)

    @scholrlea said in Mill CPU:

    @sockpuppet7 said in Mill CPU:

    @thecpuwizard said in Mill CPU:

    @dkf said in Mill CPU:

    42 consecutive right angle brackets

    Yes, the limit should clearly be no more than 39 :)

    Seriously, if one starts doing heavy meta programming, this can happen. Take a look at Andrei Alexandrescu's work...

    One of my favorites [NOT related to Andrei] was a program to play all possible games of tic-tac-toe and print out the results in a simple format... It used C++ templates to provide the maximum amount of compile-time decision making.... The end result was a C++ executable that consisted of the output of a single literal string!

    Suddenly I lost all interest in d-lang or anything touched by this Alexandrescu guy

    Maybe you should fork it. You could call it k/D.

    Filed Under: get thee to a punnery!

    Or D-lete.



  • @boomzilla said in Mill CPU:

    @scholrlea said in Mill CPU:

    @sockpuppet7 said in Mill CPU:

    @thecpuwizard said in Mill CPU:

    @dkf said in Mill CPU:

    42 consecutive right angle brackets

    Yes, the limit should clearly be no more than 39 :)

    Seriously, if one starts doing heavy meta programming, this can happen. Take a look at Andrei Alexandrescu's work...

    One of my favorites [NOT related to Andrei] was a program to play all possible games of tic-tac-toe and print out the results in a simple format... It used C++ templates to provide the maximum amount of compile-time decision making.... The end result was a C++ executable that consisted of the output of a single literal string!

    Suddenly I lost all interest in d-lang or anything touched by this Alexandrescu guy

    Maybe you should fork it. You could call it k/D.

    Filed Under: get thee to a punnery!

    Or D-lete.

    I am pretty sure that that is what she would do if anyone of the male gender sent her a request to fork, yes.

    Filed Under: Requests made in person would probably lead to a guitar broken over the querent's head.


  • Considered Harmful

    @thecpuwizard said in Mill CPU:

    @dkf said in Mill CPU:

    @thecpuwizard said in Mill CPU:

    Seriously, if one starts doing heavy meta programming, this can happen.

    I've done pretty hefty metaprogramming in my time. I prefer to avoid it with C++ as it gets a bit impenetrable to debug when it goes wrong.

    Try understanding (let alone debugging) a 100 character APL program 😵 😵 😵 😱

    APL is extremely fun.


  • area_can

    @thecpuwizard said in Mill CPU:

    Vote++ or ++Vote

    why not both?

    [image]


  • Grade A Premium Asshole

    @ixvedeusi said in Mill CPU:

    I think this might stand more of a chance because it's so massively different that x86 support is not something anyone with any sense could expect.

    YMBNH


  • Grade A Premium Asshole

    @dkf said in Mill CPU:

    @cabrito said in Mill CPU:

    AMD keeps only one fab (or zero), last I heard. It contracts the fabrication to the spun-off co and also to TSMC or similar, no?

    As far as I know, only Intel keep things totally in-house. The economics of these things are pretty brutal.

    I could see that. Don't fab facilities have the shelf life of dairy? Back in the day it seemed like AMD was always building new fabs.



  • @polygeekery said in Mill CPU:

    @dkf said in Mill CPU:

    @cabrito said in Mill CPU:

    AMD keeps only one fab (or zero), last I heard. It contracts the fabrication to the spun-off co and also to TSMC or similar, no?

    As far as I know, only Intel keep things totally in-house. The economics of these things are pretty brutal.

    I could see that. Don't fab facilities have the shelf life of dairy? Back in the day it seemed like AMD was always building new fabs.

    I am not an expert in any sense, but my impression is 'yes and no'; the issue, as I understand it, is that whenever they update the die process (going from, say, 22nm to 14nm), the changes required often mean that it is simply more cost-effective to build a new facility for the new process rather than refit the old one, especially since they will generally still be producing chips using the prior process for at least another four or five years - often quite a bit longer, as the established and debugged process then gets reused for the lower-end Celeron-class chips.

    IIUC, it isn't needed every time they adopt a new process, but it is certainly common enough.

    Also, up until around 2005 or so, the market for desktop and laptop systems was doubling at about the same rate as the chip densities, meaning that they regularly needed to step up their production capacities. Since the increases in sales would usually shadow the die process improvements, it made sense to just build a bigger plant each generation. This part of the equation changed as second-generation mobile devices became the prevalent form of computing (mobile systems were already growing rapidly before then, but the switch from expensive and tricky-to-use Personal Digital Assistants to the easier and relatively cheaper smartphones and tablets meant that it exploded). Since most of the people who might have bought their first computer were now buying smartphones and tablets instead, most of which used ARM-based CPUs which were often a generation behind Intel and AMD in terms of process but better in terms of power consumption, the demand for x86 chips leveled off. As a result, the frequency with which they needed to increase production run sizes slowed, though the cost of refitting for a new process was still a factor.

    In the 1990s and early 2000s, Intel and AMD were both updating their processes every Moore's Law cycle; however, around 2005 or so I think, they started having trouble keeping that schedule, as the doubling of transistor densities started to slow from roughly every 18 months to something closer to every 22 months. After some thrashing about, in 2007 Intel switched to a new die process every other cycle; this was the so-called 'tick-tock' cycle, where they would work in two blocks of 12-18 months each. A 'tick' phase would be a die shrink with relatively small changes to the existing die layouts, while a 'tock' would be a broader redesign of the microarchitecture.

    By 2014, they started seeing problems in that schedule, especially in the 'tock' phase; the layouts were growing so complex that they were finding 18 months wasn't enough time to optimize the new microarchitecture thoroughly, which led them to add a 'refresh' phase after the previous tock. In 2016 they announced that they would formalize this with a new 'process-architecture-optimization' cycle, where the individual cycles would be 10-15 months but they would use the results of the architecture phase to work out improvements for the optimization phase.

    I can't really comment on what AMD has been up to in that same period, but my impression is that for most of that time, they publicly focused mainly on shrinking the die process with few changes in their released products (which lost ground in terms of performance per core, but would have more cores for a lower price than comparable Intel chips), while concurrently working on a leapfrogging microarchitecture design. IIRC, they tried to overtake Intel on that front around 2012, but it was a bust; their second attempt, the current Ryzen chips, are proving to be a lot more successful in that regard.

    That, however, is just my impression, and I am probably quite a bit off about it. Comments and corrections welcome.


  • Grade A Premium Asshole

    @scholrlea you tend to use 4,000 words when 40 would do.


  • 🚽 Regular

    @polygeekery This was, however, an interesting wall of text imo.


  • 🚽 Regular

    @scholrlea I wonder if they were starting to have issues with the 'Tick' as well. Certainly TSMC were (are still?) having extreme issues with EUV...specifically, their scanner melting every time they tried it.


  • Discourse touched me in a no-no place

    @masonwheeler said in Mill CPU:

    Metaprogramming is code whose data that it acts upon is other code, generally code currently under compilation.

    If you're writing code that writes code, you're metaprogramming. (Yes, this does mean that all compilers are rather in that space too.) The main tricky bits with metaprogramming tend to be ensuring that error messages are sane (so that people can debug code that's been through the metaprogramming grinder) and stopping weird hazards from creeping in due to inappropriate handling of the code. I know that you've got views on this, but sticking strictly to plugging bits into (type-correct) holes does actually limit the degree of rewriting possible (while making that which is done more likely to be correct, so it's very much not all bad).



  • @dkf said in Mill CPU:

    If you're writing code that writes code, you're metaprogramming.

    Things start to get interesting when one also includes things like constexpr. If the compiler is actually executing the code during compilation and placing only the results of such execution into the resultant program, then that portion of the original source code is "code treated as data" in a different way than normal compilation.


  • Grade A Premium Asshole

    @zecc said in Mill CPU:

    @polygeekery This was, however, an interesting wall of text imo.

    He missed one of the most important bits though. One of the primary reasons that AMD now has old fabs littered all across the planet is that lots of countries, states and localities kept offering them significant tax advantages to build their next fab in their area.

    I wonder what happened with all of those old fabs? After AMD discarded them at the end of their useful lives, did they go on to stimulate the local economy with other ventures, or are they now abandoned?

    Also, there are ways to engineer around this need to constantly build new fab facilities. You can employ a type of leapfrog upgrade path where you build two (or more) fabs in the same area and as a process is phased out you refit it for the next gen. Similar methods have been employed in industry for at least a century. All the major automakers produce new cars every few years but they don't build new factories in new towns every time they release a model year upgrade to their vehicles.

    The reason that fab facilities were considered to be expendable was due to the tax subsidies and other incentives to build a new facility in a new place. For a while those were the only thing that kept AMD afloat.


  • Impossible Mission - B

    @cursorkeys said in Mill CPU:

    I wonder if they were starting to have issues with the 'Tick' as well.

    ❓



  • @boomzilla said in Mill CPU:

    @scholrlea said in Mill CPU:

    @sockpuppet7 said in Mill CPU:

    @thecpuwizard said in Mill CPU:

    @dkf said in Mill CPU:

    42 consecutive right angle brackets

    Yes, the limit should clearly be no more than 39 :)

    Seriously, if one starts doing heavy meta programming, this can happen. Take a look at Andrei Alexandrescu's work...

    One of my favorites [NOT related to Andrei] was a program to play all possible games of tic-tac-toe and print out the results in a simple format... It used C++ templates to provide the maximum amount of compile-time decision making.... The end result was a C++ executable that consisted of the output of a single literal string!

    Suddenly I lost all interest in d-lang or anything touched by this Alexandrescu guy

    Maybe you should fork it. You could call it k/D.

    Filed Under: get thee to a punnery!

    Or D-lete.

    And next up is E. E-lete Yeah!



  • @cursorkeys said in Mill CPU:

    @scholrlea I wonder if they were starting to have issues with the 'Tick' as well. Certainly TSMC were (are still?) having extreme issues with EUV...specifically, their scanner melting every time they tried it.

    TIL that @wood moderates the forums for ~~Intel~~ ASML's EUV scanners (the one used by the scanners themselves, I mean, not the engineers. AIs love cat videos).

    EDIT: I probably should have paid more attention...

    Filed Under: Hey, @ben_lubar, do you have a picture of that suitable for use in a new avatar icon?


  • Discourse touched me in a no-no place

    @boomzilla said in Mill CPU:

    Or D-lete.

    Better than D-lite.



  • Not really related here (directly, anyway), but I have a question or two for the likes of @TheCPUWizard and @HardwareGeek about HyperThreading.

    The first question is simply, what is the 'correct' term for what Intel calls "HyperThreading"™ (i.e., established term used by CPU designers prior to Intel renaming it)? I know that the technique itself goes back to (IIUC) Seymour Cray, but that it wasn't widely used before because it really only makes sense if there is a lot of slack hardware - something that CPU designers usually try to avoid.

    But this brings up a bigger question of just what "HyperThreading" really is, because while I am pretty sure I know what it actually involves, I suspect I am missing some crucial aspect of it.

    First, permit me to define terms:

    Feel free to TL;DR if you think you know what these things mean already, but don't be surprised if you need to go back and check
    • Thread - a sequence of operations.

    • Process - a collection of one or more threads which jointly have exclusive access to a set of resources (memory, etc.), where exclusivity is (meant to be) enforced by the hardware and/or an operating system.

    • Concurrency - Having multiple threads proceeding over a period of time that is less than the sum of the times required for each individual operation.

    • Multitasking - Switching control of a CPU between multiple threads over a period of time in order to avoid operations having to wait for an unrelated operation to go to completion.

      • Cooperative multitasking is where each thread is permitted to proceed until it either must wait on some external signal, or voluntarily cedes the CPU.
      • Preemptive multitasking is where the system (hardware or software, usually some degree of both) enforces switching according to a schedule external to the threads, usually by interrupting the current thread when its allotted time slice is completed.
    • Time Sharing - using preemptive multitasking to create the illusion of concurrency. This is what most people mean when they talk about a 'multitasking operating system'.

    • Parallelism - performing independent operations simultaneously, using separate hardware.

      • Coupled parallelism is the use of several duplicate units to perform parts of a single task and whose results all later get combined. An example of fine-grained coupled parallelism would be using a fast adder (of which there are several types) rather than either a serial adder (which uses a single 1-bit full adder repeatedly) or a ripple-carry adder (which uses n 1-bit adders to add n bits, but the adder for bit n, where n>0, has to wait on the carry from bit n-1 before proceeding). The typical fast adder (e.g., a carry-lookahead adder or carry-save adder) allows the bits to be added in parallel, and then combines them with the carries separately in some manner afterwards.
      • Decoupled parallelism is the use of several different units to perform unrelated tasks, usually as a speed-up for concurrency as an alternative (or in addition) to time sharing for multiple processes (it can be applied to multiple threads within a process, but only if they are genuinely independent).
      • External parallelism is the use of multiple complete units which operate either coupled or decoupled. Multiprocessing, using multiple CPUs (either physically separate devices or multiple 'cores' in a single device), is a typical coarse-grained type of external parallelism (conceptually at least, distributed processing can be seen as an even more coarse form, though it is usually treated as a separate topic). An example of fine-grained external parallelism would be an MIMD vector processor such as the Cray X-MP.
      • Internal parallelism is the use of different subsystems of a processor to perform different operations at once; for example, pipelining of instructions is coarse-grained internal parallelism, in which stages such as the decoder, memory fetch, ALU, and memory write all work on different instructions in a sequence rather than each waiting for the previous operation to complete.

    OK, now that that is out of the way, here is my understanding of 'Hyper-Threading': basically, it is a hardware system that uses the slack in a given core's operation (e.g., a pipeline stall that couldn't be otherwise avoided, or a multi-cycle ALU operation such as a divide where there are no other operations which the thread can perform in parallel) to automatically multitask the flow of instructions through the core's CPU, without having to invoke a software-level scheduler.

    Let me explain what I mean. Basically, there are times when no amount of instruction re-ordering can prevent different parts of the core - the decoder, the memory fetch, the ALU, etc. - from having to wait on some other part. Hyperthreading basically decreases the coupling of those parts to let, say, the decoder work on one thread which isn't going to use the same parts of the ALU as the thread the ALU is already working on. It is a very fine-grained sort of multitasking of the parts of the CPU itself, rather than of the CPU as a whole.

    Is this a correct interpretation of it? Because if so, it seems to me to highlight just how problematic the x86 architecture really is, and just how heroic - and expensive - the effort to keep the current software running really is (which is what it really is all about - as I have said repeatedly, even Intel doesn't want the x86, and they never really did, but they are riding a tiger and have no alternative than to keep it going as long as possible). It is a jack-assed, but necessary, solution to precisely the kind of problem that other designs - even many older designs - simply avoid, but which couldn't be avoided in this case because of the need to keep the existing software base alive a little longer while they and everyone else try to find a palatable way out of this mess without causing the whole structure to implode.


  • Java Dev

    @scholrlea As I understand it, it's mainly about getting the multiple ALUs, load/store units, etc populated as much as possible. And the reason for having so many function units in the first place is to boost single-thread performance.

    Just building two cores with fewer functional units each would lead to better performance for fitted (multithreaded) workloads, but hurt the single-threaded workloads which were heavily dominating at the time hyperthreading was introduced.



  • @pleegwat Hmmn, OK, then, I was rather off the mark, in that case.

    Mind you, that leads me to think that it is a hardware solution to a software problem, or rather, to several cultural problems (the resistance to using multithreading which was, and to a large part remains, common among application devs; the perceived difficulty of debugging multithreaded apps; the lack of proper training and education on how to write and debug them effectively; the continued use of languages with no built-in threads and no standard library support for threading, or only minimal support based on inadequate models; etc.).


  • Java Dev

    @scholrlea I'm not afraid of multithreading, but then I tend to face embarrassingly parallel workloads. I tend to land at some form of message-passing.

    But yeah, single threaded workloads are easier for many people to understand, and an existing single-threaded application (particularly one that isn't embarrassingly parallel, or whose code is not well-understood anymore) is hard to convert to a multithreaded application later on without a significant rewrite.



    @ScholRLEA - Your understanding is close enough for a fairly detailed discussion [and getting down to transistor level requires volumes of information]...so let's go with it! :)

    @Pleegwat - You hit one of the two most important points [and I would actually make it #1]: the preponderance of single-threaded, CPU-bound code that existed at the time (and, alas, continues to exist).

    The secondary consideration is that, with true multi-core where each core has hyper-threading, one gains the ability to set affinity per process as well as to enable/disable hyper-threading. While for most situations this is more tedious to deal with than people want, for key scenarios in high performance it can be critical.

    Consider a 2-CPU system with quad cores, with HT... This will show as 16 "processors". Now presume there is one core process (well written for multi-threading)... Turning off HT for one CPU and setting the affinity for that process to the resultant 4 "CPU"s can yield a big improvement!!!!


  • Impossible Mission - B

    @thecpuwizard Did you seriously just say that one of the biggest advantages of hyperthreading is that you can turn it off? :P


  • Grade A Premium Asshole

    @masonwheeler shut up Wesley.



  • @scholrlea "Intel® Hyper-Threading Technology" was also a big marketing gimmick at the time it was introduced. Get nearly two core performance with one core!

    Wikipedia says the term you were looking for is "simultaneous multi-threading". (Arrange dashes as you prefer.)



  • @parody Thank you.



  • @scholrlea HyperThreading would probably not even be a thing right now if languages like C# or Go (which make multi-threading so easy it's stupid not to do it) were around 5 or 10 years earlier. Because you're basically right; it makes sense because 90% of software is strictly linear.


  • Discourse touched me in a no-no place

    @scholrlea A few points:

    1. We've now got widespread deployment of true multicore machines; systems capable of true parallelism (instead of just time-sharing) are on everyone's desk and in everyone's phone. With true parallelism, you really can't get away with many tricks; another core really could change the world under your feet.
    2. Multithreaded programming is resisted because most people do it by sharing memory between threads. That means that most multithreaded programs have to deal with the possibility of virtually everything getting changed behind their back, all the time, and that's really hard to wrap your head around. Debugging such code is also insanely difficult because the adding of instrumentation (either internal — logging of various kinds — or external — a debugger or other such tool) changes the timing and the pattern of accesses.
    3. The sane approach to dealing with this is to drop the shared-memory model and instead use message passing. Message passing scales far more effectively (heck, the whole internet can be viewed as a sort of massive message passing system), is a lot easier to map to what the semantics of parallelism talk about, etc. There is one major down-side to doing this though: data gets copied a lot more (as that makes determining ownership rules enormously simpler).
    4. All forms of parallel programming have failure modes that single-threaded programs do not. In particular, there are the possibilities for systems to deadlock or live-lock due to the interaction between all the threads in the system yet for each component thread or (non-covering) subset of threads to be correct and free of such hazards. Debugging such things is much harder precisely because they are necessarily a bug of the overall system, i.e., it is a fault of global properties of things and not of the components per se, and global properties are enormously more difficult to comprehend than local ones. (The tools I know of for dealing with these issues have major challenges with tractability; the underlying algorithms tend to be EXPSPACE or worse.)

    In short, parallel programming is genuinely hard. Shared memory parallel programming is much harder, yet it is the form that so many programmers encounter first. (Also, parallel programming in Python is just nasty. The language implementation demonstrates how not to do it.)



  • @dkf said in Mill CPU:

    1. The sane approach to dealing with this is to drop the shared-memory model and instead use message passing. Message passing scales far more effectively (heck, the whole internet can be viewed as a sort of massive message passing system), is a lot easier to map to what the semantics of parallelism talk about, etc. There is one major down-side to doing this though: data gets copied a lot more (as that makes determining ownership rules enormously simpler).

    QFT. There are actually a number of models possible, of which message passing (or, to kick it up a notch, a full Actor model such as used in Erlang) is one of the best approaches, and conceptually one of the easiest.

    Unfortunately, the most commonly used method, explicit synchronization of shared resources using semaphores or mutexes, is perhaps the worst, short of a 'no rules, every man for himself' approach - especially since it can often silently degenerate into that if the programmers aren't on the ball.

    Fuck Pthreads. That library is a plague on the industry.


  • Impossible Mission - B

    @dkf said in Mill CPU:

    1. The sane approach to dealing with this is to drop the shared-memory model and instead use message passing. Message passing scales far more effectively (heck, the whole internet can be viewed as a sort of massive message passing system), is a lot easier to map to what the semantics of parallelism talk about, etc.

    I would actually use the Internet to make exactly the opposite point: the Web, especially sites like this one, demonstrate quite starkly how message passing is not inherently better at avoiding race conditions and parallelism problems.

    Anytime someone gets :hanzo:'d on here, or types up a reply to someone pointing out a problem in their post only to see once they post it that the other person has already edited it to fix it, that's essentially a human version of a race condition, and things like that happen all the time.



    @masonwheeler Kind of a non sequitur, there. No synchronization method will help on an unguarded resource (which in this case would presumably be the thread of the conversation, or maybe the attention of the readers). Write forum software that actually tries to enforce mutual exclusion on forum threads without dropping posts or forcing an ordering on the users (talking stick, anyone?), then we can talk.

    Also: internally, you can bet your ass that the database is enforcing it, but that isn't the resource you're discussing.



  • @scholrlea said in Mill CPU:

    Also: internally, you can bet your ass that the database is enforcing it,

    Oh shit. On NodeBB?

    I'll take that bet. Holy shit I'll take that bet. Easy fucking money.



  • Crap, you have a point there.

    Seriously, though, that sort of thing isn't even in the forum software, it is in the database engine. IDHAFC what this forum uses for that, or even if NodeBB can work with more than one kind.


  • Discourse touched me in a no-no place

    @scholrlea said in Mill CPU:

    a full Actor model such as used in Erlang

    That really doesn't add as much as all that.


  • Discourse touched me in a no-no place

    @masonwheeler said in Mill CPU:

    I would actually use the Internet to make exactly the opposite point: the Web, especially sites like this one, demonstrate quite starkly how message passing is not inherently better at avoiding race conditions and parallelism problems.

    That's entirely stupid and wrong. Without that, you'd need to ensure that only one person can add posts to a thread at a time, and that nobody can read a message in a thread if that message is being edited by someone else. That'd give you a system without concurrency hazards, but also with virtually no concurrency at all…


  • :belt_onion:

    @scholrlea said in Mill CPU:

    IDHAFC

    MongoDB out of the box (BAD bet), and Ben is working on getting it working with Postgres (slightly better bet but not really).



  • @cabrito said in Mill CPU:

    @dkf said in Mill CPU:

    @scholrlea said in Mill CPU:

    Their main goal is to produce their own chips, but they are looking to sell the IP to fabs if they can't afford that.

    Unless they've got a few billion to build their own fab, they'll be in the IP business.

    AMD keeps only one fab (or zero), last I heard. It contracts the fabrication to the spun-off co and also to TSMC or similar, no?

    But that is a model that works for lots of profitable semiconductor companies. TSMC and other foundries spend the big bucks to build the fabs, and every chip company pays the foundries to make chips for them.

    @dkf said in Mill CPU:

    As far as I know, only Intel keep things totally in-house.

    They're not the only ones, but there aren't many others. Not many companies have billions of dollars available to build fabs.



  • @scholrlea said in Mill CPU:

    Not really related here (directly, anyway), but I have a question or two for the likes of @TheCPUWizard and @HardwareGeek about HyperThreading.

    Sorry, but I'm the wrong person to ask about this. I have a fairly basic understanding of CPUs; my experience and expertise (such as it is) is with memory and peripheral interfaces, SoCs with embedded (mostly) low-end CPU cores (which we tended to treat as black boxes, with the assumption that ARM or whomever we licensed the IP from gave us something that worked), telecom, and other non-CPU chips. Even though I did work for Intel, I know less about HyperThreading than you seem to.



  • @dkf said in Mill CPU:

    1. The sane approach to dealing with this is to drop the shared-memory model and instead use message passing. Message passing scales far more effectively (heck, the whole internet can be viewed as a sort of massive message passing system)

    I don't think your example supports your premise.

    Edit:
    @masonwheeler said in Mill CPU:

    Anytime someone gets :hanzo:'d on here,

    As you were saying...


  • Discourse touched me in a no-no place

    @hardwaregeek said in Mill CPU:

    I don't think your example supports your premise.

    I think it does, but it's definitely the case that message passing scales up larger as it can scale across process and processor boundaries. (That's the key to how MPI works, to list just one.) Systems with a thousand cores with shared memory across all of them have been built in the past, but the backplane requirements to make that work are sufficiently esoteric that nobody builds supercomputers like that any more; shared memory scales up what fits on one motherboard only, max, unless you pay $stupid_money.


  • ♿ (Parody)

    @dkf said in Mill CPU:

    @hardwaregeek said in Mill CPU:

    I don't think your example supports your premise.

    I think it does, but it's definitely the case that message passing scales up larger as it can scale across process and processor boundaries. (That's the key to how MPI works, to list just one.) Systems with a thousand cores with shared memory across all of them have been built in the past, but the backplane requirements to make that work are sufficiently esoteric that nobody builds supercomputers like that any more; shared memory scales up what fits on one motherboard only, max, unless you pay $stupid_money.

    See sig. He was just making an "internet is crazy" joke.


  • Trolleybus Mechanic

    @boomzilla said in Mill CPU:

    See sig. He was just making an "internet is crazy" joke.

    Are you saying the text after his post is the reason why the joke failed?

    So it's SIGFAULT?

    🕶



  • @ixvedeusi Ivan Godard, the man in the Mill videos, here.

    AMA

