Intel making us slow down
-
You spin me right 'round, baby...
-
@lolwhat Ugh...their "analysis" of the first sentence was utterly retarded. Well, most of the rest of it is, too. I mean, they make a lot of valid points but it makes them look like idiots when their responses don't match what they're responding to. Which I guess is par for the course at El Reg.
Meh. I hate people.
-
@boomzilla
But at least they got their scoop by rushing to publish ahead of the coordinated disclosure embargo window. Even if their scoop completely ignored inconvenient data so they could single out one specific "bad boy".
-
@blek For that matter, users running SSDS should be unaffected, too, because all of their important information is already in index.txt.
-
@izzion Meh. El Reg isn't so much the "Pee-yew York Roast" or "Daily Fail" of IT as it is The Onion; yes, they 'report' true stories, but they seem to mostly be in it for the lulz.
Filed Under: I will refrain from speculation as to what the Not-So-Breit and the Huffnpuff of IT are. That can sty in the Garage.
-
@boomzilla said in Intel making us slow down:
@blek For that matter, users running SSDS should be unaffected, too, because all of their important information is already in index.txt.
To the contrary. I wouldn't be surprised if SSDS managed to exploit this flaw without even knowing it or intending to. And if the author found out, he'd consider it a glorious achievement that makes it win the showdown by default.
-
@the_quiet_one said in Intel making us slow down:
And if the author found out, he'd consider it a glorious achievement that makes it win the showdown by default
I mean...the whole point of SSDS is sharing information, right?
-
Not surprisingly, the OSDev forums are in full meltdown mode over this.
EDIT: Including at least one call for Security through Obscurity, and Tilde talking out of his ass again. Quelle surprise!
-
My favorite part of Spectre is that, if we actually thought about it, we knew it was true the whole time.
I've explicitly assigned "weird speculative execution shit" as the cause of a couple of heisenbugs.
-
@bulb said in Intel making us slow down:
cloud services will pretty surely be impacted and the losses can be counted quite easily there.
How? Isn't the cloud supposed to be ephemeral so hard you can't even figure out who's virtually next to you?
-
@tsaukpaetra
If you can't figure out who's next to you when you're hard, you're
-
@tsaukpaetra said in Intel making us slow down:
All I know is that instead of taking 5 minutes to destroy and spin up 16 Azure VMs, it took almost an hour. Will try again tomorrow on the dev resource group just to make sure it wasn't because they were mass-rebooting everyone.
I'd expect that sort of slowdown to be due to the network traffic of starting all those VMs at once. Given the benchmark figures up above in this thread for an application that does nothing but the simplest possible syscall (getting the PID is pretty trivial; the OS has to have the relevant entry in the process table to hand just to work out the caller's permissions before it can service any system call at all), I'd be deeply surprised to see even the most I/O-heavy process get anywhere close. A 12-fold increase in time for a real workload is something else, and system congestion is the prime suspect.
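For reference, here's roughly the shape of the micro-benchmark those syscall figures come from. This is a minimal sketch, assuming Linux: it calls getpid() through syscall() so glibc's PID caching can't hide the kernel round-trip, and the iteration count and output format are mine, not taken from the numbers above.

```c
// Minimal syscall-overhead micro-benchmark (Linux).
// getpid() is about the cheapest possible kernel round-trip, so the
// per-call time here is nearly pure syscall (and, post-KPTI,
// page-table-switch) overhead.
#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void) {
    const long iterations = 10 * 1000 * 1000;
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getpid);  // bypasses glibc's cached getpid()
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (end.tv_sec - start.tv_sec)
                   + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("%.1f ns per getpid() syscall\n", elapsed / iterations * 1e9);
    return 0;
}
```

Run it before and after the KPTI patches, and the delta is essentially the cost the fix adds to every kernel entry, which is why syscall-heavy micro-benchmarks show the scariest numbers and real workloads mostly don't.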
-
@izzion That never presented a problem for me...
-
@twelvebaud I thought he meant 5 English Dollars.
-
@scholrlea said in Intel making us slow down:
started looking up prices for FPGAs and textbooks on Verilog
Yeah, good luck with that.
A modern CPU design isn't going to fit in even a really big FPGA. A Stratix IV FPGA claims to fit a 15-million-gate design; even a 10+ year old Core 2 Duo has upwards of 40 million gates, and more recent CPUs are into the hundreds of millions.
What you can fit in one is going to be slow: 600 MHz or so. (I've seen 1.something GHz claimed for a competitor.)
A project of this scale takes a team of dozens of experienced CPU design and verification engineers a year and a half. We had farms of as many as 500 servers running 24/7 for months testing the design to make sure it was going to do what it was designed to do. And this was for northbridge chips, which were definitely the poor stepchildren compared to the resources allocated to CPU projects.
It's a given that for a system that complex, you won't think of all the possible edge cases. That's why randomized testing exists. You make sure you test all the edge cases you can think of, and hope the randomized testing hits enough of the ones you didn't think of. Obviously, an important one got missed.
If you think you and your Verilog textbook can do better, be my guest. Heck, I'll even teach you Verilog for the entertainment. (Protip: use SystemVerilog; it's a strict superset of Verilog. You'll need it for the verification, and a number of the new features are useful in designing the chip itself.)
-
@hardwaregeek I think your sarcasm detector is borked. Just maybe.
Or is this me ing about you not ing? Damn you, Poe's Law!
Filed Under: Another sad example of the rampant philia in our community. I am a rast myself, so I understand.
-
@scholrlea Either way is possible. All of my mental processes are borked by this cold and the resulting lack of sleep. Even with nighttime "so you can sleep" cold medicine, I haven't gotten more than an hour or two of sleep a night so far this year, and even that little bit of sleep is only coming a few seconds at a time: I start to doze off, then wake with a start because I can't breathe. Upper respiratory infections are the pits.
-
@hardwaregeek Gah, I am on the tail end (I hope) of the same thing; you have my sympathies.
-
@scholrlea I wish I were on the tail end; I'm still in the it's-going-to-get-worse-before-it-gets-better phase. I just rescheduled my business trip to the Land of Perpetual Drizzle because I'm not going to be over this in time.
-
@hardwaregeek I'm sure Blakey will be upset about the change of plans... :face_with_stuck-out_tongue_winking_eye:
-
@scholrlea I don't think he'll really care much — not too far away, but a bit north. Tundralandia, but that part of it isn't really tundra — 4° - 6° and rain all next week. I added the weekend to do some sightseeing while I'm there, but the main sightseeing I want to do is outdoors, and the chilly, rainy weather won't do my health any good.
-
@hardwaregeek These last ones seem to be a class of bug nobody was paying attention to until now. I always felt like branch prediction and shit like that were ugly hacks the CPU does for performance. Probably worth it, but still ugly hacks.
-
@hardwaregeek said in Intel making us slow down:
Protip: Use SystemVerilog; it's a strict superset of Verilog
Or just use VHDL. Trying to fix Verilog is like trying to fix JavaScript.
-
@hardwaregeek said in Intel making us slow down:
A project of this scale takes a team of dozens of experienced CPU design and verification engineers a year and a half. We had farms of as many as 500 servers running 24/7 for months testing the design to make sure it was going to do what it was designed to do. And this was for northbridge chips, which were definitely the poor stepchildren compared to the resources allocated to CPU projects.
Remember HardwareGeek uses the word "we" here because he got prototype chips from Intel to test, because he's just that smart and great and handsome all the times and did I mention how successful he is?
Seriously though, people suddenly bringing in "we" out of nowhere without explaining who "we" is is quickly becoming a huge pet peeve for me.
-
@blakeyrat It's no secret that he works in hardware/chip development
-
@jaloopa I know; he used to constantly brag about how great he was at it.
But that doesn't answer the question, "who the fuck is 'we'?" Which is not explained. That is my pet peeve.
-
@blakeyrat Are you confusing hardwaregeek with @TheCPUWizard?
-
@jaloopa Oh I am. Goddamnit.
Ok well stupid me. That's what I get for being up at 6:40 AM.
... my point about people introducing a mysterious "we" entity without explaining who "we" is still applies.
-
@blakeyrat said in Intel making us slow down:
"who the fuck is 'we'?"
I might be going out on a limb here, but I'd guess that "we" is "HardwareGeek and some other people he is or was working with"...
-
When we run into issues with people not being specific enough in communication, we tend to assume what they meant and run with it
-
@blakeyrat said in Intel making us slow down:
Seriously though, people suddenly bringing in "we" out of nowhere without explaining who "we" is is quickly becoming a huge pet peeve for me.
In this case, it doesn't matter who "we" is; maybe he doesn't want (or isn't allowed) to identify his employer, or doesn't want to brag about working for X.
-
@sockpuppet7 It's a false alarm, people. Blakey was just trying to be a dick to TheCPUWizard, but old people all look alike to him.
-
@boomzilla these hardware guys all look the same to me
-
@cursorkeys said in Intel making us slow down:
@hardwaregeek said in Intel making us slow down:
Protip: Use SystemVerilog; it's a strict superset of Verilog
Or just use VHDL. Trying to fix Verilog is like trying to fix JavaScript.
Hmmn. I mean, I was joking, but I am sort of curious about the topic, so maybe.
While I'm on the topic, though:
-
I was wondering what you think of the Parallella, and their claimed performance. I have a distinct sense that something isn't adding up there. Not as bad as what Geri thought he could accomplish with his dazzling ignorance of real-world chip design, but they seem to be engaging in some wishful thinking in their press releases, and some of the reports about it seem to back that up. I mean, it sounds like a good idea, but it's the same good idea that failed for the Connection Machine and the Transputer back in the day, and actually selling the FPGA implementation as something other than an engineering prototype seems self-defeating. Any thoughts?
-
Similarly, any comments about the Mill, and the upcoming tape-out of its FPGA test implementation (or did that get done already? I'm not sure)? Those following it seem split between "It's gonna be epic!" and "It's gonna flop worse than EPIC!", plus a handful who are convinced it's a scam and are looking to see what the real angle is. I get the impression that Godard et al. are really convinced they're on to something big, but at the same time realize how big a risk they're taking trying to come up with a new design with barely any budget and only a handful of full-time employees, and are actually trying to avoid hyping it too much to lessen the impact of a potential face-plant.
-
@scholrlea now I'm curious about what tooling Intel and AMD use to design their chips
-
@cursorkeys said in Intel making us slow down:
Or just use VHDL. Trying to fix Verilog is like trying to fix JavaScript.
Verilog: The Good Parts
-
@lb_ said in Intel making us slow down:
Spectre affects ALL processors, including Intel, AMD, and ARM. It can't be fixed in software like Meltdown.
-
@blakeyrat said in Intel making us slow down:
But that doesn't answer the question, "who the fuck is 'we'?" Which is not explained.
Although I generally try really hard to maintain anonymity of both me and employers, I have in the past stated that I used to work for Intel. Clearly, you don't remember reading that as much as I remember writing it. Fair enough, and hardly surprising; I don't remember half of what I've written, and far less of what I've read.
My point was that even a second-class citizen like a non-CPU chip (the CPUs are Intel's cash cows; support chips exist at all because, and only to the extent that, they are necessary to sell CPUs) utilized resources far, far beyond those available to somebody who decides to try to design one on his own.
@blakeyrat said in Intel making us slow down:
he used to constantly brag about how great he was at it.
I'm good enough that I can usually continue to get paid for it. Beyond that, I make no claims.
-
@cursorkeys said in Intel making us slow down:
@hardwaregeek said in Intel making us slow down:
Protip: Use SystemVerilog; it's a strict superset of Verilog
Or just use VHDL. Trying to fix Verilog is like trying to fix JavaScript.
Eh. I've used both, and I'm not disappointed that Verilog has largely replaced VHDL in the US, although I understand VHDL is still popular in Europe.
-
@sockpuppet7 said in Intel making us slow down:
@scholrlea now I'm curious about what tooling Intel and AMD use to design their chips
Pretty much the same tools the rest of the semiconductor industry uses. There are a handful of vendors for most of the tools; which specific ones they use is anyone's guess.
Designs are written in Verilog (or SystemVerilog); Intel still used VHDL when I worked there, but I understand that they, like most of the rest of the US semiconductor industry, have switched to Verilog.
The Verilog (or whatever) code is compiled and linked with a framework and runtime environment to make an executable program that can be run (simulated) to test (verify) whether the design does what it is supposed to do. There are three main vendors of the simulation tools, and the most popular verification framework (UVM) is a standard supported by all three vendors. I don't know which of the vendors Intel currently uses — there isn't really a lot of difference between them — but it's a fairly safe bet that they use UVM as the framework, although at one time they used a similar framework that was developed in-house.
The simulation tools have provided code coverage measurement for decades.
Two things UVM (SystemVerilog, really; you don't need UVM for them, but you're using it anyway, so...) adds are randomized testing and feature coverage. The design and verification engineers get together and figure out what needs to be tested to prove that the design works as intended. A "coverpoint" is created for each specific condition that needs to be tested, based on a combination of the behavior specified in the detailed product spec and the design engineers' knowledge of the design ("I'm concerned something might blow up if this particular sequence of operations were to occur"). Since it might take thousands of person-years to create tests for each of these conditions, and because there are certainly some conditions you didn't think of, a few randomized tests are run thousands of times; if necessary, the randomization conditions ("knobs") are adjusted to increase the probability of hitting coverpoints that haven't been hit.
(What happened here is probably a combination of nobody realizing there needed to be a coverpoint for "speculative execution causes a page fault," randomized testing not happening to generate that scenario, and/or not checking whether that scenario caused kernel memory to become visible to an unprivileged process because nobody thought to check for it.)
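If a software analogy helps, here's a toy sketch of that loop in C. (The real thing is SystemVerilog covergroups driven by a UVM testbench; everything here, names and conditions included, is made up for illustration.)

```c
// Toy illustration of constrained-random testing plus coverage tracking.
// NOT real verification code; the real thing is SystemVerilog/UVM.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_COVERPOINTS 4
static bool hit[NUM_COVERPOINTS];

// Stand-in for the design under test; each branch is a "coverpoint",
// a condition the verification plan says must be exercised.
static void run_one_op(int opcode, int addr) {
    if (opcode == 3 && addr == 0)   hit[0] = true;  // op on a null address
    else if (opcode == 3)           hit[1] = true;  // op, ordinary address
    else if (addr % 4096 == 4095)   hit[2] = true;  // op at a page boundary
    else                            hit[3] = true;  // everything else
}

int main(void) {
    srand(12345);
    for (int i = 0; i < 10000; i++) {
        // Constrained-random stimulus. The "knobs" are the opcode range,
        // the address range, and a deliberate bias toward page boundaries.
        int opcode = rand() % 8;
        int addr = rand() % 65536;
        if (rand() % 4 == 0)
            addr |= 0xFFF;  // bias knob: force a page-boundary offset
        run_one_op(opcode, addr);
    }
    // Coverage report. Coverpoint 0 will almost certainly be MISSED with
    // these knobs; turning up an "addr == 0" bias knob is how you'd fix it.
    for (int i = 0; i < NUM_COVERPOINTS; i++)
        printf("coverpoint %d: %s\n", i, hit[i] ? "hit" : "MISSED");
    return 0;
}
```

The MISSED coverpoint is the whole point of the report: it tells you where to go adjust the knobs (or write a directed test) before anybody signs off.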
Depending on manglement's trade-offs between quality and schedule, some coverage may be waived, but in any case, the coverage is going to be something like 99.5% or better before even bad manglement is going to sign off on releasing it to production.
Once there is enough confidence that the logic is correct, the code is run through a different compiler that converts it to gates and connections (netlist) that can be put on the chip. There is one main vendor for this compiler, and it's a reasonably safe guess that they use it. The list of gates that are available to use, OTOH, depends on the specific fab and process in which the chip is to be made. The input code and output gates are formally compared to verify that they are equivalent; I'd be guessing if I tried to say which tool they use for this.
There may or may not be additional simulation of the gate-level netlist to detect certain classes of bugs that don't show up in the Verilog code (particularly, related to power-saving modes), though this is painfully slow, and the industry has been trying to minimize the need for it. If so, they would use the same simulation tool they used for the original Verilog simulation.
The next step is called place-and-route (P&R), and I'm not even sure what tools are available for it; I have only a passing acquaintance with this (or any subsequent) part of the process. It's an iterative process, done in conjunction with static timing analysis (STA), which calculates how long it takes for a logic change to get from point A to point B. (Say you're designing a chip to run at 1 GHz; you have a bit less than 1 ns to get from A to B. STA tells you, for each A, whether the signal can get to each relevant B in time, and with how much time to spare; negative slack is bad.) You move gates and wires around until every path is fast enough; occasionally, this may require some changes to the logic. I know a little more about STA than P&R, but my knowledge is obsolete, so I couldn't guess what current-generation tool Intel uses.
Finally, when you're happy with everything, you write a check for $1M-ish and send the result of the P&R out to a "mask house" to have the manufacturing tooling made.
-
@sockpuppet7 said in Intel making us slow down:
I always felt like branch prediction and shit like that were ugly hacks the CPU does for performance. Probably worth it, but still ugly hacks.
I don't disagree. My expertise is definitely not CPU design — the most complex CPUs I've ever been directly involved with were simple 8-bit microcontrollers — and I don't understand how any of that stuff works, at least not at anything close to a detailed level.
-
I'm eagerly awaiting a profanity-laced rant from His Linusness about the horrific product AMD has shoveled on the world...
-
@izzion said in Intel making us slow down:
AMD Secure Processor
With a name like that, it was guaranteed to have some egregious security flaw.
-
@anonymous234 The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at and repair.
-
@hardwaregeek said in Intel making us slow down:
What happened here is probably a combination of nobody realizing there needed to be a coverpoint for "speculative execution causes a page fault," randomized testing not happening to generate that scenario, and/or not checking whether that scenario caused kernel memory to become visible to an unprivileged process because nobody thought to check for it.
The issue here is really insidious. The CPU does discard the data it speculatively read when raising the page fault, and they almost certainly did verify that. Semantically, it behaves absolutely correctly.
What they missed (and that is not surprising, since the L1 cache is that bit in the middle that magically makes things faster and is otherwise completely transparent and invisible to everybody, right?) is that somebody could recover the data by carefully observing the effect it had on the caches. Side-channel attacks are a pretty advanced topic even in security circles.
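To make that concrete, here's a minimal sketch of a flush+reload probe, the measurement half these attacks are built on. Assumptions galore: x86 with GCC/Clang intrinsics, and probe_array, the threshold, and the simulated victim access are all illustrative rather than taken from any published PoC.

```c
// Minimal flush+reload cache probe (x86, GCC/Clang).
// Recovers which cache line an earlier (in a real attack: speculative)
// access touched, by timing reloads. Only the side channel is shown here.
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define THRESHOLD 100  // cycles; machine-dependent, tune empirically

static uint8_t probe_array[256 * 4096];  // one page per possible byte value

static uint64_t time_access(volatile uint8_t *addr) {
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*addr;  // the load being timed
    return __rdtscp(&aux) - start;
}

int main(void) {
    // Touch each page so they're backed by distinct physical pages
    // (otherwise untouched .bss pages all alias the shared zero page).
    for (int i = 0; i < 256; i++)
        probe_array[i * 4096] = 1;

    // 1. Flush: evict every probe line from the cache.
    for (int i = 0; i < 256; i++)
        _mm_clflush(&probe_array[i * 4096]);
    _mm_mfence();

    // 2. "Victim" access. In Meltdown/Spectre this load happens
    //    speculatively, indexed by a secret byte; simulated here.
    volatile uint8_t sink = probe_array[42 * 4096];
    (void)sink;

    // 3. Reload: the one fast (still-cached) line reveals the index,
    //    i.e. the secret byte, even though the value itself was discarded.
    for (int i = 0; i < 256; i++)
        if (time_access(&probe_array[i * 4096]) < THRESHOLD)
            printf("fast line at index %d => leaked byte 0x%02x\n", i, i);
    return 0;
}
```

The discarded speculative load leaves exactly one of those 256 lines warm, and the timing loop reads the "discarded" byte right back out. That's the part nobody had a coverpoint for.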
-
@sockpuppet7 said in Intel making us slow down:
@hardwaregeek These last ones seem to be a class of bug nobody was paying attention to until now. I always felt like branch prediction and shit like that were ugly hacks the CPU does for performance. Probably worth it, but still ugly hacks.
I think the goal of Itanium was precisely to remove some complexity from the processor and put it back in the compiler. Too bad it didn't go as planned...
-
@izzion said in Intel making us slow down:
Im eagerly awaiting a profanity laced rant from His Linusness about the horrific product AMD has shoveled on the world...
Huh. Remember earlier in this thread when I said there were no salient CVEs against AMD's management plane yet, but that it was probably because attackers/researchers were just targeting the industry leader? Guess they're betting on AMD getting a lot more market share.
It sure as hell wasn't because AMD had better architecture review/a stronger security stance. No ASLR or stack protections in a negative ring? REALLY?!