WTF Bites


  • Banned

    @Zerosquare said in WTF Bites:

    @Gąska said in WTF Bites:

    The most critical question is: when was that?

    Not more than a few years ago, at least. I could ask him if he still does nowadays.

    Now you've got my attention. It goes against literally everything I've seen and read about in the last five years.

    Not to mention that there are lots of other targets besides x64 when you look outside of the desktop PC/server ecosystem.

    When you look at current-gen gaming, there really are no other targets besides x64. If you told the same story about any other architecture - like even last-gen consoles - I would never think about questioning you even for a moment. x64 is special because it had several orders of magnitude more man-hours put into optimizing compilers than pretty much any other architecture.

    On systems running on small batteries, you're not necessarily looking for the fastest way to do things, but for the most energy-efficient way. If you think "I'll let the compiler figure it out, I don't have to bother with this low-level stuff", you're going to be disappointed by the results.

    Never really thought of that. I admit I lack enough experience to make a definitive statement, but I can't really think of any scenario where optimizing for instruction cycles doesn't equal optimizing for power efficiency (when talking about the CPU alone, not any peripheral devices). Could you provide an example or two, out of pure curiosity?


  • Considered Harmful

    @HardwareGeek said in WTF Bites:

    @pie_flavor said in WTF Bites:

    @Cursorkeys said in WTF Bites:

    But there was gnashing of teeth over the inefficiency of not hand-crafting the ASM :)

    Those people are idiots. C code is probably better than hand-crafted ASM because people don't know how to be efficient.

    Did you even read the post you're replying to?

    No, why should I do that?

    @Cursorkeys wrote "I started my career ... the first C compilers". Yes, compilers are very good at generating well optimized code now. 30+ years ago, not so much.

    Really? How bad was it? I thought the point of C was that its structures efficiently map to machine code.


  • 🚽 Regular

    @HardwareGeek said in WTF Bites:

    Is that what causes that message?

    Unfortunately a whole bunch of things. In my experience it's just best to rejoin the domain.

    @HardwareGeek said in WTF Bites:

    The client for whom I'm working has a bunch of machines in another building across town that are currently unusable because of that, but nobody has access to them.

    I guess you mean they can't logon as a local admin? In which case just enabling the built-in Administrator account with a blank password is the easiest course of action. I like this for doing that task, quick and reliable: http://pogostick.net/~pnh/ntpasswd/


  • Banned

    @pie_flavor said in WTF Bites:

    @HardwareGeek said in WTF Bites:

    @Cursorkeys wrote "I started my career ... the first C compilers". Yes, compilers are very good at generating well optimized code now. 30+ years ago, not so much.

    Really? How bad was it? I thought the point of C was that its structures efficiently map to machine code.

    Just mapping to machine code doesn't give you efficiency. As for how bad it was, try running some benchmarks with the -O0 flag.



  • @Cursorkeys said in WTF Bites:

    I guess you mean they can't logon as a local admin?

    I mean can't connect at all over the network, AFAIK, and no physical access to the building. One guy does, but he's too busy to drive across town and "babysit" machines we don't have an urgent need for.


  • 🚽 Regular

    @HardwareGeek said in WTF Bites:

    @Cursorkeys said in WTF Bites:

    I guess you mean they can't logon as a local admin?

    I mean can't connect at all over the network, AFAIK, and no physical access to the building. One guy does, but he's too busy to drive across town and "babysit" machines we don't have an urgent need for.

    Bugger, this would be much easier with physical access.
    If you do have a local admin account available then you can rejoin them to the domain with a batch file and psexec. Or use psexec to install VNC or something.

    If you don't have a local admin account already available then I can't think of anything at the moment...


  • Discourse touched me in a no-no place

    @Gąska said in WTF Bites:

    Never really thought of that. I admit I lack enough experience to make a definitive statement, but I can't really think of any scenario where optimizing for instruction cycles doesn't equal optimizing for power efficiency (when talking about the CPU alone, not any peripheral devices). Could you provide an example or two, out of pure curiosity?

    Register-register moves are cheaper (fewer gates switched) than memory-to-register moves. Also, memory-to-register moves may take longer; the time cost there may be defrayed by instruction interleaving, but compilers have very… mixed… success at that. (For sure, gcc is suboptimal at it. It finds ways to make operations that can be done in 6 cycles take 30 cycles or more; we've measured that, and for a critical inner loop in an interrupt handler, that's bad. Armcc is quite a bit better, but it's a deployment PITA due to the license server requirements and is very much not something our users have.)

    Accurate energy measurements are difficult to get, BTW. Sampling problems can really screw things up.
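
    To make it concrete, here's a toy sketch (made-up code, not ours) of the kind of load gcc can fail to hoist out of a hot loop -- here because the store might alias the scale factor, so the compiler has to re-read memory every iteration unless you hoist it by hand:

        #include <stdint.h>

        /* Naive version: because buf[i] might alias *scale (both are
         * uint32_t), the compiler must re-load *scale every iteration --
         * extra memory traffic, extra gates switched. */
        void scale_slow(uint32_t *buf, int n, const uint32_t *scale)
        {
            for (int i = 0; i < n; i++)
                buf[i] = buf[i] * (*scale);
        }

        /* Hoisted version: one load, then the inner loop is
         * register-register work only. */
        void scale_fast(uint32_t *buf, int n, const uint32_t *scale)
        {
            uint32_t s = *scale;
            for (int i = 0; i < n; i++)
                buf[i] = buf[i] * s;
        }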



  • @Gąska said in WTF Bites:

    x64 is special because it had several orders of magnitude more man-hours put into optimizing compilers than pretty much any other architecture.

    The special x64 features also have fairly good support via intrinsics. So one can avoid inline asm and let the compiler deal with tedious stuff like register allocation and the final scheduling of the individual instructions. It gets it mostly right, at least if you massage the code with the intrinsics a bit.
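
    For example, something along these lines (a rough sketch, not from a real codebase) -- you spell out the SSE operations and leave registers and scheduling to the compiler:

        #include <immintrin.h>
        #include <stddef.h>

        /* Sketch: sum an array of floats 4 at a time with SSE intrinsics.
         * Assumes n is a multiple of 4 to keep the example short. */
        float sum_sse(const float *a, size_t n)
        {
            __m128 acc = _mm_setzero_ps();
            for (size_t i = 0; i < n; i += 4)
                acc = _mm_add_ps(acc, _mm_loadu_ps(a + i));

            float tmp[4];
            _mm_storeu_ps(tmp, acc);
            return tmp[0] + tmp[1] + tmp[2] + tmp[3];
        }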

    The last time I could justify using inline ASM was for doing Kahan summation. The problem was that the compiler would optimize it away under -ffast-math, which we did want for the rest of the code (and you can't change the optimization settings inside a function).

    (FWIW: clang does understand and will optimize your inline assembler if you don't mark it with volatile.)
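
    For reference, the textbook form of what I mean (generic sketch, not our actual code) -- under -ffast-math the compiler is allowed to reassociate the arithmetic, notice that (t - sum) - y is "zero", and quietly turn this back into a naive sum:

        #include <stddef.h>

        /* Kahan (compensated) summation. The compensation c recovers the
         * low-order bits lost in sum + y -- exactly the part that
         * -ffast-math is allowed to optimize away. */
        double kahan_sum(const double *a, size_t n)
        {
            double sum = 0.0, c = 0.0;
            for (size_t i = 0; i < n; i++) {
                double y = a[i] - c;
                double t = sum + y;
                c = (t - sum) - y;
                sum = t;
            }
            return sum;
        }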

    Never really thought of that. I admit I lack enough experience to make a definitive statement, but I can't really think of any scenario where optimizing for instruction cycles doesn't equal optimizing for power efficiency (when talking about the CPU alone, not any peripheral devices). Could you provide an example or two, out of pure curiosity?

    Hmm, that's actually an interesting question. Optimizing for power consumption is tricky. There are instructions that are more power hungry than others. There's apparently an interesting thing with Intel CPUs -- it seems some AVX2/AVX512 instructions cause the CPU to cap its own maximum frequency due to higher power consumption (= more heat). I can't find the original article where I read about it, but there's a post by the Cloudflare guys regarding this. The question is whether avoiding those instructions lowers the overall power consumption for the whole task (since it may now take more instructions to complete).

    Keeping your CPU at a lower frequency will also help with lowering power consumption. But that doesn't really answer your question either, since the total number of cycles stays the same; they're just spread out over a longer time.

    Do you consider RAM a peripheral device? If not, then avoiding hitting RAM is very likely worth paying more CPU cycles for. I think that extends to caches as well, although the difference will be much smaller (no external bus to go across).
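
    As a made-up illustration of that trade-off: packing flags eight to a byte costs a shift and a mask per access, but touches an eighth of the memory, which is usually a win if it keeps the working set out of RAM (or in a smaller slice of cache).

        #include <stdint.h>
        #include <stddef.h>

        /* Unpacked: simple, but 1 byte of memory traffic per flag. */
        static inline int flag_get_unpacked(const uint8_t *flags, size_t i)
        {
            return flags[i];
        }

        /* Packed: extra shift + mask, but 1/8th of the memory traffic. */
        static inline int flag_get_packed(const uint8_t *bits, size_t i)
        {
            return (bits[i >> 3] >> (i & 7u)) & 1u;
        }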


  • Discourse touched me in a no-no place

    @cvi said in WTF Bites:

    Optimizing for power consumption is tricky.

    It's the sort of thing you don't normally bother with for desktop systems, but which is super-important for embedded systems and supercomputers: the former because you've not got much power available at all, and the latter because overall power consumption (and heat generation, which is basically the same thing) is the usual real limiting factor. Every milliwatt in has to go somewhere, and that somewhere is heat.



    @dkf Yeah, true. Although there is some vague push for it in the desktop space, coming there via a detour through smartphones, tablets and other "lighter" devices. In some cases, it's not so much about saving power overall, but rather about not hitting thermal limits (e.g., for some ARM-based devices you're guaranteed to run into a bunch of thermal throttling if you try to run the CPU and GPU at full tilt at the same time).

    I've not actually heard the software side of "supercomputing" talking too much about power efficiency, but that's maybe mostly because the projects that I've been in have had ... more immediate problems. On the hardware side, they're definitely talking about it, but to me it seems that that's still mostly at the level where e.g. using GPUs (or GPU-like stuff such as the KNL et al.) over CPUs will give you more flops per watt. (OK, I know you're on some super fancy specialized hardware, and a few others have that as well, but us normal mortals have a bit less choice on that end.)



  • @Gąska said in WTF Bites:

    Now you've got my attention. It goes against literally everything I've seen and read about in the last five years.

    I'll ask him for details. When he told me about that, the PS3 was still current-gen, but I believe he did the same thing for then-current x64 systems as well.

    @Gąska said in WTF Bites:

    Never really thought of that. I admit I lack enough experience to make a definitive statement, but I can't really think of any scenario where optimizing for instruction cycles doesn't equal optimizing for power efficiency (when talking about the CPU alone, not any peripheral devices). Could you provide an example or two, out of pure curiosity?

    It's hard to give specific examples, because a lot of code in embedded devices is not "pure CPU"; typically you offload as much as you can to on-chip peripherals, because it's usually both faster and more power-efficient.

    It's also hard because power usage is rarely (if ever) documented in detail, and measuring it with instruction- or cycle-granularity is far from trivial.

    But here's a potential example: some CPUs are now fast enough that caching the results of simple operations (like multiplications) can be counter-productive due to memory latency. But even if redoing the calculations every time you need the result is faster, I wouldn't be surprised if it's also more power-intensive.
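
    Hypothetically, something like this (made-up example) -- recomputing keeps the multiplier busy, while the table trades that for a memory access; which one is cheaper in energy is exactly the kind of thing you'd have to measure:

        #include <stdint.h>

        static uint16_t square_lut[256];

        void square_lut_init(void)
        {
            for (int i = 0; i < 256; i++)
                square_lut[i] = (uint16_t)(i * i);
        }

        /* Recompute: a multiply every time -- fast, but it lights up the
         * multiplier on every call. */
        static inline uint16_t square_recompute(uint8_t x)
        {
            return (uint16_t)((unsigned)x * x);
        }

        /* Cached: a load every time -- possibly slower if the line isn't
         * in cache, but a different (and maybe smaller) energy cost. */
        static inline uint16_t square_lookup(uint8_t x)
        {
            return square_lut[x];
        }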


  • Banned

    @dkf said in WTF Bites:

    @Gąska said in WTF Bites:

    Never really thought of that. I admit I lack enough experience to make a definitive statement, but I can't really think of any scenario where optimizing for instruction cycles doesn't equal optimizing for power efficiency (when talking about the CPU alone, not any peripheral devices). Could you provide an example or two, out of pure curiosity?

    Register-register moves are cheaper (fewer gates switched) than memory-to-register moves. Also, memory-to-register moves may take longer; the time cost there may be defrayed by instruction interleaving, but compilers have very… mixed… success at that. (For sure, gcc is suboptimal at it. It finds ways to make operations that can be done in 6 cycles take 30 cycles or more; we've measured that, and for a critical inner loop in an interrupt handler, that's bad. Armcc is quite a bit better, but it's a deployment PITA due to the license server requirements and is very much not something our users have.)

    Yeah, but it doesn't really answer my question of how more clock cycles can lead to less energy consumption.

    Accurate energy measurements are difficult to get, BTW. Sampling problems can really screw things up.

    That too. Which makes it even more interesting that someone might optimize for energy rather than cycles.

    @cvi said in WTF Bites:

    The last time I could justify using inline ASM was for doing Kahan summation. The problem was that the compiler would optimize it away under -ffast-math, which we did want for the rest of the code (and you can't change the optimization settings inside a function).

    Hm... couldn't you extract this function to a separate file, and put different settings for it in the makefile? Anyway, you'll probably find it interesting that GCC 4.4 added exactly this feature in the form of #pragma GCC optimize(...) and __attribute__((optimize(...))).
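
    Rough sketch of the syntax (my guess at how you'd apply it here, so treat it as illustrative):

        #include <stddef.h>

        /* GCC 4.4+: per-function optimization override. String arguments
         * are -f options without the "-f" prefix. How completely it undoes
         * -ffast-math in practice is worth double-checking against the
         * generated code. */
        __attribute__((optimize("no-fast-math")))
        double sum_carefully(const double *a, size_t n)
        {
            double s = 0.0;
            for (size_t i = 0; i < n; i++)
                s += a[i];
            return s;
        }

        /* The pragma form applies to every function defined after it: */
        #pragma GCC optimize ("no-fast-math")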

    (FWIW: clang does understand and will optimize your inline assembler if you don't mark it with volatile.)

    Hm. Interesting. Sounds like a bit of a misfeature?

    Never really thought of that. I admit I lack enough experience to make a definitive statement, but I can't really think of any scenario where optimizing for instruction cycles doesn't equal optimizing for power efficiency (when talking about the CPU alone, not any peripheral devices). Could you provide an example or two, out of pure curiosity?

    Hmm, that's actually an interesting question. Optimizing for power consumption is tricky. There are instructions that are more power hungry than others. There's apparently an interesting thing with Intel CPUs -- it seems some AVX2/AVX512 instructions cause the CPU to cap its own maximum frequency due to higher power consumption (= more heat). I can't find the original article where I read about it, but there's a post by the Cloudflare guys regarding this. The question is whether avoiding those instructions lowers the overall power consumption for the whole task (since it may now take more instructions to complete).

    That's all pretty interesting. I'll be sure to read that article when I get some time.



  • @Gąska said in WTF Bites:

    Hm... couldn't you extract this function to a separate file, and put different settings for it in the makefile? Anyway, you'll probably find it interesting that GCC 4.4 added exactly this feature in the form of #pragma GCC optimize(...) and __attribute__((optimize(...))).

    Yeah, considered that. For one, I wanted to keep the sum together with the rest of the function (it's computing stuff and summing it on the fly; the sum is only a small part of the whole computation). I don't think you're allowed to use the pragma inside a function, and the attribute goes on a whole function either way.

    Might be fixable with some combination of inlining and pragma/attribute, but in the end, it was only a handful of instructions and actually ended up being somewhat portable (GCC, clang and icc all support the same asm syntax).
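
    Roughly this kind of thing (reconstructed from memory, so hypothetical, not the real code): an empty volatile asm that tells the compiler the value might have changed, so it can't reassociate the compensation away even under -ffast-math. The same extended-asm syntax works in GCC, clang and icc.

        /* "+m" pins the value in memory across the barrier, which is the
         * portable-but-heavier option; on x86 "+x" keeps it in an SSE
         * register instead. */
        #define OPT_BARRIER(v) __asm__ __volatile__("" : "+m"(v))

        static inline void kahan_step(double x, double *sum, double *c)
        {
            double y = x - *c;
            double t = *sum + y;
            OPT_BARRIER(t);         /* compiler can no longer assume t == *sum + y */
            *c = (t - *sum) - y;
            *sum = t;
        }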


  • Notification Spam Recipient

    @Cursorkeys said in WTF Bites:

    If you don't have a local admin account already available then I can't think of anything at the moment...

    You need more practice, hacker friend...



  • @Gąska said in WTF Bites:

    Yeah, but it doesn't really answer my question of how more clock cycles can lead to less energy consumption

    The AVX-512 example is perfect. It's current-gen, and 'easily' testable.
    I don't know how to explain it. I'll try a car analogy for fun.
    Say you have two engines. One can output 1 hp (x64), the other 2 hp (AVX-512).
    When you use the 2 hp engine you obviously get done faster, but wait! You actually have 4 engines next to each other.
    But you still have a power budget of 4 hp (4×1). This means you can only run two of the 2 hp engines simultaneously. Worse, because electronics scale heat/frequency/voltage exponentially, you have less than 4×1 available when you use the 'powerful' engines (because they heat up more).
    Say a 1 hp engine wastes 10%. Then a 2 hp engine might waste 30 or 50%.
    I probably fucked the analogy totally, but I had fun writing it. So it stays.



  • 0_1538092150894_f01846bb-606c-41ac-a865-997599becd07-image.png


  • Notification Spam Recipient

    @hungrier said in WTF Bites:

    0_1538092150894_f01846bb-606c-41ac-a865-997599becd07-image.png

    YouTube thinks you really don't want to miss them.


  • Discourse touched me in a no-no place

    @cvi said in WTF Bites:

    (OK, I know you're on some super fancy specialized hardware, and a few others have that as well, but us normal mortals have a bit less choice on that end.)

    We're a weird hybrid between embedded and supercomputer. This lets us work around a whole boatload of conventional hard problems handily, but introduces a bunch of other difficulties instead. 😆

    In practice, we mostly care about keeping the cycle count down in key inner loops as that has very large impacts on how things scale; there's a core loop where taking a month to cut it by even half a cpu cycle per iteration is worthwhile. We also care a lot about getting the numerics as right as possible with as large a step size as possible, as that enables realtime operation. Low power just pops out of how the hardware architecture works, and we're most efficient the more we can pack things together; that overwhelms all other considerations in all non-trivial cases. (The trivial cases use so little power that nobody cares. Except when trying to measure for publications and crap like that.)


  • Discourse touched me in a no-no place

    @Gąska said in WTF Bites:

    Yeah, but it doesn't really answer my question of how more clock cycles can lead to less energy consumption.

    Power consumption basically tracks the number of times a gate switches state (as the gates are similar capacitances and the voltages are constant, to a first approximation). That's not fully true, but it does give you the right sort of scaling factors. Now remember that chips mostly don't switch gates that they don't need to, especially ones for complex instructions.
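
    In first-order terms (the usual dynamic-power model, ignoring leakage):

        E_{\text{switch}} \approx \tfrac{1}{2}\, C_{\text{gate}}\, V_{\text{dd}}^{2}
        \qquad\Rightarrow\qquad
        E_{\text{task}} \approx N_{\text{transitions}} \cdot \tfrac{1}{2}\, C_{\text{gate}}\, V_{\text{dd}}^{2}

    So, to that approximation, the energy a task costs at a fixed supply voltage tracks how many gates you toggle, not how many clock cycles you spend.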


  • Notification Spam Recipient

    https://imgur.com/gallery/bQcIfpV

    I wish I still had that commercial...


  • BINNED

    Code, I know you're trying to help, but now you're just bullshitting me:

    0_1538122676047_463b253a-7edc-469a-9aaf-7fc5c5a8dbd5-image.png

    Really? It does? Okay, sure, it's not as if this is just something I tacked onto the end of my manifest.json file to disable it temporarily for testing purposes...


  • Discourse touched me in a no-no place

    @Onyx The Marketplace has opportunities for all files.


  • Considered Harmful

    @Onyx A homework assignment was to make a FASTQ reader and FASTA writer. Apparently there's an extension for those too.


  • Java Dev

    An inadequate flow of blood to a part of the body may be caused by any of the following:

    • ...
    • Tourniquet application

    Who would have thought that a device made for the express purpose of disabling blood flow to a limb would cause inadequate blood flow!


  • Banned

    @Atazhaia sometimes it reduces the overly high blood flow to adequate levels. Though that usually happens when the limb is already missing.


  • Discourse touched me in a no-no place

    Amazon seems to have automatically created me some virtual Dash Buttons I can use on its website to reorder my favourite products.

    I've never really understood the benefit of them at all, as it takes like a minute max to order stuff from Amazon if you already know what you want - but that's beside the point here.
    Most of the buttons it created for me are for things I might reorder - engine oil, AA batteries, etc. - but one of them, when clicked, will order this:
    0_1538144754610_93b8385d-11c7-456f-994e-a920fa099c1e-image.png

    How many fucking times does it think I'll need to reorder a 32mm impact socket for this to be of any use?



  • @loopback0 It once auto-created a Dash button for a benchtop DC power supply I was looking at. With the option to auto-purchase one every month. That didn't really inspire me with confidence in the quality or longevity of the power supply.



  • @loopback0 said in WTF Bites:

    How many fucking times does it think I'll need to reorder a 32mm impact socket for this to be of any use?

    Depends if it's made in China 🐠



  • 0_1538148976046_926ea399-6ce9-46b0-b9e1-bd2c0d7ee0ad-image.png


  • Banned

    @ben_lubar your IT administrator really wants you to make this scan.


  • Notification Spam Recipient

    @loopback0 said in WTF Bites:

    fucking times

    I was going to make a :giggity: comment but it was getting a little too repetitive...


  • Considered Harmful

    Update the list of voters, but not the number of voters. Yeah, that makes tons of sense.
    https://i.imgur.com/hyiWeQ7.png
    https://i.imgur.com/pd4u3Pd.png
    https://i.imgur.com/jgepDoX.png


  • ♿ (Parody)

    @pie_flavor said in WTF Bites:

    Update the list of voters, but not the number of voters. Yeah, that makes tons of sense.

    Ben broke it.



  • @boomzilla said in WTF Bites:

    @pie_flavor said in WTF Bites:

    Update the list of voters, but not the number of voters. Yeah, that makes tons of sense.

    Ben broke it.

    accurate.

    I should figure out what the hell is going on with my pubsub implementation.



  • https://www.nytimes.com/2018/09/28/technology/facebook-hack-data-breach.html
    🍿



  • @Gąska said in WTF Bites:

    Go on, try writing a PR for GNU Make. I dare you.


    This anecdote is not about the GNU flavor of make, but I love it anyhow:

    Way back in the early 1980s, before each of the bugs in Unix had such a large cult following, a programmer at BBN actually fixed the bug in Berkeley’s make that requires starting rule lines with tab characters instead of any whitespace. It wasn’t a hard fix—just a few lines of code.
    Like any group of responsible citizens, the hackers at BBN sent the patch back to Berkeley so the fix could be incorporated into the master Unix sources. A year later, Berkeley released a new version of Unix with the make bug still there. The BBN hackers fixed the bug a second time, and once again sent the patch back to Berkeley.
    …The third time that Berkeley released a version of make with the same bug present, the hackers at BBN gave up. Instead of fixing the bug in Berkeley make, they went through all of their Makefiles, found the lines that began with spaces, and turned the spaces into tabs. After all, BBN was paying them to write new programs, not to fix the same old bugs over and over again.
    (According to legend, Stu Feldman didn’t fix make’s syntax, after he realized that the syntax was broken, because he already had 10 users.)

    Source: The Unix-Haters Handbook



  • @boomzilla said in WTF Bites:

    @pie_flavor said in WTF Bites:

    Update the list of voters, but not the number of voters. Yeah, that makes tons of sense.

    Ben broke it.

    Should be fixed now.


  • Notification Spam Recipient

    @ben_lubar said in WTF Bites:

    now.

    Alright, Upvote me in two minutes.


  • BINNED

    @Zerosquare said in WTF Bites:

    0_1538169174893_canvas.png

    Really? Who'da thunk it.


  • Notification Spam Recipient

    @Tsaukpaetra said in WTF Bites:

    @ben_lubar said in WTF Bites:

    now.

    Alright, Upvote me in two minutes.

    Nope...


  • Considered Harmful

    @ben_lubar said in WTF Bites:

    @boomzilla said in WTF Bites:

    @pie_flavor said in WTF Bites:

    Update the list of voters, but not the number of voters. Yeah, that makes tons of sense.

    Ben broke it.

    Should be fixed now.

    https://giphy.com/gifs/5hvDHuJ6uH6mgvSJZ5


  • Considered Harmful

    It just keeps going, too.
    https://i.imgur.com/8O7rzaP.png



  • @pie_flavor Meh. That's a clock synchronization issue between your device and the server. The time difference is calculated on the client, and if the client device's clock is slow, the timestamp from the server may be in the future from the client's perspective.



  • @pie_flavor said in WTF Bites:

    Update the list of voters, but not the number of voters. Yeah, that makes tons of sense.
    https://i.imgur.com/hyiWeQ7.png
    https://i.imgur.com/pd4u3Pd.png
    https://i.imgur.com/jgepDoX.png

    The list of voters is fetched when you mouse over it. It doesn't update the count at the same time.

    The count should update live, when it changes, but it doesn't. But post and notification streaming are all broken too, so at least it's consistently broken.



  • @anotherusername Better than Discoursistently broken.



  • @Zerosquare said in WTF Bites:

    Facebook admits using two-factor phone numbers to target ads

    Took them long enough…



  • @DCoder said in WTF Bites:

    Facebook is in the news for being shit again…

    Adding fuel to the fire:

    EFF: Facebook Data Breach Affects At Least 50 Million Users

    NYT: Facebook Security Breach Exposes Accounts of 50 Million Users

    Our investigation is still in its early stages. But it’s clear that attackers exploited a vulnerability in Facebook’s code that impacted “View As”, a feature that lets people see what their own profile looks like to someone else. This allowed them to steal Facebook access tokens which they could then use to take over people’s accounts.

    First: View As is a privacy feature that lets people see what their own profile looks like to someone else. View As should be a view-only interface. However, for one type of composer (the box that lets you post content to Facebook) — specifically the version that enables people to wish their friends happy birthday — View As incorrectly provided the opportunity to post a video.

    Second: A new version of our video uploader (the interface that would be presented as a result of the first bug), introduced in July 2017, incorrectly generated an access token that had the permissions of the Facebook mobile app.

    Third: When the video uploader appeared as part of View As, it generated the access token not for you as the viewer, but for the user that you were looking up.



  • ♪ Here come the vultures… ♫

    @DCoder said in WTF Bites:

    Facebook Data Breach Affects At Least 50 Million Users

    Facebook Sued in California Over Hack of 50 Million Accounts

    Facebook Inc. was sued by users of the social network over claims that it negligently allowed hackers to compromise as many as 50 million accounts.

    The class-action complaint was filed Friday in federal court in Northern California within hours of Facebook’s statement saying it has fixed the breach.



  • @Gąska said in WTF Bites:

    @ben_lubar your IT administrator really wants you to make this scan.

    0_1538205340586_ff20cf4e-7850-4dd5-b199-31832f80a9c4-image.png

