WTF Bites


  • BINNED

    @ben_lubar said in WTF Bites:

    @dkf said in WTF Bites:

    The result is quite efficient

    I dunno, I could probably make this more efficient.

    01229841-5a39-4733-b566-ca0565d879c2-image.png

    Looks like the compiler (conformingly) assumes the uint32_t* storage is actually a valid pointer to uint32_t, i.e. 4 byte aligned. The if branch is assumed to never be taken or equivalent to the else branch anyway. The bug is in the function signature.




  • BINNED

    @Zerosquare said in WTF Bites:

    Where’s Blakey when you need him to tell us how we dirty Yuropoorans are just suing Google because we haven’t come up with a Google ourselves?!

    As we clearly state each time you open a new incognito tab, websites might be able to collect information about your browsing activity.

    So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.


  • Discourse touched me in a no-no place

    @topspin said in WTF Bites:

    So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.

    Sure but how is a Google (or any other data mining) service to know that someone's even in Private™ mode?

    Don't get me wrong, I'm not defending Google and their data hoovering, but picking specifically on the Private™ browsing scenario and complaining it's not actually private is a bit stupid.
    All it does is stop the browser tracking it, as also stated in superiorless terrible browsers.


  • :belt_onion:

    Holy Shit! RealPlayer is still a thing

    realplayer.png


  • Discourse touched me in a no-no place

    @El_Heffe said in WTF Bites:

    RealPlayer



  • @loopback0 When you miss the days when you had to wait 20 minutes of buffering before you could watch that 30-second video 🍹


  • Discourse touched me in a no-no place

    @bobjanova said in WTF Bites:

    Surely TRWTF with that is that "storage[0] = value" doesn't do what you expect just because storage isn't aligned. If that's not a valid uint32* then it shouldn't be possible to use that value in the first place. (What does it do, by the way?)

    The problem is that there are several different notions of validity and at least GCC fucks things up sometimes. However, while experimenting with this further today I found a better solution:

    static void set_value(void *storage, uint32_t index, uint32_t value) {
        uint32_t *ptr = __builtin_assume_aligned(storage, 4, 2);
        ptr[index] = value;
    }
    

    That works (the declared type of storage isn't important here, FWIW) and uses two half-word aligned instructions to do the write. It in fact generates better code than just slapping the packed attribute on an underlying structure, which tends to force all accesses to be byte-oriented (on ARM of the revisions I'm dealing with; x86 and ia64 have always had the extra hardware to do unaligned accesses in a transparent way) and that's just horrible for our architecture. Also adding a suitable aligned attribute to the structure fixes the problem. Alas, the damn message format I'm dealing with doesn't really describe as a structure in the first place (it has about 16 sub-formats each with slightly different alignment requirements, but I can at least guarantee that everything is half-word aligned).

    It's all quite ghastly (and of course the protocol concerned has a cutesy annoying name) but now I can make it ghastly and short. Chiselling a few bytes out of our TEXT section is a very big deal for us, given how close we push to the edge of what the hardware can do. (Before anyone asks, we've got a lot more space for data, and we can parallelize to deal with running out of that.)
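
    For anyone following along: as far as the GCC documentation goes, the third argument to __builtin_assume_aligned() is a misalignment offset, so the call above promises that storage sits exactly 2 bytes past a 4-byte boundary. The "two half-word aligned instructions" then amount to something like the hand-written sketch below. It assumes a little-endian target, and set_value_halfwords is a made-up name for illustration, not part of the real code.

```c
#include <stdint.h>

/* Hypothetical hand-written equivalent of the two-half-word store.
 * Assumptions: little-endian target, `storage` at least 2-byte aligned.
 * `index` counts 32-bit slots, the same way set_value() does. */
void set_value_halfwords(uint16_t *storage, uint32_t index, uint32_t value) {
    uint16_t *slot = storage + 2u * index;  /* two half-words per 32-bit slot */
    slot[0] = (uint16_t)(value & 0xFFFFu);  /* low half first (little-endian) */
    slot[1] = (uint16_t)(value >> 16);      /* then the high half */
}
```

    With a uint16_t buffer, set_value_halfwords(buf + 1, 1, 0xDEADBEEF) leaves 0xBEEF in buf[3] and 0xDEAD in buf[4]. Whether this beats the builtin on a given core is something only the generated assembly can answer.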


  • Discourse touched me in a no-no place

    @topspin said in WTF Bites:

    The bug is in the function signature.

    The problem is that there's no way to express the alignment constraints known about the type. If it was possible to do that, it'd be an ideal way of handling this. Instead, gcc makes alignment into this sort of weird adjunct to the values themselves, guided by the type but in a way that it's hard to do good things with. I was pleased to find __builtin_assume_aligned() today though. It's an idiotic way of doing it, but it actually works…


  • BINNED

    @dkf would memcpy work (produce the desired assembly)? Yeah, I know that’s horribly slow, but the compiler should know to optimize that out and just use it as a (the only?) official way to do this. If you use an int16_t* storage for the destination of the 4 byte write it should know both the actual alignment and size. Not that it’d be shorter, anyway.
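
    Roughly what I have in mind, as a sketch only: set_value_memcpy is a hypothetical name, and whether the compiler actually turns the call into plain stores depends on the target and compiler version.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical memcpy-based variant: the int16_t* parameter carries the
 * half-word alignment, and sizeof value carries the 4-byte size.
 * `index` counts 32-bit slots, i.e. two int16_t elements each. */
void set_value_memcpy(int16_t *storage, uint32_t index, uint32_t value) {
    memcpy(storage + 2u * index, &value, sizeof value);
}
```

    The behaviour is well defined for any half-word-aligned destination; the open question is only what code comes out the other end.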


  • Discourse touched me in a no-no place

    @topspin said in WTF Bites:

    would memcpy work (produce the desired assembly)?

    No, alas. The problem is (in the real case at least) that it gets confused by the whole thing and falls back on runtime probing of the alignment and a bytewise copy, which is not inlined at all either (I have no idea why). It might do better with gcc 9, but many of our users won't be using that yet. The result ends up imposing quite a performance overhead and a size overhead too that's unacceptable for us given how little headroom we've got left.

    Part of the problem is that the implementation of memcpy() in question seems to be written for slightly different use cases. It does a fantastic job when accesses are aligned and everything is happy, generating very short and fast instruction sequences that I would never have thought of, but the unaligned case is nothing like as good. It works, of course, but the way it does it isn't brilliant at all.



  • @dkf said in WTF Bites:

    it isn't brilliant at all.

    Would you say it's brillant?


  • Fake News

    @HardwareGeek I would imagine it's not, because he said it did work, unlike the Paula Bean...



  • @JBert Valid point.



  • @TimeBandit said in WTF Bites:

    @loopback0 When you miss the days when you had to wait 20 minutes of buffering before you could watch that 30-second video 🍹

    3a642f62-b5fa-4d58-b0f6-43286222b88e-image.png


  • Notification Spam Recipient

    Status: This feels so wrong but it's the only way I've been able to get the fucking .Net Base64 decoder to accept my strings...

    string datastring = s[1];
    datastring += "==".Substring(0, datastring.Length % 4 % 2);
    datastring += "==".Substring(0, datastring.Length % 4);
    string result = System.Text.Encoding.UTF8.GetString(System.Convert.FromBase64String(datastring));
    

    I don't know wtf is going on but hey, it works?

    Edit: Wait, no it doesn't.

    For fuck's sake...

    Edit 2: This one works?

    Mind slowly exploding. Shirley there must be a better way to do this...
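
    For what it's worth, the two-step version works because the first line adds one "=" when the length is odd and the second adds "==" when the remainder is then 2. The usual way to say the same thing is "pad to the next multiple of 4", i.e. (4 - len % 4) % 4 characters. A minimal sketch of that arithmetic, in C to match the rest of the thread; both function names are made up, and it assumes the input is ordinary Base64 with only the trailing padding stripped.

```c
#include <stddef.h>
#include <string.h>

/* Number of '=' characters needed to bring an unpadded Base64 string of
 * length `len` up to a multiple of 4. A remainder of 1 can never be
 * valid Base64, so that case is reported as -1. */
int base64_padding_needed(size_t len) {
    size_t rem = len % 4;
    if (rem == 1)
        return -1;
    return (int)((4 - rem) % 4);
}

/* Pads the NUL-terminated string in `buf` in place. `cap` is the total
 * size of the buffer. Returns the number of '=' added, or -1 if the
 * length is invalid or the padded string would not fit. */
int base64_pad(char *buf, size_t cap) {
    size_t len = strlen(buf);
    int pad = base64_padding_needed(len);
    if (pad < 0 || len + (size_t)pad + 1 > cap)
        return -1;
    memset(buf + len, '=', (size_t)pad);
    buf[len + (size_t)pad] = '\0';
    return pad;
}
```

    So "aGk" becomes "aGk=" and "aA" becomes "aA==". The same one-liner should port straight back to the C# string, rejecting the length-1-remainder case instead of quietly padding it.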



  • @topspin said in WTF Bites:

    @ben_lubar said in WTF Bites:

    @dkf said in WTF Bites:

    The result is quite efficient

    I dunno, I could probably make this more efficient.

    01229841-5a39-4733-b566-ca0565d879c2-image.png

    Looks like the compiler (conformingly) assumes the uint32_t* storage is actually a valid pointer to uint32_t, i.e. 4 byte aligned. The if branch is assumed to never be taken or equivalent to the else branch anyway. The bug is in the function signature.

    Nope, x86 allows unaligned stores



  • @topspin said in WTF Bites:

    @Zerosquare said in WTF Bites:

    Where’s Blakey when you need him to tell us how we dirty Yuropoorans are just suing Google because we haven’t come up with a Google ourselves?!

    As we clearly state each time you open a new incognito tab, websites might be able to collect information about your browsing activity.

    So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.

    Their browser also sends a unique 60-or-so character base64 blob to their servers with every single request.


  • Java Dev

    @dkf said in WTF Bites:

    @topspin said in WTF Bites:

    The bug is in the function signature.

    I was pleased to find __builtin_assume_aligned() today though. It's an idiotic way of doing it, but it actually works…

    For a function argument, __attribute__((aligned(2))) would make more sense to me. Guess that doesn't work if the pointer value arises inside a function though.


  • Discourse touched me in a no-no place

    @PleegWat said in WTF Bites:

    @dkf said in WTF Bites:

    @topspin said in WTF Bites:

    The bug is in the function signature.

    I was pleased to find __builtin_assume_aligned() today though. It's an idiotic way of doing it, but it actually works…

    For a function argument, __attribute__((aligned(2))) would make more sense to me. Guess that doesn't work if the pointer value arises inside a function though.

    Worse. It's actually a global variable, and the accumulation of bits to send is done in callbacks from various interrupts. Attaching the alignment to the type would have been perfect. (I did try using a uint16_t* as the argument and just casting to uint32_t* inside the writer function. No dice without the magic incantation; it instead tried to propagate the overalignment assumption outwards, which was precisely backwards.)


  • Java Dev

    @dkf Checking: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes

    GCC does have an __attribute__((aligned)) on parameters and variables, but it does something completely different; it instructs how the variable itself should be aligned rather than generating assumptions about pointer targets.

    It also has the same attribute on types, which sounds closer to what you're looking for: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#Common-Type-Attributes



  • @ben_lubar Original code was for ARM, though.

    FWIW, I did try using memcpy() in the original compiler explorer link. Results for x86 are as expected: a single mov, since GCC (and most other compilers) very aggressively inline memcpy() (which is more like an intrinsic anyway, at least for x86). They don't do that on ARM for some reason -- I couldn't make it inline the copy even after playing around for a while.

    Compiler explorer link. But, you're right, the x86 code has this weird jump that doesn't do anything. Compiler bug?


  • Discourse touched me in a no-no place

    @PleegWat said in WTF Bites:

    @dkf Checking: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes

    GCC does have an __attribute__((aligned)) on parameters and variables, but it does something completely different; it instructs how the variable itself should be aligned rather than generating assumptions about pointer targets.

    It also has the same attribute on types, which sounds closer to what you're looking for: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#Common-Type-Attributes

    As far as I can tell, that attribute is useful on structs — indeed, you probably want to specify it if you also specify packed so that you can ensure that the structure itself is aligned (unless you're really writing things at random byte offsets) — but useless on pointers to integer types as the compiler just ignores it and goes on to generate code that will crash.

    It's a wonderfully nasty edge case!
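
    To make the struct case concrete, here is a minimal sketch of the packed-plus-aligned combination using GCC attribute syntax. The struct name hw_u32 and the function set_value_packed are invented for illustration; the real message format is messier than one wrapped integer.

```c
#include <stdint.h>

/* Hypothetical wrapper: `packed` drops the member's natural 4-byte
 * alignment, and aligned(2) then tells the compiler the struct as a
 * whole can still be relied on to be half-word aligned. */
struct __attribute__((packed, aligned(2))) hw_u32 {
    uint32_t v;
};

/* Stores `value` into the index-th 32-bit slot of a half-word-aligned
 * buffer by going through the wrapper struct. */
void set_value_packed(void *storage, uint32_t index, uint32_t value) {
    struct hw_u32 *slots = storage;
    slots[index].v = value;
}
```

    The struct stays 4 bytes in size with an alignment of 2, so an array of them steps through the buffer just as the uint32_t pointer did, without the compiler assuming word alignment.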



  • register.png


  • Banned

    @jinpa defeats 99.99999% of spambots, and with only 2-3 registrations per year doesn't put too much workload on the admin. I think it's quite genius!



  • @Gąska said in WTF Bites:

    @jinpa defeats 99.99999% of spambots, and with only 2-3 registrations per year doesn't put too much workload on the admin. I think it's quite genius!

    There are some here who might think that sending passwords in plaintext is not the best practice, but as long as you don't re-use passwords (which you shouldn't be doing anyway), then the risk is fairly low.

    But I think your exclamation point suggests you already knew that.

    On a tangent, so far it seems to be a pretty good independent distro.


  • Banned

    @jinpa said in WTF Bites:

    @Gąska said in WTF Bites:

    @jinpa defeats 99.99999% of spambots, and with only 2-3 registrations per year doesn't put too much workload on the admin. I think it's quite genius!

    There are some here who might think that sending passwords in plaintext is not the best practice, but as long as you don't re-use passwords (which you shouldn't be doing anyway), then the risk is fairly low.

    The best practices usually denote a way higher standard than actually practical (see: all the overengineered singleton patterns that have functionality equivalent to a simple global variable).

    For an attacker to be able to intercept the password, they need to:

    • Know that the password is being sent to that email in the first place.
    • Intercept the email somehow, which requires knowing not just the destination, but also the source and the time of transmission. Or an always-on listener, which would be a bold strategy, Cotton.
    • Decrypt the email because even though at application layer it's plaintext, at transport layer it's encrypted.

    I can see how a person registering on PCLinuxOS-Forums might be worried about this happening to them.

    @error_bot xkcd security


  • 🔀



  • @Gąska said in WTF Bites:

    I can see how a person registering on PCLinuxOS-Forums might be worried about this happening to them.

    The quality of reasoning we've all come to know and love at TDWTF forums.


  • Banned

    @jinpa am I wrong?



  • @Gąska said in WTF Bites:

    @jinpa am I wrong?

    No, but it was misleading specificity. (cf. "People who run marathons are dying every day!")



  • @cvi said in WTF Bites:

    @ben_lubar Original code was for ARM, though.

    FWIW, I did try using memcpy() in the original compiler explorer link. Results for x86 are as expected: a single mov, since GCC (and most other compilers) very aggressively inline memcpy() (which is more like an intrinsic anyway, at least for x86). They don't do that on ARM for some reason -- I couldn't make it inline the copy even after playing around for a while.

    Compiler explorer link. But, you're right, the x86 code has this weird jump that doesn't do anything. Compiler bug?

    It might be a bug, but it doesn't do nothing. There's a good chance that it will occasionally, on some CPUs, stall the CPU pipeline because the branch prediction fails. If you're really unlucky, you'll be on the first call in a while, and the ret at L2 will be in a different page whose mapping isn't in the TLB, and then the CPU will stall hard for a long, long while, whereas the mapping for the other ret is in the TLB and you're OK.

    The other thing to observe is that for x64 they are using GCC10, and for ARM it's only GCC7. That might make a difference.


  • Banned

    @jinpa said in WTF Bites:

    @Gąska said in WTF Bites:

    @jinpa am I wrong?

    No, but it was misleading specificity. (cf. "People who run marathons are dying every day!")

    You're right. Just "Linux" would be enough.



  • @Steve_The_Cynic Sorry, that was my mistake in selecting GCC 10. GCC 7 on x64 actually does look different, in that it looks to optimize the first version less.

    What you're saying about the differences is of course correct (and a different count of instructions almost never does "nothing", at least if you're measuring time), but other than performance, I would say the first code (with the extra jump) is equivalent to the second code from the optimizer's point of view. The C code certainly doesn't express any intent that the second branch should stall (which is also very unlikely to happen, as it relies on very precise layout when linking, over which you really have very little control).

    Main point still stands, though: on x64, memcpy() is fully inlined and reduced to a single mov, whereas on ARM it's not. (Clang does inline and reduce the memcpy() to a single str. Apparently that's OK nowadays w.r.t. alignment (ARM v7)? Brief search just mentions that the CPU internally might end up doing two writes somehow.)

    Edit: Actually, @ben_lubar also used GCC 10 for some reason. 🤷



  • @Gąska said in WTF Bites:

    @jinpa said in WTF Bites:

    @Gąska said in WTF Bites:

    @jinpa am I wrong?

    No, but it was misleading specificity. (cf. "People who run marathons are dying every day!")

    You're right. Just "Linux" would be enough.

    I was guessing at the wrong end of the spectrum. I thought maybe you were a Gentoo guy. (I really should keep a spreadsheet on the denizens.)


  • Banned

    @jinpa the only thing I know about Gentoo is that you can install it in just 3 commands.

    cfdisk /dev/hda && mkfs.xfs /dev/hda1 && mount /dev/hda1 /mnt/gentoo/ && chroot /mnt/gentoo/ && env-update && . /etc/profile && emerge sync && cd /usr/portage && scripts/bootstrap.sh && emerge system && emerge vim && vi /etc/fstab && emerge gentoo-dev-sources && cd /usr/src/linux && make menuconfig && make install modules_install && emerge gnome mozilla-firefox openoffice && emerge grub && cp /boot/grub/grub.conf.sample /boot/grub/grub.conf && vi /boot/grub/grub.conf && grub && init 6

    That's the first one.


  • Discourse touched me in a no-no place

    @cvi said in WTF Bites:

    on x64, memcpy() is fully inlined and reduced to a single mov, whereas on ARM it's not. (Clang does inline and reduce the memcpy() to a single str. Apparently that's OK nowadays w.r.t. alignment (ARM v7)? Brief search just mentions that the CPU internally might end up doing two writes somehow.)

    Yeah, the newer ARM cores have the hardware to handle misaligned writes for you. They still actually issue aligned reads and writes (just as x86 and ia64 do; memory hardware remains very strictly aligned) so you might want to get things aligned anyway for performance reasons, but at least they don't crash. I'm not blessed with that; our custom additions don't handle that but instead add other things of more relevance to our application domain (such as the interchip interconnect).



  • @Gąska said in WTF Bites:

    @jinpa the only thing I know about Gentoo is that you can install it in just 3 commands.

    Here, educate yourself:



  • @SirTwist said in WTF Bites:

    @MrL Not my fault, that was in the original.

    Honestly: if you ever hold a team meeting for deciding on a coding style (and that's really appropriate), enforce some "typical" style which is supported out of the box by some style checker tool (e.g. ReSharper). Only then will the guys see that a new era of software development has started.



  • @loopback0 said in WTF Bites:

    @topspin said in WTF Bites:

    So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.

    Sure but how is a Google (or any other data mining) service to know that someone's even in Private™ mode?

    Don't get me wrong, I'm not defending Google and their data hoovering, but picking specifically on the Private™ browsing scenario and complaining it's not actually private is a bit stupid.
    All it does is stop the browser tracking it, as also stated in superiorless terrible browsers.

    Yes, exactly this. If Google's lawyers can't get this laughed out of court on that point alone, they need better lawyers.


  • Banned

    @Zerosquare said in WTF Bites:

    @Gąska said in WTF Bites:

    @jinpa the only thing I know about Gentoo is that you can install it in just 3 commands.

    Here, educate yourself:

    "The performance gained by CFLAGS on x86 is minimal at best -- largely because the machines are still basically overclocked 386's at their core."

    OMG I'M DYING 🤣 🤣 🤣 🤣 🤣



  • @cvi said in WTF Bites:

    @Steve_The_Cynic Sorry, that was my mistake in selecting GCC 10. GCC 7 on x64 actually does look different, in that it looks to optimize the first version less.

    What you're saying about the differences is of course correct (and a different count of instructions almost never does "nothing", at least if you're measuring time), but other than performance, I would say the first code (with the extra jump) is equivalent to the second code from the optimizer's point of view. The C code certainly doesn't express any intent that the second branch should stall (which is also very unlikely to happen, as it relies on very precise layout when linking, over which you really have very little control).

    The pipeline stall I mentioned is mostly applicable to older x86 CPUs that probably aren't x64 anyway, where the branch predictor says it thinks we're going left but in fact we go right, so the path we erroneously guessed and did partial speculative execution on has to be discarded and the pipeline refilled from the correct path. It's less relevant in newer CPUs because the branch prediction is better and they are probably capable of speculating down both paths at once anyway.

    The TLB-miss stall is a nasty that can affect any CPU with paging on, and it's even worse on olde-skoole MIPS CPUs that rely on exceptioning into the OS to handle TLB misses. It does require that the two ret instructions are in different pages, so it's not hugely likely.

    But yes, you're right, there's no semantic difference (which is good, since the optimiser shouldn't change the semantics of conformant code...).

    Probably it optimises less well (including the actual memcpy call) on ARM because the compiler doesn't have enough information to know whether the destination is aligned. On x86/x64, it's rare to run with the AC bit on in the CPU, so an unaligned access doesn't cause exceptions or other nonsense, and doesn't have to have hidden workarounds somewhere.


  • Grade A Premium Asshole

    A user just submitted a ticket because they could not find data from 5-31. We look and the data is there and ask them to please look again. User replies that no, said data is not showing up in their searches. They provide a screenshot to bolster their assertion.

    Their search as performed was for 5-30. User then argues about this point as though their own screenshot did not plainly show the date range they had selected.


  • BINNED

    @Polygeekery said in WTF Bites:

    Their search as performed was for 5-30. User then argues about this point as though their own screenshot did not plainly show the date range they had selected.

    User clearly is a member of this forum. 🍹


  • Grade A Premium Asshole

    @topspin arguing against clear and obvious evidence and trying to blame others because they're a moron?

    That doesn't remind me of anyone around here. Not at all.



  • @Polygeekery said in WTF Bites:

    @topspin arguing against clear and obvious evidence and trying to blame others because they're a moron?

    That doesn't remind me of anyone around here. Not at all.

    Not a single person. Not at all. Not one.



  • @Benjamin-Hall said in WTF Bites:

    Not a single person. Not at all. Not one.

    We're all boomzilla's alt after all



  • @TimeBandit said in WTF Bites:

    @Benjamin-Hall said in WTF Bites:

    Not a single person. Not at all. Not one.

    We're all boomzilla's alt after all

    I am boomzilla and my name is Legion.


  • Banned

    @Carnage I am boomzilla and I approve this message.


  • ♿ (Parody)

    @Gąska I am boomzilla and :kneeling_warthog:

