WTF Bites
-
@ben_lubar said in WTF Bites:
The result is quite efficient
I dunno, I could probably make this more efficient.
Looks like the compiler (conformingly) assumes the uint32_t* storage is actually a valid pointer to uint32_t, i.e. 4 byte aligned. The if branch is assumed to never be taken or equivalent to the else branch anyway. The bug is in the function signature.
-
-
@Zerosquare said in WTF Bites:
Where’s Blakey when you need him to tell us how we dirty Yuropoorans are just suing Google because we haven’t come up with a Google ourselves?!
As we clearly state each time you open a new incognito tab, websites might be able to collect information about your browsing activity.
So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.
-
So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.
Sure but how is a Google (or any other data mining) service to know that someone's even in Private™ mode?
Don't get me wrong, I'm not defending Google and their data hoovering, but picking specifically on the Private™ browsing scenario and complaining it's not actually private is a bit stupid.
All it does is stop the browser tracking it, as also stated in superior less terrible browsers.
-
Holy Shit! RealPlayer is still a thing
-
-
@loopback0 When you miss the days where you had to wait 20 minutes of buffering before you could watch that 30 seconds video
-
@bobjanova said in WTF Bites:
Surely TRWTF with that is that "storage[0] = value" doesn't do what you expect just because storage isn't aligned. If that's not a valid uint32* then it shouldn't be possible to use that value in the first place. (What does it do, by the way?)
The problem is that there's several different notions of validity and at least GCC fucks things up sometimes. However, while experimenting with this further today I found a better solution:
static void set_value(void *storage, uint32_t index, uint32_t value)
{
    uint32_t *ptr = __builtin_assume_aligned(storage, 4, 2);
    ptr[index] = value;
}
That works (the declared type of storage isn't important here, FWIW) and uses two half-word aligned instructions to do the write. It in fact generates better code than just slapping the packed attribute on an underlying structure, which tends to force all accesses to be byte-oriented (on ARM of the revisions I'm dealing with; x86 and ia64 have always had the extra hardware to do unaligned accesses in a transparent way) and that's just horrible for our architecture. Also adding a suitable aligned attribute to the structure fixes the problem. Alas, the damn message format I'm dealing with doesn't really describe as a structure in the first place (it has about 16 sub-formats each with slightly different alignment requirements, but I can at least guarantee that everything is half-word aligned).
It's all quite ghastly (and of course the protocol concerned has a cutesy annoying name) but now I can make it ghastly and short. Chiselling a few bytes out of our TEXT section is a very big deal for us, given how close we push to the edge of what the hardware can do. (Before anyone asks, we've got a lot more space for data, and we can parallelize to deal with running out of that.)
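For anyone who hasn't met the builtin: a minimal sketch of what the third argument means, assuming GCC or Clang. `__builtin_assume_aligned(p, 4, 2)` returns `p` unchanged and tells the compiler that `p` sits 2 bytes past a 4-byte boundary. The names `put_u32_halfword` and `get_u32_bytes` are made up for illustration; this is not the real message writer, and the misaligned store relies on compiler-specific behaviour rather than anything ISO C promises.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value at a destination that is only
 * known to sit 2 bytes past a 4-byte boundary. The builtin returns its
 * first argument and merely records the alignment assumption. */
void put_u32_halfword(void *storage, uint32_t index, uint32_t value)
{
    uint32_t *ptr = __builtin_assume_aligned(storage, 4, 2);
    ptr[index] = value;
}

/* Hypothetical checker: reads the value back bytewise so the check itself
 * makes no alignment assumptions. */
uint32_t get_u32_bytes(const void *storage, uint32_t index)
{
    uint32_t out;
    memcpy(&out, (const unsigned char *)storage + 4u * index, sizeof out);
    return out;
}
```

Calling it with a pointer that is actually 4-byte aligned would break the stated assumption, so the caller has to hand in an address of the form (4k + 2).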
-
The bug is in the function signature.
The problem is that there's no way to express the alignment constraints known about the type. If it was possible to do that, it'd be an ideal way of handling this. Instead, gcc makes alignment into this sort of weird adjunct to the values themselves, guided by the type but in a way that it's hard to do good things with. I was pleased to find __builtin_assume_aligned() today though. It's an idiotic way of doing it, but it actually works…
-
@dkf would memcpy work (produce the desired assembly)? Yeah, I know that’s horribly slow, but the compiler should know to optimize that out and just use it as a (the only?) official way to do this. If you use an int16_t* storage for the destination of the 4 byte write it should know both the actual alignment and size. Not that it’d be shorter, anyway.
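A minimal sketch of the memcpy variant being asked about, with a made-up name (`set_value_memcpy`) and the assumption that the index counts 32-bit words. The destination is declared as half-words, so 2-byte alignment is all the compiler may rely on; whether the call collapses into one or two stores is entirely up to the compiler and target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical writer: copies the 4 bytes of value into a half-word
 * aligned buffer. index is in units of 32-bit words, hence 2 * index
 * half-words from the start of storage. */
void set_value_memcpy(uint16_t *storage, uint32_t index, uint32_t value)
{
    memcpy(storage + 2u * index, &value, sizeof value);
}
```

This version has no alignment assumptions at all in the C source, which is the appeal; the open question is only what the code generator makes of it.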
-
would memcpy work (produce the desired assembly)?
No, alas. The problem is (in the real case at least) that it gets confused by the whole thing and falls back on runtime probing of the alignment and a bytewise copy, which is not inlined at all either (I have no idea why). It might do better with gcc 9, but many of our users won't be using that yet. The result ends up imposing quite a performance overhead and a size overhead too that's unacceptable for us given how little headroom we've got left.
Part of the problem is that the implementation of memcpy() in question seems to be written for slightly different use cases. It does a fantastic job when accesses are aligned and everything is happy, generating very short and fast instruction sequences that I would never have thought of, but the unaligned case is nothing like as good. It works, of course, but the way it does it isn't brilliant at all.
-
-
@HardwareGeek I would imagine it's not, because he said it did work, unlike the Paula Bean...
-
@JBert Valid point.
-
@TimeBandit said in WTF Bites:
@loopback0 When you miss the days where you had to wait 20 minutes of buffering before you could watch that 30 seconds video
-
Status: This feels so wrong but it's the only way I've been able to get the fucking .Net Base64 decoder to accept my strings...
string datastring = s[1];
datastring += "==".Substring(0, datastring.Length % 4 % 2);
datastring += "==".Substring(0, datastring.Length % 4);
string result = System.Text.Encoding.UTF8.GetString(System.Convert.FromBase64String(datastring));
I don't know wtf is going on but hey, it works?
Edit: Wait, no it doesn't.
For fuck's sake...
Edit 2: This one works?
Mind slowly exploding. Shirley there must be a better way to do this...
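The padding arithmetic itself can be done in one step. A minimal sketch in C (the function name is made up, and it only computes the count, it doesn't decode anything): an unpadded Base64 string of length len needs (4 - len % 4) % 4 trailing '=' characters, and a remainder of 1 can never come from a valid encoding.

```c
#include <stddef.h>

/* Hypothetical helper: number of '=' characters needed to pad an unpadded
 * Base64 string of the given length up to a multiple of 4.
 * Returns -1 when len % 4 == 1, which no valid Base64 encoding produces. */
int base64_pad_count(size_t len)
{
    size_t rem = len % 4;
    if (rem == 1)
        return -1;
    return (int)((4 - rem) % 4);
}
```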
-
@ben_lubar said in WTF Bites:
The result is quite efficient
I dunno, I could probably make this more efficient.
Looks like the compiler (conformingly) assumes the uint32_t* storage is actually a valid pointer to uint32_t, i.e. 4 byte aligned. The if branch is assumed to never be taken or equivalent to the else branch anyway. The bug is in the function signature.
Nope, x86 allows unaligned stores
-
@Zerosquare said in WTF Bites:
Where’s Blakey when you need him to tell us how we dirty Yuropoorans are just suing Google because we haven’t come up with a Google ourselves?!
As we clearly state each time you open a new incognito tab, websites might be able to collect information about your browsing activity.
So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.
Their browser also sends a unique 60-or-so character base64 blob to their servers with every single request.
-
The bug is in the function signature.
I was pleased to find __builtin_assume_aligned() today though. It's an idiotic way of doing it, but it actually works…
For a function argument, __attribute__((aligned(2))) would make more sense to me. Guess that doesn't work if the pointer value arises inside a function though.
-
The bug is in the function signature.
I was pleased to find __builtin_assume_aligned() today though. It's an idiotic way of doing it, but it actually works…
For a function argument, __attribute__((aligned(2))) would make more sense to me. Guess that doesn't work if the pointer value arises inside a function though.
Worse. It's actually a global variable, and the accumulation of bits to send is done in callbacks from various interrupts. Attaching the alignment to the type would have been perfect. (I did try using a uint16_t* as the argument and just casting to uint32_t* inside the writer function. No dice without the magic incantation; it instead tried to propagate the overalignment assumption outwards, which was precisely backwards.)
-
@dkf Checking: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes
GCC does have an __attribute__((aligned)) on parameters and variables, but it does something completely different; it instructs how the variable itself should be aligned rather than generating assumptions about pointer targets.
It also has the same attribute on types, which sounds closer to what you're looking for: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#Common-Type-Attributes
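To make the distinction concrete, a small sketch assuming GCC or Clang (the variable and function names are invented for illustration). The attribute on a pointer variable places the pointer object itself on the requested boundary and says nothing about whatever it points at.

```c
#include <stdint.h>

/* The attribute applies to the pointer variable: the bytes that hold the
 * address are placed on a 16-byte boundary. It makes no statement about
 * the alignment of the uint32_t the pointer refers to. */
uint32_t *slot_ptr __attribute__((aligned(16)));

/* Hypothetical check: returns 1 if the pointer object itself is 16-byte
 * aligned, which is all the attribute above asks for. */
int pointer_object_is_aligned(void)
{
    return ((uintptr_t)&slot_ptr % 16) == 0;
}
```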
-
@ben_lubar Original code was for ARM, though.
FWIW, I did try using memcpy() in the original compiler explorer link. Results for x86 are as expected: a single mov, since GCC (and most other compilers) very aggressively inline memcpy() (which is more like an intrinsic anyway, at least for x86). They don't do that on ARM for some reason -- I couldn't make it inline the copy even after playing around for a while.
Compiler explorer link. But, you're right, the x86 code has this weird jump that doesn't do anything. Compiler bug?
-
@dkf Checking: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes
GCC does have an __attribute__((aligned)) on parameters and variables, but it does something completely different; it instructs how the variable itself should be aligned rather than generating assumptions about pointer targets.
It also has the same attribute on types, which sounds closer to what you're looking for: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#Common-Type-Attributes
As far as I can tell, that attribute is useful on structs (indeed, you probably want to specify it if you also specify packed so that you can ensure that the structure itself is aligned, unless you're really writing things at random byte offsets) but useless on pointers to integer types as the compiler just ignores it and goes on to generate code that will crash.
It's a wonderfully nasty edge case!
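A rough sketch of the struct form of that idea, assuming GCC or Clang attribute syntax; the type and function names are invented here and nothing below comes from the real message format. packed drops the member's natural 4-byte alignment, and aligned(2) then promises that the struct itself is half-word aligned.

```c
#include <stdint.h>

/* Hypothetical wrapper type: a 32-bit value that is only guaranteed to be
 * half-word aligned. GCC/Clang extension, not standard C. */
struct __attribute__((packed, aligned(2))) u32_halfword {
    uint32_t value;
};

/* Hypothetical writer: index counts 32-bit slots from storage, which must
 * be at least 2-byte aligned. */
void set_value_struct(void *storage, uint32_t index, uint32_t value)
{
    struct u32_halfword *slot = storage;
    slot[index].value = value;
}
```

Because the alignment lives in the type, the compiler can see it at every access through the struct, which is the property the plain pointer-to-integer case lacks.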
-
-
@jinpa defeats 99.99999% of spambots, and with only 2-3 registrations per year doesn't put too much workload on the admin. I think it's quite genius!
-
@jinpa defeats 99.99999% of spambots, and with only 2-3 registrations per year doesn't put too much workload on the admin. I think it's quite genius!
There are some here who might think that sending passwords in plaintext is not the best practice, but as long as you don't re-use passwords (which you shouldn't be doing anyway), then the risk is fairly low.
But I think your exclamation point suggests you already knew that.
On a tangent, so far it seems to be a pretty good independent distro.
-
@jinpa defeats 99.99999% of spambots, and with only 2-3 registrations per year doesn't put too much workload on the admin. I think it's quite genius!
There are some here who might think that sending passwords in plaintext is not the best practice, but as long as you don't re-use passwords (which you shouldn't be doing anyway), then the risk is fairly low.
The best practices usually denote a way higher standard than actually practical (see: all the overengineered singleton patterns that have functionality equivalent to a simple global variable).
For an attacker to be able to intercept the password, they need to:
- Know that the password is being sent to that email in the first place.
- Intercept the email somehow, which requires knowing not just the destination, but also the source and the time of transmission. Or an always-on listener, which would be a bold strategy, Cotton.
- Decrypt the email because even though at application layer it's plaintext, at transport layer it's encrypted.
I can see how a person registering on PCLinuxOS-Forums might be worried about this happening to them.
@error_bot xkcd security
-
-
I can see how a person registering on PCLinuxOS-Forums might be worried about this happening to them.
The quality of reasoning we've all come to know and love at TDWTF forums.
-
@jinpa am I wrong?
-
@jinpa am I wrong?
No, but it was misleading specificity. (cf. "People who run marathons are dying every day!")
-
@ben_lubar Original code was for ARM, though.
FWIW, I did try using memcpy() in the original compiler explorer link. Results for x86 are as expected: a single mov, since GCC (and most other compilers) very aggressively inline memcpy() (which is more like an intrinsic anyway, at least for x86). They don't do that on ARM for some reason -- I couldn't make it inline the copy even after playing around for a while.
Compiler explorer link. But, you're right, the x86 code has this weird jump that doesn't do anything. Compiler bug?
It might be a bug, but it doesn't do nothing. There's a good chance that it will occasionally, on some CPUs, stall the CPU pipeline because the branch prediction fails. If you're really unlucky, you'll be on the first call in a while, and the ret at L2 will be in a different page whose mapping isn't in the TLB, and then the CPU will stall hard for a long, long while, whereas the mapping for the other ret is in the TLB and you're OK.
The other thing to observe is that for x64 they are using GCC10, and for ARM it's only GCC7. That might make a difference.
-
@jinpa am I wrong?
No, but it was misleading specificity. (cf. "People who run marathons are dying every day!")
You're right. Just "Linux" would be enough.
-
@Steve_The_Cynic Sorry, that was my mistake in selecting GCC 10. GCC 7 on x64 actually does look different, in that it looks to optimize the first version less.
What you're saying about the differences is of course correct (and a different count of instructions almost never does "nothing", at least if you're measuring time), but other than performance, I would say the first code (with the extra jump) is as-is with the second code from the optimizer's point of view. The C code certainly doesn't express any intent in that the second branch should stall (which is also very unlikely to happen, as it relies on very precise layout when linking, of which you really have very little control).
Main point still stands, though: on x64, memcpy() is fully inlined and reduced to a single mov, whereas on ARM it's not. (Clang does inline and reduce the memcpy() to a single str. Apparently that's OK nowadays w.r.t. alignment (ARM v7)? Brief search just mentions that the CPU internally might end up doing two writes somehow.)
Edit: Actually, @ben_lubar also used GCC 10 for some reason.
-
@jinpa am I wrong?
No, but it was misleading specificity. (cf. "People who run marathons are dying every day!")
You're right. Just "Linux" would be enough.
I was guessing at the wrong end of the spectrum. I thought maybe you were a Gentoo guy. (I really should keep a spreadsheet on the denizens.)
-
@jinpa the only thing I know about Gentoo is that you can install it in just 3 commands.
cfdisk /dev/hda && mkfs.xfs /dev/hda1 && mount /dev/hda1 /mnt/gentoo/ && chroot /mnt/gentoo/ && env-update && . /etc/profile && emerge sync && cd /usr/portage && scripts/bootsrap.sh && emerge system && emerge vim && vi /etc/fstab && emerge gentoo-dev-sources && cd /usr/src/linux && make menuconfig && make install modules_install && emerge gnome mozilla-firefox openoffice && emerge grub && cp /boot/grub/grub.conf.sample /boot/grub/grub.conf && vi /boot/grub/grub.conf && grub && init 6
That's the first one.
-
on x64, memcpy() is fully inlined and reduced to a single mov, whereas on ARM it's not. (Clang does inline and reduce the memcpy() to a single str. Apparently that's OK nowadays w.r.t. alignment (ARM v7)? Brief search just mentions that the CPU internally might end up doing two writes somehow.)
Yeah, the newer ARM cores have the hardware to handle misaligned writes for you. They still actually issue aligned reads and writes (just as x86 and ia64 do; memory hardware remains very strictly aligned) so you might want to get things aligned anyway for performance reasons, but at least they don't crash. I'm not blessed with that; our custom additions don't handle that but instead add other things of more relevance to our application domain (such as the interchip interconnect).
-
@jinpa the only thing I know about Gentoo is that you can install it in just 3 commands.
Here, educate yourself:
-
@MrL Not my fault, that was in the original.
Honestly: if you ever hold a team meeting for deciding on a coding style (and that's really appropriate), enforce some "typical" style which is supported out of the box by some style checker tool (e.g. ReSharper). Only then will the guys see that a new era of software development has started.
-
@loopback0 said in WTF Bites:
So they think that what their browser (which is clearly the only browser there is) says about web sites is an excuse for what their web sites / servers are doing? Bold.
Sure but how is a Google (or any other data mining) service to know that someone's even in Private™ mode?
Don't get me wrong, I'm not defending Google and their data hoovering, but picking specifically on the Private™ browsing scenario and complaining it's not actually private is a bit stupid.
All it does is stop the browser tracking it, as also stated in superior less terrible browsers.
Yes, exactly this. If Google's lawyers can't get this laughed out of court on that point alone, they need better lawyers.
-
@Zerosquare said in WTF Bites:
@jinpa the only thing I know about Gentoo is that you can install it in just 3 commands.
Here, educate yourself:
"The performance gained by CFLAGS on x86 is minimal at best -- largely because the machines are still basically overclocked 386's at their core."
OMG I'M DYING
-
@Steve_The_Cynic Sorry, that was my mistake in selecting GCC 10. GCC 7 on x64 actually does look different, in that it looks to optimize the first version less.
What you're saying about the differences is of course correct (and a different count of instructions almost never does "nothing", at least if you're measuring time), but other than performance, I would say the first code (with the extra jump) is as-is with the second code from the optimizer's point of view. The C code certainly doesn't express any intent in that the second branch should stall (which is also very unlikely to happen, as it relies on very precise layout when linking, of which you really have very little control).
The pipeline stall I mentioned is mostly applicable to older x86 CPUs that probably aren't x64 anyway, where the branch predictor says it thinks we're going left but in fact we go right, so the path we erroneously guessed and did partial speculative execution on has to be discarded and the pipeline refilled from the correct path. It's less relevant in newer CPUs because the branch prediction is better and they are probably capable of speculating down both paths at once anyway.
The TLB-miss stall is a nasty that can affect any CPU with paging on, and it's even worse on olde-skoole MIPS CPUs that rely on exceptioning into the OS to handle TLB misses. It does require that the two ret instructions are in different pages, so it's not hugely likely.
But yes, you're right, there's no semantic difference (which is good, since the optimiser shouldn't change the semantics of conformant code...).
Probably it optimises less well (including the actual memcpy call) on ARM because the compiler doesn't have enough information to know whether the destination is aligned. On x86/x64, it's rare to run with the AC bit on in the CPU, so an unaligned access doesn't cause exceptions or other nonsense, and doesn't have to have hidden workarounds somewhere.
-
A user just submitted a ticket because they could not find data from 5-31. We look and the data is there and ask them to please look again. User replies that no, said data is not showing up in their searches. They provide a screenshot to bolster their assertion.
Their search as performed was for 5-30. User then argues about this point as though their own screenshot did not plainly show the date range they had selected.
-
@Polygeekery said in WTF Bites:
Their search as performed was for 5-30. User then argues about this point as though their own screenshot did not plainly show the date range they had selected.
User clearly is a member of this forum.
-
@topspin arguing against clear and obvious evidence and trying to blame others because they're a moron?
That doesn't remind me of anyone around here. Not at all.
-
@Polygeekery said in WTF Bites:
@topspin arguing against clear and obvious evidence and trying to blame others because they're a moron?
That doesn't remind me of anyone around here. Not at all.
Not a single person. Not at all. Not one.
-
@Benjamin-Hall said in WTF Bites:
Not a single person. Not at all. Not one.
We're all boomzilla's alt after all
-
@TimeBandit said in WTF Bites:
@Benjamin-Hall said in WTF Bites:
Not a single person. Not at all. Not one.
We're all boomzilla's alt after all
I am boomzilla and my name is Legion.
-
@Carnage I am boomzilla and I approve this message.
-
@Gąska I an boomzilla and