PHP. That is all.
-
So, I ran into this brilliance today. I'm not the first, and I'm certainly not the last, but for people who don't deal with PHP or just never ran into this, consider the following code:
<?php
$extensions = [
    ['Extension' => 123, 'Data' => 'One'],
    ['Extension' => 456, 'Data' => 'Two']
];
print_r($extensions);

$temp = [];
foreach($extensions as $extension) {
    $temp[$extension['Extension']] = $extension;
}
$extensions = $temp;
print_r($extensions);

foreach($extensions as &$extension) {
    $extension[$extension['Data']] = true;
}
print_r($extensions);
This outputs:
Array ( [0] => Array ( [Extension] => 123 [Data] => One ) [1] => Array ( [Extension] => 456 [Data] => Two ) )
Array ( [123] => Array ( [Extension] => 123 [Data] => One ) [456] => Array ( [Extension] => 456 [Data] => Two ) )
Array ( [123] => Array ( [Extension] => 123 [Data] => One [One] => 1 ) [456] => Array ( [Extension] => 456 [Data] => Two [Two] => 1 ) )
Okay, this makes sense, right? First, we reindex an existing array/dictionary mutant using one of the values as the new key (with a temp array to prevent any silly overwrites if they happen to match at some point). Then, we loop through it again using a reference (hence the & in front of $extension in the second foreach) instead of a copy of the value, so we can manipulate the array directly and add an extra key/value pair.
Now, yes, you could write it in a single loop, but let's ignore that. The important part is: what happens if you swap those foreach loops around?
<?php
$extensions = [
    ['Extension' => 123, 'Data' => 'One'],
    ['Extension' => 456, 'Data' => 'Two']
];
print_r($extensions);

foreach($extensions as &$extension) {
    $extension[$extension['Data']] = true;
}
print_r($extensions);

$temp = [];
foreach($extensions as $extension) {
    $temp[$extension['Extension']] = $extension;
}
$extensions = $temp;
print_r($extensions);
Should produce the same results since the loops don't depend on each other, right? We're ignoring the outer array keys, and we're not reading/writing the same values.
Well...
Array ( [0] => Array ( [Extension] => 123 [Data] => One ) [1] => Array ( [Extension] => 456 [Data] => Two ) )
Array ( [0] => Array ( [Extension] => 123 [Data] => One [One] => 1 ) [1] => Array ( [Extension] => 456 [Data] => Two [Two] => 1 ) )
Array ( [123] => Array ( [Extension] => 123 [Data] => One [One] => 1 ) )
Ummm... why does that last print_r only print one element?
Because PHP.
Fucking foreach sets a fucking internal pointer to the array, which is fine, but if you used a reference instead of copies of value or key => value, it won't get reset by the next fucking foreach!
The fix, other than swapping the loops around?
<?php
$extensions = [
    ['Extension' => 123, 'Data' => 'One'],
    ['Extension' => 456, 'Data' => 'Two']
];
print_r($extensions);

foreach($extensions as &$extension) {
    $extension[$extension['Data']] = true;
}
unset($extension);
print_r($extensions);

$temp = [];
foreach($extensions as $extension) {
    $temp[$extension['Extension']] = $extension;
}
$extensions = $temp;
print_r($extensions);
Yeah. You call unset on the reference. Which forces the next loop to reset it. Intuitive!
Fucking PHP.
-
-
-
@Onyx The juxtaposition of the username with the situation of reordering and resetting loops
-
@hungrier ...
I'm slow today.
-
@Onyx said in PHP. That is all.:
Fucking foreach sets a fucking internal pointer to the array, which is fine, but if you used a reference instead of copies of value or key => value, it won't get reset by the next fucking foreach!
I saw that coming, only because I've already spent an hour tracking that one down at some point.
-
@Onyx said in PHP. That is all.:
Fucking foreach sets a fucking internal pointer to the array, which is fine, but if you used a reference instead of copies of value or key => value, it won't get reset by the next fucking foreach!
No, it's not about the internal pointer. The internal pointer has nothing to do with this; it's another wtf entirely. It's the $extension variable, which becomes a reference to the last element, and the next loop does $extension = $extensions[$i] and overwrites the last element with the value of the current element.
And IMO you're attacking the wrong wtf here. This behavior is actually pretty logical, given that in PHP
- variable scope is function-based, not block-based
- arrays are primitives passed by value and copied on assignment
- you can have c++-like references to primitives, where assignment changes the original value in place
Of course all of these are bad ideas, but if you consider them as given, this behavior is the logical consequence. Throwing in an automatic unset before every foreach would introduce even bigger wtfs. For example, what if the array is empty? Should we still unset the variable, or do nothing? Why should semantics of foreach differ from a normal for loop with an assignment inside?
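For anyone following along, here is a minimal sketch of that mechanism. The function name upperThenLoop and the sample data are made up for illustration and are not from the original code:

```php
<?php
// Hypothetical helper: runs a by-reference loop, then a by-value loop that
// reuses the same variable name, and returns the resulting array.
function upperThenLoop(array $items, bool $unsetReference): array
{
    // After this loop, $item is still a reference to the last element.
    foreach ($items as &$item) {
        $item = strtoupper($item);
    }

    if ($unsetReference) {
        // Break the reference so $item becomes an ordinary variable again.
        unset($item);
    }

    // Each iteration assigns the current element to $item. If $item is still
    // a reference, that assignment writes into the last array slot.
    foreach ($items as $item) {
        // no body needed
    }

    return $items;
}

print_r(upperThenLoop(['a', 'b', 'c'], false)); // A, B, B
print_r(upperThenLoop(['a', 'b', 'c'], true));  // A, B, C
```

Without the unset, the last slot ends up holding a copy of the second-to-last element, which is exactly why the reindexing loop in the original post collapses two entries into one key.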
-
@sebastian-galczynski So in other words, you should always add an unset call after a foreach loop?
-
@pie_flavor said in PHP. That is all.:
@sebastian-galczynski So in other words, you should always add an unset call after a foreach loop?
Only after the 'foreach by reference' construct, and when you plan to reuse the same variable name. Which is acceptable, because it appears like once in a million LOC. It has almost no use outside of mutating nested arrays, and that smells of bad code anyway. You have type-safe objects for that kind of stuff, and they are always passed and assigned by reference. And if the data in question is small, better use pure functions and array_map().
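A rough sketch of the array_map() route, reusing the sample data from the first post. The variable names $tagged and $byExtension are made up here, and array_column() is just one way to do the reindexing step:

```php
<?php
// Same sample data as the original post.
$extensions = [
    ['Extension' => 123, 'Data' => 'One'],
    ['Extension' => 456, 'Data' => 'Two'],
];

// Build a new tagged array instead of mutating in place through a reference.
$tagged = array_map(
    function (array $extension): array {
        $extension[$extension['Data']] = true;
        return $extension;
    },
    $extensions
);

// Reindex by the 'Extension' value; a null column key keeps the whole row.
$byExtension = array_column($tagged, null, 'Extension');

print_r($byExtension);
```

No references are involved, so the order of the two steps stops mattering and the input array is left untouched.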
-
@pie_flavor said in PHP. That is all.:
@sebastian-galczynski So in other words, you should always add an unset call after a foreach loop?
-
@sebastian-galczynski said in PHP. That is all.:
Only after the 'foreach by reference' construct, and when you plan to reuse the same variable name. Which is acceptable, because it appears like once a million LOC.
And starting a new foreach construct, regardless of variable names, shouldn't reset the pointer? In which freaking use case should it not? Call me stupid, but a foreach, to me, is equivalent to
for($i = 0; $i < count($arr); ++$i)
Which will also let you modify the internals of the array just fine. Or, using iterator semantics, in C++ for example since you mentioned it:
for(auto i = arr.begin(); i != arr.end(); ++i)
I see no reason whatsoever why the initialization to 0/begin would not be a thing a foreach loop does. Hell, people rail against having to manually break in switch statements because they don't want a fallthrough, ever. I think C# doesn't even let you fall through, at all. But I can defend being able to use fallthroughs; the uses are rare and niche, but they exist. I cannot for the life of me imagine a case where I'd type foreach and not want each element to be affected.
@sebastian-galczynski said in PHP. That is all.:
It has almost no use outside of mutating nested arrays, and that smells of bad code anyway
Well, sorry that's what you get, it was originally JSON, nested stuff is bound to exist, even if I decoded it into nested objects. And I needed to tag some of those in a way that the API which returns the said JSON cannot know. Other than copying over the entire thing bit by bit, modifying nested arrays it is.
@sebastian-galczynski said in PHP. That is all.:
You have type-safe objects for that kind of stuff
In PHP? Enlighten me. If you mean stdClass, I wouldn't call that type safe any more than an array. And they have their own problems (granted, since PHP 7.2 you can finally access numerically indexed properties in it directly, so it's something). Same shit, IMHO.
@sebastian-galczynski said in PHP. That is all.:
array_map()
Fair enough on that one, brainfart when writing the original code.
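For the "modifying nested arrays it is" case, one way to keep the in-place mutation without any reference is to write through the key. A minimal sketch using the sample data from the first post (the $key variable is the only addition):

```php
<?php
$extensions = [
    ['Extension' => 123, 'Data' => 'One'],
    ['Extension' => 456, 'Data' => 'Two'],
];

foreach ($extensions as $key => $extension) {
    // $extension is only a copy; the write goes through the outer key.
    $extensions[$key][$extension['Data']] = true;
}

// Reusing $extension here is harmless: it never was a reference.
$temp = [];
foreach ($extensions as $extension) {
    $temp[$extension['Extension']] = $extension;
}
$extensions = $temp;

print_r($extensions);
```

Both elements survive the reindexing, with no unset call needed.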
-
@Onyx said in PHP. That is all.:
and not want each element to be affected.
You're missing it. Each element is affected. Each element is being put into the memory address that the pointer points to. Which happens to be inside the array. It does the $extension = $extensions[0] thing rather literally.
-
@pie_flavor said in PHP. That is all.:
@Onyx said in PHP. That is all.:
and not want each element to be affected.
You're missing it. Each element is affected. Each element is being put into the memory address that the pointer points to. Which happens to be inside the array. It does the $extension = $extensions[0] thing rather literally.
Yes, it does a thing that you can reason about, but that doesn't make it right. It's a completely unexpected, and I'd argue unwanted, behavior. Regardless of the minutiae of how it's accomplished internally, it's still stupid.
-
@Onyx Right. Hence my audio link above.
-
@Onyx said in PHP. That is all.:
Yes, it does a thing that you can reason about, but that doesn't make it right. It's a completely unexpected, and I'd argue unwanted, behavior. Regardless of the minutiae of how it's accomplished internally, it's still stupid.
Which is why there is a big red callout on the page describing foreach emphasising this behaviour. The fact is that it is entirely consistent with how both foreach loops and variable references work, so despite the number of times it's been reported as a bug, there's no sign it's going to be changed.
-
@Watson Not to mention that it's already a misfeature that the variable can escape the scope of a foreach or for loop, and that is also something which they are not going to change anytime soon...
-
@JBert 'Course they will. They'll just introduce totally different syntax, like let $x = 73 or £x = 73.
-
@JBert said in PHP. That is all.:
@Watson Not to mention that it's already a misfeature that the variable can escape the scope of a foreach or for loop, and that is also something which they are not going to change anytime soon...
That's because loops don't have their own scope. Global and function, that's all there is.
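A small sketch of what that means in practice. The function name lastSeen is made up for illustration:

```php
<?php
// Hypothetical helper: shows that the loop variable outlives the loop.
function lastSeen(array $values)
{
    foreach ($values as $value) {
        // nothing to do; we only care about what survives the loop
    }

    // $value is still visible here because the loop body is not a scope of
    // its own. With an empty array it was never assigned at all.
    return isset($value) ? $value : null;
}

var_dump(lastSeen([1, 2, 3])); // int(3)
var_dump(lastSeen([]));        // NULL
```

The empty-array case is the one raised earlier in the thread: the variable is simply never touched, so there is nothing for the loop to reset.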
-
@pie_flavor said in PHP. That is all.:
@JBert 'Course they will. They'll just introduce totally different syntax, like let $x = 73 or £x = 73.
€x = 99.99 ...?
-
@PJH said in PHP. That is all.:
@pie_flavor said in PHP. That is all.:
@JBert 'Course they will. They'll just introduce totally different syntax, like let $x = 73 or £x = 73.
€x = 99.99 ...?
And of course, code like $x = £y + €x will produce a different result every day/minute/second, depending on the exchange rates...
-
@remi oh neat.
-
@remi said in PHP. That is all.:
@PJH said in PHP. That is all.:
@pie_flavor said in PHP. That is all.:
@JBert 'Course they will. They'll just introduce totally different syntax, like let $x = 73 or £x = 73.
€x = 99.99 ...?
And of course, code like $x = £y + €x will produce a different result every day/minute/second, depending on the exchange rates...
E_CRITICAL_ERROR: Unit test failure
-
-
@dcon said in PHP. That is all.:
@remi said in PHP. That is all.:
@PJH said in PHP. That is all.:
@pie_flavor said in PHP. That is all.:
@JBert 'Course they will. They'll just introduce totally different syntax, like let $x = 73 or £x = 73.
€x = 99.99 ...?
And of course, code like $x = £y + €x will produce a different result every day/minute/second, depending on the exchange rates...
E_CRITICAL_ERROR: Unit test failure
At least you have unit tests...
-
@PJH said in PHP. That is all.:
@dcon said in PHP. That is all.:
@remi said in PHP. That is all.:
@PJH said in PHP. That is all.:
@pie_flavor said in PHP. That is all.:
@JBert 'Course they will. They'll just introduce totally different syntax, like let $x = 73 or £x = 73.
€x = 99.99 ...?
And of course, code like $x = £y + €x will produce a different result every day/minute/second, depending on the exchange rates...
E_CRITICAL_ERROR: Unit test failure
At least you have unit tests...
True, I am making a ridiculous assumption there...
-
@remi said in PHP. That is all.:
@dcon said in "PHP. That is all.":
E_CRITICAL_ERROR: Unit test failure
Do I need to say more?
Yes, more needs to be said. It's fun fact time!
Back in 2011, the release of PHP 5.3.7 introduced a pretty nasty bug in crypt(). They had to hastily release 5.3.8 to fix it.
Because there were already ~200 unit tests failing at the time and nobody noticed one more.
Having unit tests: good.
Having faulty unit tests: bad.
Ignoring your tests because "some of them always fail" and missing brand new failures: priceless.
Nitty gritty SVN committy
The commit that introduced the bug:
r314438
Fix more signed 1-bit bitfield, and let's use strlcpy/strlcat instead for these static string copies
  memcpy(passwd, MD5_MAGIC, MD5_MAGIC_LEN);
  strlcpy(passwd + MD5_MAGIC_LEN, sp, sl + 1);
- strncat(passwd, "$", 1);
+ strlcat(passwd, "$", 1);
  PHP_MD5Final(final, &ctx);
The commit that fixed the bug:
r315218
Unbreak crypt() (fix bug #55439)
# If you want to remove static analyser messages, be my guest,
# but please run unit tests after
  /* Now make the output string */
  memcpy(passwd, MD5_MAGIC, MD5_MAGIC_LEN);
  strlcpy(passwd + MD5_MAGIC_LEN, sp, sl + 1);
- strlcat(passwd, "$", 1);
+ strcat(passwd, "$");
  PHP_MD5Final(final, &ctx);
-
@DCoder I was curious about what strlcat() does, so I looked it up and came across this wonderful gem on that page:
However, one may question the validity of such optimizations, as they defeat the whole purpose of strlcpy() and strlcat(). As a matter of fact, the first version of this manual page got it wrong.
(words theirs, emphasis mine)
-
@DCoder said in PHP. That is all.:
Back in 2011, the release of PHP 5.3.7 introduced a pretty nasty bug in crypt().
Semi-related:
-
@PJH said in PHP. That is all.:
@DCoder said in PHP. That is all.:
Back in 2011, the release of PHP 5.3.7 introduced a pretty nasty bug in crypt().
Semi-related:
PHP Senior Level Interview:
"Sort a list of arbitrary strings"
That's impossible
"Compare two arbitrary strings"
That's impossible
"Safely access a database"
That's impossible
"Congratulations!"
-
@Gribnit said in PHP. That is all.:
"Sort a list of arbitrary strings"
That's impossible
"Compare two arbitrary strings"
That's impossible
"Safely access a database"
That's impossible
"Congratulations!"
Well actually, , , etc …
In all those cases, you only fail outright if your answer uses the < or <=> operators. To compare strings properly, you need to know the encoding of the strings, the expected comparison criteria, and probably the human language of those strings. You need to do it through specialized string comparison mechanisms, and I imagine Unicode has enough quagmires to break these mechanisms in certain cases.
For the first/second cases, the best available answer is probably Collator.
For the third case, you want hash_equals whenever you're comparing hashes.
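A hedged sketch of those options side by side. The sample strings and variable names are made up, the 'en_US' locale is an assumption, and the Collator part only runs if the intl extension happens to be installed:

```php
<?php
// Loose comparison treats two numeric strings as numbers, so these "match".
$loose = ('1e3' == '1000');   // true
// Strict comparison looks at the actual bytes.
$strict = ('1e3' === '1000'); // false

// hash_equals() compares two strings in constant time; meant for hashes.
$expected = hash('sha256', 'secret');
$actual = hash('sha256', 'secret');
$hashesMatch = hash_equals($expected, $actual); // true

// Plain sort() orders by byte value, so uppercase sorts before lowercase.
$words = ['banana', 'Zebra', 'apple'];
sort($words); // Zebra, apple, banana

// Collator (intl extension) sorts by locale rules instead of raw bytes.
if (class_exists('Collator')) {
    $localized = ['banana', 'Zebra', 'apple'];
    $collator = new Collator('en_US');
    $collator->sort($localized);
    print_r($localized);
}

var_dump($loose, $strict, $hashesMatch);
print_r($words);
```

Which locale and comparison strength to hand to Collator depends on the data, which is rather the point of the joke above.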
-
I've got a fresh WTF. Try running this code:
<?php
$filename = "dupa.txt";
file_put_contents($filename, "dupa");
echo filesize($filename)."\n";
file_put_contents($filename, "dupa", FILE_APPEND);
echo filesize($filename)."\n";
-
@sebastian-galczynski said in PHP. That is all.:
Try running this code
-
Note: The results of this function are cached. See clearstatcache() for more details.
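A minimal sketch of what that note implies for the snippet above. The temp-file path and variable names are made up here; the only claim is that the size read after clearstatcache() reflects the appended data:

```php
<?php
// Hypothetical scratch file in the system temp directory.
$filename = tempnam(sys_get_temp_dir(), 'stat');

file_put_contents($filename, 'dupa');
$first = filesize($filename);    // 4; this populates PHP's stat cache

file_put_contents($filename, 'dupa', FILE_APPEND);
$cached = filesize($filename);   // may still be the cached value

clearstatcache(true, $filename); // drop the cached stat data for this path
$fresh = filesize($filename);    // 8

echo "$first $cached $fresh\n";
unlink($filename);
```

The second argument to clearstatcache() limits the flush to one path, so the rest of the cache stays intact.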
-
Heh
Note: Because PHP's integer type is signed and many platforms use 32bit integers, some filesystem functions may return unexpected results for files which are larger than 2GB.
-
@Watson How fucking hard would it have been to invalidate the cache in the file_put_contents function?
-
@pie_flavor well, many platforms use 32 or even 64 bit integers, so ymmv.
-
@Gribnit What does that have to do with anything?
e: fixed
-
@pie_flavor Not a damn thing, but it's as close to a reason as you'll ever get.
-
-
@sebastian-galczynski said in PHP. That is all.:
I've got a fresh WTF. Try running this code:
<?php
$filename = "dupa.txt";
file_put_contents($filename, "dupa");
echo filesize($filename)."\n";
file_put_contents($filename, "dupa", FILE_APPEND);
echo filesize($filename)."\n";

function filei_put_contents_real($filename, $data, $flags=0, $context=null){
    $ret = file_put_contents($filename, $data, $flags, $context);
    clearstatcache(TRUE, $filename);
    return $ret;
}
HTH, HAND, etc.
-
Doesn't the linux kernel also cache stat data in memory? Is there really significant gain in caching it in the PHP process, given that the language is bytecode interpreted at best?
-
@PleegWat said in PHP. That is all.:
Doesn't the linux kernel also cache stat data in memory? Is there really significant gain in caching it in the PHP process, given that the language is bytecode interpreted at best?
It does, but calling stat() requires a context switch, which is significantly more expensive than a local function call. But still, without invalidation the feature seems completely useless. Why would anyone call stat() and related functions 2 times in a row on the same file, knowing that it didn't change?
-
@PleegWat said in PHP. That is all.:
Doesn't the linux kernel also cache stat data in memory?
Depends on the filesystem. For local ones, it probably does. For network filesystems, it can't.
-
@sebastian-galczynski With such input into the file, PHP thought that you're an ass deserving such a result.
-
@levicki what isn't better than JavaScript?
-
@TimeBandit said in PHP. That is all.:
@levicki what isn't better than JavaScript?
XCode and objective C.
-
@Benjamin-Hall said in PHP. That is all.:
@TimeBandit said in PHP. That is all.:
@levicki what isn't better than JavaScript?
XCode and objective C.
I don't do JavaScript and don't want to. I do do C++ (windows) and like it.
I approve that comment.
-
-
@Gąska said in PHP. That is all.:
@dcon said in PHP. That is all.:
I do do C++ (windows) and like it.
You're officially insane.
Thankyou. Thankyou. Thankyouverymuch.
-