WTF Bites

wharrgarbl

@Bulb Found a better explanation at

Naming Files, Paths, and Namespaces - Win32 apps

All file systems supported by Windows use the concept of files and directories to access data stored on a disk or device.

It seems mostly useless, unless you need to access \\.\COM56 or go beyond the character limit.

Bulb

@RaceProUK said in WTF Bites:

It exists because the 260-character limit predates the start of Windows being able to use networked and non-Windows file systems.

How does that make any sense? Lifting a limit is backward compatible. It does not require creating a new kind of input to which different limit will apply.

Kian

@Bulb said in WTF Bites:

How does that make any sense?

Clearly someone's code relied on making a too long path and detecting the failure from a Windows function for some critical control flow. Microsoft could not in good conscience break their code, so everyone else gets to type a magical incantation instead (or deal with broken programs).

Bulb

@Kian Indeed. Likely something to do with a buffer overrun.

hungrier

@Kian Or more likely, they wanted to avoid the possibility of breaking any of the decades worth of existing software that might have a check for "path starts with C:, looks good" in one place and "path isn't too long" in another, which could lead to new errors in unexpected situations, that the developers couldn't fix due to being dead of old age.

Jaloopa

Visual Studio 2017 doesn't allow you to rename publish profiles.

I have one folder type publish profile for each environment my website goes through on the journey to live. These are named FolderProfile, FolderProfile1 etc., and if you don't remember which one's which you have to look at the Configuration option to see what you're publishing.

(Actually, I have renamed them, but it involved renaming the actual .pubxml files in Explorer, then removing and re-adding the renamed files in the Properties part of Solution Explorer, which is a silly workaround when they should just make the name in the create/edit profile window a textbox rather than a label.)

Tsaukpaetra

@Bulb said in WTF Bites:

@RaceProUK said in WTF Bites:

It exists because the 260-character limit predates the start of Windows being able to use networked and non-Windows file systems.

How does that make any sense? Lifting a limit is backward compatible. It does not require creating a new kind of input to which different limit will apply.

Programs still have that defined in the winapi headers in the Windows SDK as a hard limit. Apparently when allocating buffers for the result this assumption means it's NOT backwards compatible.

cvi

@Tsaukpaetra said in WTF Bites:

@Bulb said in WTF Bites:

@RaceProUK said in WTF Bites:

It exists because the 260-character limit predates the start of Windows being able to use networked and non-Windows file systems.

How does that make any sense? Lifting a limit is backward compatible. It does not require creating a new kind of input to which different limit will apply.

Programs still have that defined in the winapi headers in the Windows SDK as a hard limit. Apparently when allocating buffers for the result this assumption means it's NOT backwards compatible.

Though, a lot of the Win32 functions take a size argument when writing paths to application-owned buffer. And Win32 has a long tradition of appending "Ex" to random function names, and deprecating the old ones (but still keeping them around). Why not do that instead?

Tsaukpaetra

@cvi said in WTF Bites:

@Tsaukpaetra said in WTF Bites:

@Bulb said in WTF Bites:

@RaceProUK said in WTF Bites:

It exists because the 260-character limit predates the start of Windows being able to use networked and non-Windows file systems.

How does that make any sense? Lifting a limit is backward compatible. It does not require creating a new kind of input to which different limit will apply.

Programs still have that defined in the winapi headers in the Windows SDK as a hard limit. Apparently when allocating buffers for the result this assumption means it's NOT backwards compatible.

Though, a lot of the Win32 functions take a size argument when writing paths to application-owned buffer. And Win32 has a long tradition of appending "Ex" to random function names, and deprecating the old ones (but still keeping them around). Why not do that instead?

I'm not Microsoft!

Besides, what would be the expected behavior if I have a file in a too-long-path and the "deprecated" function is called?

cvi

@Tsaukpaetra said in WTF Bites:

Besides, what would be the expected behavior if I have a file in a too-long-path and the "deprecated" function is called?

Most of them already return errors together with GetLastError(). I'm personally a fan of ERROR_OUT_OF_PAPER in this case (and in many other cases). I'm guessing that there is an applicable error code among the thousands of error codes that Win32 defines that could be reused (maybe ERROR_BAD_PATHNAME or something about buffers being too small or whatever).

Besides, the path name can already be too long because somebody created it with the \\?\ syntax, so they must be dealing with that problem somehow right now.

Tsaukpaetra

@cvi said in WTF Bites:

they must be dealing with that problem somehow right now.

Lol yeah. Sure. Dealing with it.

Unless you're Epic Games and just silently fail (or maybe not so silently; it makes little difference, since the result is the same).

cvi

@Tsaukpaetra So, am I to understand that UE doesn't deal with ERROR_OUT_OF_PAPER gracefully? ;-)

Tsaukpaetra

@cvi said in WTF Bites:

@Tsaukpaetra So, am I to understand that UE doesn't deal with ERROR_OUT_OF_PAPER gracefully? ;-)

It doesn't deal with E_FILE_NOT_FOUND gracefully!

Bulb

@Tsaukpaetra said in WTF Bites:

@Bulb said in WTF Bites:

@RaceProUK said in WTF Bites:

It exists because the 260-character limit predates the start of Windows being able to use networked and non-Windows file systems.

How does that make any sense? Lifting a limit is backward compatible. It does not require creating a new kind of input to which different limit will apply.

Programs still have that defined in the winapi headers in the Windows SDK as a hard limit. Apparently when allocating buffers for the result this assumption means it's NOT backwards compatible.

Well, that's two things.

For functions that just take the path, like CreateFile and FindFirstFile, which are the most common, it would be backward compatible, because the path is just read from a buffer the application already allocated.
For functions that return a path but do take the length of the buffer, it also is, because passing the length was always required, so the old applications do pass it; and there is already an error for a path that is too long. That error might now occur where it couldn't before, but that follows from the fact that files with long paths can exist, independent of how creating them is implemented, so the \\?\ hack does not help compared to just lifting the limit.
For FindFirstFile/FindNextFile the limit on a path component was not lifted, so the WIN32_FIND_DATA struct still works.
Is there even any function that returns a path and expects a MAX_PATH-long buffer?

(Our four …no … … I'll come in again.)

Apparently Microsoft did have fear of something breaking, or indication of some odd breakage in something, but I still don't see what it could be.
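
For reference, the "function returns a path into a caller-sized buffer" contract discussed above can be modelled in a few lines. This is only a sketch in plain Python, not the real Win32 API: `fill_path` and `read_path` are hypothetical names, and the return convention (characters written on success, required size including the terminator on failure) is an assumption modelled on how several Win32 path functions are documented to behave.

```python
def fill_path(source: str, buffer: list, size: int) -> int:
    """Hypothetical size-aware API: copy `source` into `buffer` if it fits.

    Returns the number of characters written (terminator excluded) on
    success, or the required size (terminator included) if `size` is
    too small. The buffer is left untouched in the failure case.
    """
    needed = len(source) + 1  # room for the trailing NUL
    if size < needed:
        return needed
    buffer[:needed] = list(source) + ["\0"]
    return len(source)


def read_path(source: str, initial_size: int = 260) -> str:
    """Caller side: start with a MAX_PATH-sized buffer, grow on demand."""
    size = initial_size
    while True:
        buf = ["\0"] * size
        n = fill_path(source, buf, size)
        if n < size:  # success: n excludes the terminator
            return "".join(buf[:n])
        size = n  # failure: n is the size the callee asked for


if __name__ == "__main__":
    long_path = "C:\\" + "a" * 300
    print(len(read_path(long_path)))
```

An old caller that only ever passes 260 simply gets the "too small" answer for a long path, which is the backward-compatible failure mode the post argues for.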

Tsaukpaetra

@Bulb Unreal Engine will fail to cook assets for the game if the destination path for cooked assets exceeds MAX_PATH, and the only way you know this is that it explicitly checks for this when cooking.

To try this theory out, I commented out the check, and lo and behold, it silently fails to save the file. Despite that, in theory, it could crawl down to the directory in question and save it just fine (after all, you can create the file normally in other programs, so it should be fine, right? Nope.).

Bulb

@Tsaukpaetra But that is actual breaking with the current hack. What I don't see is what would break if they did it by simply lifting the limit that is not broken with the hack.

blek

@blek Guess what, I just started getting their stupid e-mails again. FFFFFFUUUUUUUUU-

dkf

@Bulb said in WTF Bites:

What I don't see is what would break if they did it by simply lifting the limit that is not broken with the hack.

Look for structures that are usually declared on the stack and that contain a buffer whose size is set. Those are the big problems, as the runtime OS isn't in control of the size (it's fixed by the ABI at time of compilation).
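
A rough illustration of that point, using Python's ctypes to lay out a C-style struct. `FindDataSketch` is a made-up, heavily simplified stand-in (it is not the real WIN32_FIND_DATA layout), and `store_name` is a hypothetical helper; the only thing being shown is that the buffer size is baked into the type when it is defined.

```python
import ctypes

MAX_PATH = 260  # the value the thread is arguing about


class FindDataSketch(ctypes.Structure):
    """Simplified, hypothetical struct with an embedded fixed-size name buffer."""
    _fields_ = [
        ("attributes", ctypes.c_uint32),
        ("file_name", ctypes.c_wchar * MAX_PATH),
    ]


def store_name(record: FindDataSketch, name: str) -> None:
    """Copy `name` into the record, refusing anything the buffer cannot hold."""
    if len(name) >= MAX_PATH:  # keep one slot for the terminator
        raise ValueError("name does not fit in the fixed-size buffer")
    record.file_name = name


if __name__ == "__main__":
    record = FindDataSketch()
    store_name(record, "short.txt")
    # The size is a property of the type, fixed when the struct is defined.
    print(ctypes.sizeof(FindDataSketch))
```

A callee that wrote more than MAX_PATH characters into such a struct would run past memory the caller reserved, and no later OS update can change the size an already-compiled caller set aside.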

Bulb

@dkf But we are not talking about changing any of those! We are talking about:

functions that get data from the user accepting more data than before,
functions that return data to a user-provided buffer, whose size they do know, filling in more data than before if the user-specified size is sufficient, and
if there is a function that expects a buffer of fixed size (I think none exist), providing an Ex version that takes the size of the target buffer.

Bulb

XML specification:

NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
Name ::= NameStartChar (NameChar)*

So what do we have here? Hm, U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER. As part of an identifier? Even at the start of it? Explicitly permitted? Seriously?

The presentation forms (starting at U+FDF0) don't really sound like a good idea either.

There is a “suggestion” that names should be in composed normal form, but Appendix J is non-normative. And string matching, which is normative, requires an exact match, including the form.
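
To make the complaint concrete, here is a small sketch that transcribes the Name production quoted above into a Python regular expression. The names `NAME_RE` and `is_xml_name` are mine, and the character ranges are copied from the grammar in this post rather than taken from any XML library.

```python
import re
import unicodedata

# NameStartChar ranges, transcribed from the production quoted above.
_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF"
    "\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds "-", ".", digits, U+00B7 and two more ranges.
_REST = _START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile(f"[{_START}][{_REST}]*")


def is_xml_name(text: str) -> bool:
    """True if `text` matches Name ::= NameStartChar (NameChar)*."""
    return NAME_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_xml_name("\u200Dfoo"))  # name starting with ZERO WIDTH JOINER
    composed = "\u00E9"              # e-acute as a single code point
    decomposed = "e\u0301"           # "e" followed by COMBINING ACUTE ACCENT
    print(is_xml_name(composed), is_xml_name(decomposed))
    print(composed == decomposed)
    print(unicodedata.normalize("NFC", decomposed) == composed)
```

Both spellings of the accented name pass the grammar, yet they only compare equal after normalization, which is the exact-match problem described above.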

anotherusername

@Bulb said in WTF Bites:

XML specification:

NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
Name ::= NameStartChar (NameChar)*

So what do we have here? Hm, U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER. As part of an identifier? Even at the start of it? Explicitly permitted? Seriously?

The ZWNJ and ZWJ characters weren't just meant to be invisible gotchas. They're necessary to make characters render properly in ligature or non-ligature form in some languages. For identifiers to be written properly in those languages, they're required.

wharrgarbl

@Bulb said in WTF Bites:

How does that make any sense? Lifting a limit is backward compatible. It does not require creating a new kind of input to which different limit will apply.

It's more than that: with that prefix the API is supposed to send your string raw to the filesystem, without doing something to it. (I dunno what "something" is, but it includes the "." and ".." expansion.)
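
Since the prefix skips that processing, a caller who wants the prefixed form has to do the "." and ".." resolution first. A minimal sketch, assuming the input is already an absolute Windows path; `to_extended_length` is a hypothetical helper, and the \\?\UNC\ form for network shares follows the naming documentation linked earlier. It uses `ntpath` for pure string handling, so it never touches the disk.

```python
import ntpath

EXT_PREFIX = "\\\\?\\"     # the literal characters \\?\
DEVICE_PREFIX = "\\\\.\\"  # the literal characters \\.\


def to_extended_length(path: str) -> str:
    """Return an absolute Windows path in extended-length form."""
    # Already prefixed (or a device path such as \\.\COM56): leave it alone.
    if path.startswith((EXT_PREFIX, DEVICE_PREFIX)):
        return path
    # Resolve "." and ".." ourselves, because the prefix disables that step.
    normalized = ntpath.normpath(path)
    if normalized.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return EXT_PREFIX + "UNC\\" + normalized[2:]
    return EXT_PREFIX + normalized


if __name__ == "__main__":
    print(to_extended_length("C:\\a\\.\\b\\..\\c"))
    print(to_extended_length("\\\\server\\share\\x"))
```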

wharrgarbl

A bunch of coworkers started using @-mention syntax in email :annoyed:

Tsaukpaetra

@wharrgarbl said in WTF Bites:

A bunch of coworkers started using @-mention syntax in email :annoyed:

It's not syntax if the system it's used in doesn't parse it. :P

anotherusername

@Tsaukpaetra said in WTF Bites:

@wharrgarbl said in WTF Bites:

A bunch of coworkers started using @-mention syntax in email :annoyed:

It's not syntax if the system it's used in doesn't parse it. :P

Ultimately I think the system of the reader's brain will parse it correctly, even if nothing does any special parsing of it prior to that.

anonymous234

Most of the Python standard library is pretty good, but every now and then you find some shockingly bad omissions.

subprocess — Subprocess management

Source code: Lib/subprocess.py The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace seve...

The subprocess module is meant to be THE way to spawn new processes and communicate with them. It was introduced in version 2.4 to replace the half-dozen other ways that existed before. So it's designed to be very flexible to be able to handle every use case.

Sadly it doesn't seem to handle my use case: run a process and get its output in real time.

No, seriously.

The "main" interface is the function run(), which waits until the process completes before returning. So clearly we can't use that.

The documentation says to use Popen() for all purposes that run() can't handle. So yes, you can launch a background process and establish a pipe to the Python program with that:

p = subprocess.Popen("./processname", stdout=subprocess.PIPE)

But how do you read that output?

The available function is p.communicate(), which waits until the process completes before returning!

So the only remaining thing is p.stdout.read(), which there's a big scary warning telling you not to use.

Why even include functions but then claim it's unsafe to use them? If they are always unsafe, don't include them in the documentation, otherwise explain how to use them safely.

As far as I can tell, it's OK to use it here since I didn't redirect stderr so it can't fill up. Even then, getting Python to just read what's available without blocking is not trivial (you have to set bufsize=0 and call read() with any positive integer).
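
One pattern that seems to cover the simple case is iterating over the pipe line by line. This is a sketch, not an official recipe: `stream_lines` is a hypothetical helper, it assumes only stdout is piped (so stderr cannot fill up), and it only looks "real time" if the child flushes its output, which is why the demo child is started with `-u`.

```python
import subprocess
import sys


def stream_lines(cmd):
    """Yield the child's stdout one line at a time, as lines arrive."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        text=True,   # decode bytes to str
        bufsize=1,   # line-buffered on the parent's side
    ) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # -u makes the demo child write unbuffered, so lines show up immediately.
    child = [sys.executable, "-u", "-c",
             "import time\nfor i in range(3):\n    print(i)\n    time.sleep(0.2)"]
    for line in stream_lines(child):
        print("got:", line)
```

Each iteration still blocks until a full line is available, so this is not the non-blocking "read whatever is there" behaviour described above, just incremental reading.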

izzion

@Tsaukpaetra
Given that Outlook-current does parse @mentions...

Tsaukpaetra

@izzion said in WTF Bites:

@Tsaukpaetra
Given that Outlook-current does parse @mentions...

WAT. Newfangled wharblegarbles!

Dreikin

@Tsaukpaetra said in WTF Bites:

@izzion said in WTF Bites:

@Tsaukpaetra
Given that Outlook-current does parse @mentions...

WAT. Newfangled @wharrgarbl‍s!

FTFY.

dkf

@anonymous234 said in WTF Bites:

If they are always unsafe

They aren't always unsafe, but you have to use them in the right way and many people don't. There are lots of other gotchas too (e.g., most programs change their buffering according to if they are writing to a terminal or pipe). Monodirectional pipelines — where you either only write into the front or only read from the end — are much easier to work with, yet are also less powerful.

Medinoc

@dkf said in WTF Bites:

(e.g., most programs change their buffering according to if they are writing to a terminal or pipe). Monodirectional pipelines — where you either only write into the front or only read from the end — are much easier to work with, yet are also less powerful.

In fact, the standard C output stream on Windows does this: It's not line-buffered when writing to a pipe (unless the callee program explicitly does something about this) so you can't just spawn any vanilla program and expect to exchange command lines and responses with it (you'll end up waiting for an output that will never come, being stuck in the buffer).

I was pretty cross when I first experienced this problem.

dkf

@Medinoc said in WTF Bites:

@dkf said in WTF Bites:

(e.g., most programs change their buffering according to if they are writing to a terminal or pipe). Monodirectional pipelines — where you either only write into the front or only read from the end — are much easier to work with, yet are also less powerful.

In fact, the standard C output stream on Windows does this

Not just on Windows. It's pretty common on all platforms, and actually substantially accelerates pipeline processing by greatly cutting the number of system calls, at least in the most common cases.

I was pretty cross when I first experienced this problem.

So was I. You need a tool like Expect to work around it. That works a bit differently on various platforms — on real POSIX systems it uses virtual terminals (a pretty finite resource) to do the tricky bits, and on Windows I think it uses the debugging API (this is one of the relatively small number of areas where POSIX is much easier at the API level in a truly non-trivial way) — but it basically tricks programs into working in interactive mode.
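
The terminal-versus-pipe distinction is easy to observe from Python. A small sketch (`child_sees_tty` is a hypothetical helper): the child reports whether its stdout looks like a terminal, and behind a pipe the answer is no, which is the condition many runtimes use to switch to block buffering.

```python
import subprocess
import sys


def child_sees_tty() -> bool:
    """Run a child behind a pipe and ask whether its stdout is a terminal."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("child thinks stdout is a terminal:", child_sees_tty())
```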

PleegWat

@anonymous234 said in WTF Bites:

As far as I can tell, it's OK to use it here since I didn't redirect stderr so it can't fill up. Even then, getting Python to just read what's available without blocking is not trivial (you have to set bufsize=0 and call read() with any positive integer).

Wait a minute, popen() opens multiple pipes?

In every language I worked with, popen() establishes only one pipe (either writing to the other process's stdin, or reading from the other process's stdout). The other descriptor and stderr are inherited. This limits you to one-way communication, but suffices for many usecases.

When there are multiple pipes to the same process, you do indeed need nonblocking I/O with select() or similar to prevent deadlocks.

That communicate() function sounds weird though. I guess callbacks are involved?

anonymous234

@PleegWat said in WTF Bites:

Wait a minute, popen() opens multiple pipes?
In every language I worked with, popen() establishes only one pipe (either writing to the other process's stdin, or reading from the other process's stdout). The other descriptor and stderr are inherited. This limits you to one-way communication, but suffices for many usecases.

You can specify to open a pipe for stdin, stdout and/or stderr. So anywhere between 0 and 3 pipes. By default it doesn't open any.

Communicate() supposedly reads from stdout and stderr while writing to stdin simultaneously to prevent deadlocks (though obviously only for the pipes you opened).

Internally, it seems to create a reader thread for each open pipe, then waits for the threads to finish or for a timeout to expire.
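
The thread-per-pipe approach is simple enough to sketch by hand. This is an illustration of the idea, not a copy of what communicate() actually does; `run_capture_both` is a hypothetical helper and it omits stdin and timeouts entirely.

```python
import subprocess
import sys
import threading


def run_capture_both(cmd):
    """Drain stdout and stderr concurrently so neither pipe can fill up."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    captured = {}

    def drain(stream, key):
        # Blocks until the child closes this pipe, then records everything.
        captured[key] = stream.read()
        stream.close()

    threads = [
        threading.Thread(target=drain, args=(proc.stdout, "stdout")),
        threading.Thread(target=drain, args=(proc.stderr, "stderr")),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return proc.wait(), captured["stdout"], captured["stderr"]


if __name__ == "__main__":
    # The child writes far more than a typical pipe buffer to each stream.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write('o' * 200000); "
             "sys.stderr.write('e' * 200000)"]
    code, out, err = run_capture_both(child)
    print(code, len(out), len(err))
```

Reading the two pipes one after the other in a single thread could hang here, because the child blocks on the second stream while the parent is still waiting for the first to close.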

cvi

The new Mass Effect multiplayer apparently has some balance issues ... and bugs. For example, this:

Right now (patch 1.05) the host's single player difficulty setting determines the base combo damage for all players present.

BrisingrAerowing

@cvi

How the fuck does THAT not get noticed in testing?

Did they even DO any testing?

Yamikuronue

@BrisingrAerowing said in WTF Bites:

How the fuck does THAT not get noticed in testing?

Clean installs with no single-player data.

BrisingrAerowing

@Yamikuronue OK, makes sense.

One would think that testing with both kinds of data would be performed, but...

Yamikuronue

@BrisingrAerowing You'd think....

boomzilla

If you're going to use spaces to indent: USE MORE THAN ONE PER LEVEL.

dkf

@boomzilla In fact, use more than two. Really.

boomzilla

@dkf Well, I actually like using 2 spaces with SQL. But with code just use tabs like a sane person.

RaceProUK

@boomzilla said in WTF Bites:

I actually like using 2 spaces with SQL

I use 10 spaces with SQL; anything less just looks weird.

anonymous234

@boomzilla Four shall be the number of spaces thou shalt add, and the number of the spaces shall be four. Eight is right out.

Onyx

If we could all just use tabs so I can just do :set tabstop=7 if it strikes my fancy without getting into anybody's way...

coderpatsy

@Onyx said in WTF Bites:

If we could all just use tabs so I can just do :set tabstop=7 if it strikes my fancy without getting into anybody's way...

Idea: a setting like that that applies to spaces too. Like the actual line will still contain the appropriate whitespace per line, but it's just displayed as 7 spaces wide (or whatever you set it at).

Probably would invite all kinds of Fun though...
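
A rough sketch of what such a display setting might do, assuming the editor knows the stored indent width. `redisplay_indent` is a hypothetical helper; it treats whole multiples of the stored width as indentation levels and passes any leftover spaces through unchanged, which is a guess, since nothing in the text itself says whether those spaces are indentation or alignment.

```python
def redisplay_indent(line: str, stored: int = 4, shown: int = 7) -> str:
    """Render leading spaces as if each indent level were `shown` wide."""
    body = line.lstrip(" ")
    leading = len(line) - len(body)
    # Whole levels get rescaled; the remainder is assumed to be alignment.
    levels, remainder = divmod(leading, stored)
    return " " * (levels * shown + remainder) + body


if __name__ == "__main__":
    source = ["def f():", "    if x:", "        return 1"]
    for text in source:
        print(redisplay_indent(text))
```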

TimeBandit

@Onyx said in WTF Bites:

If we could all just use tabs

https://www.youtube.com/watch?v=siaxGjttoVM

LB_

@coderpatsy it is non-trivial to differentiate indentation from alignment. That's why humans need to first indent with tabs and then align with spaces.

Dreikin

@Onyx said in WTF Bites:

If we could all just use tabs so I can just do :set tabstop=7 if it strikes my fancy without getting into anybody's way...

The language I'm building in my head only allows tabs for indenting, followed by spaces for alignment only in certain circumstances (line continuations, mostly).

cvi

@anonymous234 said in WTF Bites:

@boomzilla Four shall be the number of spaces thou shalt add, and the number of the spaces shall be four. Eight is right out.

Except in Fortran. Column 8 is where the code starts.

Related: I stumbled across this today:

...
C
  170 RETURN
C     ********** LAST CARD OF GMCEVS **********
      END SUBROUTINE

Yes, this code runs in 2017. But not in 2018 if I can help it.