Big list of software that cannot handle spaces or accents in paths
-
@bulb the official name is UTF-16. The implementation is UCS-2 with surrogates slapped on top. So use UTF-16 if you want established names, or UCS-2 if you want to highlight the difference between it and actual UTF-16 - but please don't make up new words.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@bulb the official name is UTF-16. The implementation is UCS-2 with surrogates slapped on top. So use UTF-16 if you want established names, or UCS-2 if you want to highlight the difference between it and actual UTF-16 - but please don't make up new words.
But it is not either. UTF-16 means lone surrogates are an error, but for backward compati(de)bility reasons they are not and UCS-2 means surrogates are not interpreted at all, but they are.
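To make the difference concrete, here is a minimal Python sketch. It assumes Python's `utf-16-le` codec as a stand-in for the Unicode consortium's UTF-16, and the `surrogatepass` error handler as a rough approximation of what Windows accepts. The sample strings are only illustrative.

```python
# A well-formed supplementary character becomes a surrogate pair in UTF-16.
emoji = "\U0001F600"
pair = emoji.encode("utf-16-le")  # code units D83D DE00, little-endian bytes

# A lone high surrogate is not valid UTF-16, so the strict codec refuses it.
lone = "\ud800"
try:
    lone.encode("utf-16-le")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# "surrogatepass" lets the lone surrogate through as a raw 16-bit unit,
# which is roughly how the Windows flavour behaves for file names.
raw = lone.encode("utf-16-le", "surrogatepass")
roundtrip = raw.decode("utf-16-le", "surrogatepass")
```

So the pair is interpreted (unlike UCS-2) and the lone surrogate survives (unlike strict UTF-16), which is exactly the in-between state being argued about.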
-
@bulb Microsoft says it's UTF-16. Suck it.
-
@gąska Microsoft is not the one defining what “UTF-16” means.
-
@bulb Microsoft is the one defining what encoding Windows uses. They made some encoding and called it UTF-16, and have the name conflict way up their asses. So Windows absolutely uses UTF-16, even though it's not the same UTF-16 as the one people usually think of when they hear UTF-16.
-
@gąska Ok, then I guess everybody else should be calling it “Microsoft-calls-it-UTF-16”, because they are using the Unicode consortium definition of “UTF-16”.
-
@bulb I'm okay with that.
-
@laoc said in Big list of software that cannot handle spaces or accents in paths:
Yeah, many OSs can't all follow one design.
The problem isn't that POSIX/Linux is a different design, the problem is that it's a bad design that makes it virtually impossible to write 100% correct code.
The only reason anything in that environment works now is because people "just know" to not put carriage returns, "-r" or unprintable characters in filenames. But the system doesn't do anything to prevent self-foot-shooting, and technically any program that can't deal with a carriage return in a filename (which is like... 90% of them) is incorrect.
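A rough sketch of why this bites, in Python and without touching a real filesystem. The names and helper functions are made up for illustration. I use a newline here, but the same applies to carriage returns and other control characters.

```python
# Three perfectly legal POSIX file names.
names = ["report.txt", "evil\nname.txt", "-r"]

def parse_listing_by_lines(listing):
    # What most scripts do: assume one name per line.
    return listing.split("\n")

def parse_listing_by_nul(listing):
    # NUL is one of the two bytes that can never appear in a name
    # (the other is "/"), so it is a safe separator.
    return listing.split("\0")

def as_safe_argument(name):
    # Prefix "./" so a name starting with "-" is not taken for an option.
    return "./" + name if name.startswith("-") else name

line_listing = "\n".join(names)
nul_listing = "\0".join(names)

broken = parse_listing_by_lines(line_listing)   # 4 entries for 3 files
correct = parse_listing_by_nul(nul_listing)     # the original 3 names
```

Every program that takes the first route is technically wrong, and the second route only works if everything in the pipeline agrees to use NUL.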
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@laoc said in Big list of software that cannot handle spaces or accents in paths:
Yeah, many OSs can't all follow one design.
The problem isn't that POSIX/Linux is a different design, the problem is that it's a bad design that makes it virtually impossible to write 100% correct code.
The only reason anything in that environment works now is because people "just know" to not put carriage returns, "-r" or unprintable characters in filenames. But the system doesn't do anything to prevent self-foot-shooting, and technically any program that can't deal with a carriage return in a filename (which is like... 90% of them) is incorrect.
You know the fun? POSIX never ever required these characters to be permitted. It requires something like letters, numbers, period and dash in non-leading position and that's about it (I see it's mentioned in the article, too). It's just that the filesystem authors don't care about exercising their right to forbid the problematic ones.
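For reference, a small sketch of what a check against the POSIX portable filename character set could look like. The set is letters, digits, period, underscore and hyphen, with the hyphen not allowed as the first character. The function name `is_portable_filename` is hypothetical; no real filesystem exposes it.

```python
import re

# POSIX portable filename character set: A-Z a-z 0-9 . _ -
# plus the rule that a hyphen must not be the first character.
_PORTABLE = re.compile(r"[A-Za-z0-9._][A-Za-z0-9._-]*")

def is_portable_filename(name):
    """Return True if name uses only the portable character set."""
    return _PORTABLE.fullmatch(name) is not None
```

Note that this rejects spaces and accents too, so a filesystem that enforced only the required minimum would be far stricter than anyone in this thread wants.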
-
@bulb said in Big list of software that cannot handle spaces or accents in paths:
@gąska Ok, then I guess everybody else should be calling it “Microsoft-calls-it-UTF-16”, because they are using the Unicode consortium definition of “UTF-16”.
I think WTF-16 is a nice abbreviation for “Microsoft-calls-it-UTF-16”. Easy to write and to say, lots of keystrokes and syllables saved, and not that obtuse.
-
@bulb Yup. It's shitty because everybody in the Linux world is a fucking coward.
The funny thing is when people like me thought "oh Ubuntu is going to swoop in and fix all this shit" and they then proceeded to do... literally nothing at all and within 3 years they were just another Red Hat with a slightly better UI theme. Because even they were too cowardly to make any real fixes.
-
@bulb said in Big list of software that cannot handle spaces or accents in paths:
It's just that the filesystem authors don't care about exercising their right to forbid the problematic ones.
Why would they? It's kernel's job to tell the user to go fuck themselves with their unprintable characters.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@bulb Microsoft is the one defining what encoding Windows uses. They made some encoding and called it UTF-16, and have the name conflict way up their asses. So Windows absolutely uses UTF-16, even though it's not the same UTF-16 as the one people usually think of when they hear UTF-16.
I feel that's a very reasonable argument.
But I'm using a definition of reasonable that's not the same as most people would usually think.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
It's kernel's job
It's the filesystem implementation's job. (That doesn't need to actually be inside the kernel; it's privileged code, but not very privileged.) Filesystems that care (e.g., an old implementation of FAT16) can indeed say that, but most stick largely to “hey, we just ship these bytes around” and might have a metadata field somewhere in the FS header that says what the meaning of those bytes should be so that the kernel can provide services to help bridge things. (In that case, the kernel itself is acting as library for those FS implementations that are in the kernel memory space; user-space code can use other libraries.)
What I find fascinating about all this is that there are so many developers who say that doing the more complicated thing (case folding, etc.) in the kernel is obviously the right thing, when in practice it isn't because of just how weird human languages are. When someone suggests making the OS simpler and having it do less (leaving all the awkward cases to the GUI) then they get all angry. It's like they have no idea of what a sensible engineering solution is (which “minimise complexity in privileged code” sounds like to me) and so want to thrust all their own bad practices on others…
-
@dkf said in Big list of software that cannot handle spaces or accents in paths:
@gąska said in Big list of software that cannot handle spaces or accents in paths:
It's kernel's job
It's the filesystem implementation's job.
No, it's kernel's job. Or whatever is in between the application and the filesystem (usually it's just kernel). Special characters are very problematic even if the filesystem itself can deal with them just fine - and if we leave it to filesystem implementations, we'll have a different set of rules (due to bugs and personal preferences) for each filesystem within the same operating system, which absolutely sucks. Pushing something that is not the filesystem driver's problem onto the filesystem driver is a very bad idea.
@dkf said in Big list of software that cannot handle spaces or accents in paths:
What I find fascinating about all this is that there are so many developers who say that doing the more complicated thing (case folding, etc.) in the kernel is obviously the right thing, when in practice it isn't because of just how weird human languages are. When someone suggests making the OS simpler and having it do less (leaving all the awkward cases to the GUI) then they get all angry.
In a perfect universe where filenames are only ever used to show them in UI, we could have UI handling filenames from start to finish. But someone in 1956 made the stupid decision to make the filename also the unique identifier which everything - from kernel through headless services up to GUI application and all its non-GUI parts - uses to refer to a particular file, and now we're forever stuck with having to deal with filenames at every abstraction level above "a partition is bunch of bytes read and written to the storage device". If there are some characters that the system cannot handle sensibly in filenames, the system should disallow creation of any files with those characters in name. It should do it at the highest level possible where it can check literally every file operation by every single program running - which usually leaves only the kernel.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
Why would they? It's kernel's job to tell the user to go fuck themselves with their unprintable characters.
It doesn't matter who does it, it only matters that nobody's done it.
-
@blakeyrat well, not exactly. See, if you support multiple filesystems, and each has its own function to validate filenames, it's not going to end well.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
In a perfect universe where filenames are only ever used to show them in UI, we could have UI handling filenames from start to finish. But someone in 1956 made the stupid decision to make the filename also the unique identifier which everything - from kernel through headless services up to GUI application and all its non-GUI parts - uses to refer to a particular file, and now we're forever stuck with having to deal with filenames at every abstraction level above "a partition is bunch of bytes read and written to the storage device".
Another example of shitty design.
Macintosh was designed to refer to files by their identifier. The filename was just yet another piece of meta-data. So you could do reasonable stuff like this:
- Open "foo.doc" in Word
- Rename "foo.doc" to "bar.doc" in Finder
- Make changes and hit Save in Word
- Your changes are saved to "bar.doc" where they belong, instead of Word writing a new "foo.doc" which is wrong and confusing, like what happens in every other shitty OS.
Systems that treat NAMES of things as IDENTIFIERS of those things, especially things that are expected to be renamed often, are the WTF.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat well, not exactly. See, if you support multiple filesystems, and each has its own function to validate filenames, it's not going to end well.
Nobody said writing an OS was going to be easy. Now stop with the excuses, and fix the bullshit.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
Macintosh was designed to refer to files by their identifier.
Wait, really? Suddenly Mac Classic became my favorite operating system, despite never owning any Mac - whether old or new.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat well, not exactly. See, if you support multiple filesystems, and each has its own function to validate filenames, it's not going to end well.
Nobody said writing an OS was going to be easy. Now stop with the excuses, and fix the bullshit.
That's not what I meant. I meant that a fix at filesystem level is going to be half-assed no matter how you do it. They should definitely fix it, but doing things a little bit less wrong than before isn't a real fix. It's just a bandaid similar to the "fix" for ***\****.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
if we leave it to filesystem implementations, we'll have a different set of rules (due to bugs and personal preferences) for each filesystem within the same operating system, which absolutely sucks.
Haven't we had discussions on weird problems with files on removable disks because mac os x normalizes? I don't recall the details.
-
@gąska The thing is, the end user doesn't give a shit where it's fixed. The important thing is it's broken now, and no Linux developer has ever had:
- The courage to say "we're making this change, even though it might break some weird broken-ass shit you were using which used filenames as a database or some shit" or
- The dedication to actually research if any Linux installs rely on files with carriage returns in them so as to get enough data to prove conclusively that fixing this wouldn't hurt anything.
So it's either cowardice or laziness.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
Wait, really? Suddenly Mac Classic became my favorite operating system, despite never owning any Mac - whether old or new.
Yeah but it all broke down when it started getting software lazily-ported from shittier OSes by developers who didn't bother spending the few hours to learn how Macintosh was designed to work. That was well under way before Apple threw it all away on purpose and made their OS a shitty skin on some shitty Linux-y shit.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@gąska The thing is, the end user doesn't give a shit where it's fixed.
But the user cares if the fix works. A fix at filesystem level has a high chance of not working, or even worse - not working consistently.
A half-assed fix is better than no fix (usually, YMMV). But it's still half-assed.
-
@gąska Ok sure whatever but the debate is academic since it'll never get fixed.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
Macintosh was designed to refer to files by their identifier. The filename was just yet another piece of meta-data. So you could do reasonable stuff like this:
You may already know this, but the Win32 API allows you to refer to files by their identifier as well.
... just no one does it. :(
-
@heterodox said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
Macintosh was designed to refer to files by their identifier. The filename was just yet another piece of meta-data. So you could do reasonable stuff like this:
You may already know this, but the Win32 API allows you to refer to files by their identifier as well.
... just no one does it. :(
I've never seen this. Is this some api in the kernel layer?
-
@mikehurley said in Big list of software that cannot handle spaces or accents in paths:
I've never seen this. Is this some api in the kernel layer?
I think C itself is designed that way, isn't it? If you open a file and keep around the file descriptor, you have an ID to it which is entirely separated from the file's name. It's been ages since I did C.
-
@heterodox said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
Macintosh was designed to refer to files by their identifier. The filename was just yet another piece of meta-data. So you could do reasonable stuff like this:
You may already know this, but the Win32 API allows you to refer to files by their identifier as well.
But can I traverse directories using only identifiers? Can I retrieve current working directory as an identifier? Can I receive files from drag'n'drop as identifiers?
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@mikehurley said in Big list of software that cannot handle spaces or accents in paths:
I've never seen this. Is this some api in the kernel layer?
I think C itself is designed that way, isn't it? If you open a file and keep around the file descriptor, you have an ID to it which is entirely separated from the file's name. It's been ages since I did C.
But to open a file and receive file descriptor, I have to provide the name.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
But to open a file and receive file descriptor, I have to provide the name.
Right; that wasn't true in Macintosh because the only way to get the file was to use the OS' open dialog or respond to a drag&drop or AppleScript event. And in all those cases, IIRC, you got the file's descriptor and not the path. But it's obviously been ages since I wrote any Macintosh programs.
The idea that a human being would ever see or, even worse, have to type in a path was completely alien to that platform.
(You COULD actually construct a path and open based on that but it was the "wrong way" to do things.)
-
@mikehurley said in Big list of software that cannot handle spaces or accents in paths:
I've never seen this. Is this some api in the kernel layer?
File descriptors are temporary; this is permanent.
-
@blakeyrat the worst part is, I thought I was being innovative when I came up with all of this in my concept of a perfect operating system some time ago.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat the worst part is, I thought I was being innovative when I came up with all of this in my concept of a perfect operating system some time ago.
We're in an industry where nobody stands on the shoulders of giants. Nobody can improve anything because they're too busy reinventing ideas that someone else already tried and rejected 20 years ago.
You've heard this rant from me before.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@gąska said in Big list of software that cannot handle spaces or accents in paths:
In a perfect universe where filenames are only ever used to show them in UI, we could have UI handling filenames from start to finish. But someone in 1956 made the stupid decision to make the filename also the unique identifier which everything - from kernel through headless services up to GUI application and all its non-GUI parts - uses to refer to a particular file, and now we're forever stuck with having to deal with filenames at every abstraction level above "a partition is bunch of bytes read and written to the storage device".
Another example of shitty design.
Macintosh was designed to refer to files by their identifier. The filename was just yet another piece of meta-data. So you could do reasonable stuff like this:
- Open "foo.doc" in Word
- Rename "foo.doc" to "bar.doc" in Finder
- Make changes and hit Save in Word
- Your changes are saved to "bar.doc" where they belong, instead of Word writing a new "foo.doc" which is wrong and confusing, like what happens in every other shitty OS.
Systems that treat NAMES of things as IDENTIFIERS of those things, especially things that are expected to be renamed often, are the WTF.
Fuck that. What happens when someone else renames the file? I want the file to stay where it is.
-
@zecc said in Big list of software that cannot handle spaces or accents in paths:
Fuck that. What happens when someone else renames the file?
Well in a modern system, don't give them permissions to it if you don't want them to dick with it.
In the old Macintosh system, yes there were not really any enforced permissions so this could happen.
@zecc said in Big list of software that cannot handle spaces or accents in paths:
I want the file to stay where it is.
The file isn't its name or its path. (Also its name isn't its path.)
If I rename rose.png to turd.png it still smells as sweet.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
The idea that a human being would ever see or, even worse, have to type in a path was completely alien to that platform.
Thankfully we do have at least one modern widely-used filesystem that works properly: Google Drive. Filenames can contain any characters at all, the concept of a path doesn't make sense and you never need to type it, a file can exist in multiple directories at once, etc. although I think unfortunately a file can only have one name, changing the name in one place changes it everywhere :(
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat the worst part is, I thought I was being innovative when I came up with all of this in my concept of a perfect operating system some time ago.
We're in an industry where nobody stands on the shoulders of giants. Nobody can improve anything because they're too busy reinventing ideas that someone else already tried and rejected 20 years ago.
When you word it like that, it sounds like you're saying my idea is bad because Apple has tried and rejected it already.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@zecc said in Big list of software that cannot handle spaces or accents in paths:
Fuck that. What happens when someone else renames the file?
Well in a modern system, don't give them permissions to it if you don't want them to dick with it.
There are legitimate cases where you actually want them to be able to rename files you're editing.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
If I rename rose.png to turd.png it still smells as sweet.
Sure, but— actually I can't argue against turd.png being more memorable.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@bulb said in Big list of software that cannot handle spaces or accents in paths:
It's just that the filesystem authors don't care about exercising their right to forbid the problematic ones.
Why would they? It's kernel's job to tell the user to go fuck themselves with their unprintable characters.
I didn't say “drivers” specifically—I kinda included the virtual filesystem layer authors in it.
In the Linux architecture, the virtual filesystem layer should indeed provide default filters for it (that the driver would have to be able to override for ((ass-)backward-)compati(de)bility reasons).
@dkf said in Big list of software that cannot handle spaces or accents in paths:
When someone suggests making the OS simpler and having it do less (leaving all the awkward cases to the GUI) then they get all angry.
When you leave it up to the GUI, you'll
- get inconsistencies between different applications and
- it will be less efficient as the GUI will have to copy parts of the dentry cache that already exists in the kernel (I am assuming it should be a normalization-preserving system—the normalization-enforcing behaviour of OSX would be simpler, but it turned out to be pain in the arse in practice).
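To illustrate what normalization means here, a minimal Python sketch using the standard `unicodedata` module. The file name and the `same_name` helper are just examples I made up. It shows two spellings of the same visible name that differ byte for byte, which is what a normalization-preserving system would have to compare as equal while still storing whichever form the user typed.

```python
import unicodedata

# "ą" as a single precomposed code point (NFC form).
composed = "g\u0105ska.txt"
# The same visible name as "a" followed by a combining ogonek (NFD form).
decomposed = unicodedata.normalize("NFD", composed)

# A byte-shipping filesystem sees two different names.
same_bytes = composed.encode("utf-8") == decomposed.encode("utf-8")

def same_name(a, b):
    # Normalization-insensitive comparison: normalize both sides for
    # the lookup only, leaving the stored form untouched.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

If every GUI toolkit implements its own `same_name`, they will disagree sooner or later, which is the inconsistency from the first bullet.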
-
@bulb said in Big list of software that cannot handle spaces or accents in paths:
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@bulb said in Big list of software that cannot handle spaces or accents in paths:
It's just that the filesystem authors don't care about exercising their right to forbid the problematic ones.
Why would they? It's kernel's job to tell the user to go fuck themselves with their unprintable characters.
I didn't say “drivers” specifically—I kinda included the virtual filesystem layer authors in it.
Right; forgot that filesystem, like UTF-16, means different things to different people.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
There are legitimate cases where you actually want them to be able to rename files you're editing.
More importantly, since a file's name isn't the file, it makes no sense that renaming a file you're editing would create a new file at the old name.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@mikehurley said in Big list of software that cannot handle spaces or accents in paths:
I've never seen this. Is this some api in the kernel layer?
I think C itself is designed that way, isn't it? If you open a file and keep around the file descriptor, you have an ID to it which is entirely separated from the file's name. It's been ages since I did C.
But to open a file and receive file descriptor, I have to provide the name.
But once you have the descriptor, you don't need the name anymore. And you can get a descriptor in other ways, like via a unix socket, but that's pretty rare.
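A quick sketch of that, assuming a POSIX system (on Windows, renaming a file that is open may fail depending on how it was opened). The file names are borrowed from the foo.doc/bar.doc example earlier in the thread, and `write_after_rename` is just an illustrative helper.

```python
import os
import tempfile

def write_after_rename():
    """Open foo.doc, rename it to bar.doc, then write via the descriptor."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "foo.doc")
        new = os.path.join(d, "bar.doc")
        fd = os.open(old, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            inode_before = os.fstat(fd).st_ino
            os.rename(old, new)               # someone renames the file
            os.write(fd, b"saved after rename")
            inode_after = os.fstat(fd).st_ino
        finally:
            os.close(fd)
        with open(new, "rb") as f:
            content = f.read()
        return inode_before == inode_after, os.path.exists(old), content

same_inode, old_exists, content = write_after_rename()
```

The write lands in bar.doc and no new foo.doc appears, as long as the program keeps the descriptor instead of reopening by name on every save.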
-
@pleegwat either you never ever have to use filenames for anything at all, or you have to be wary of this naming shitfest. About 100% of software falls in the second category.
-
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
@bulb Yup. It's shitty because everybody in the Linux world is a fucking coward.
Are they garbage people?
The funny thing is when people like me thought "oh Ubuntu is going to swoop in and fix all this shit" and they then proceeded to do... literally nothing at all and within 3 years they were just another Red Hat with a slightly better UI theme. Because even they were too cowardly to make any real fixes.
You spend so much time wringing your hands over this non-problem. Anyways, I know the magic words to make you defend it: Backwards compatibility.
-
@lb_ said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
The idea that a human being would ever see or, even worse, have to type in a path was completely alien to that platform.
Thankfully we do have at least one modern widely-used filesystem that works properly: Google Drive. Filenames can contain any characters at all, the concept of a path doesn't make sense and you never need to type it, a file can exist in multiple directories at once, etc. although I think unfortunately a file can only have one name, changing the name in one place changes it everywhere :(
I find it infuriatingly difficult to find stuff in there.
-
@boomzilla said in Big list of software that cannot handle spaces or accents in paths:
@lb_ said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
The idea that a human being would ever see or, even worse, have to type in a path was completely alien to that platform.
Thankfully we do have at least one modern widely-used filesystem that works properly: Google Drive. Filenames can contain any characters at all, the concept of a path doesn't make sense and you never need to type it, a file can exist in multiple directories at once, etc. although I think unfortunately a file can only have one name, changing the name in one place changes it everywhere :(
I find it infuriatingly difficult to find stuff in there.
I find it difficult to find stuff anywhere. Seriously, it feels like if you don't spend the extra effort to keep your documents organized, your documents won't be organized.
-
@gąska said in Big list of software that cannot handle spaces or accents in paths:
@boomzilla said in Big list of software that cannot handle spaces or accents in paths:
@lb_ said in Big list of software that cannot handle spaces or accents in paths:
@blakeyrat said in Big list of software that cannot handle spaces or accents in paths:
The idea that a human being would ever see or, even worse, have to type in a path was completely alien to that platform.
Thankfully we do have at least one modern widely-used filesystem that works properly: Google Drive. Filenames can contain any characters at all, the concept of a path doesn't make sense and you never need to type it, a file can exist in multiple directories at once, etc. although I think unfortunately a file can only have one name, changing the name in one place changes it everywhere :(
I find it infuriatingly difficult to find stuff in there.
I find it difficult to find stuff anywhere. Seriously, it feels like if you don't spend the extra effort to keep your documents organized, your documents won't be organized.
But you can search!