Actually using unicode in the filepath
-
Just looking for something on my computer and found this gem:
C:\Users\Helix\AppData\Local\Logitech® Webcam Software
I guess since they have '®' in the path they assume it is only installed on NTFS systems.
-
@Helix said in Actually using unicode in the filepath:
I guess since they have '®' in the path they assume it is only installed on NTFS systems.
® is in Latin-1 and CP1252. Is that really a problem with FAT? Holy fuck.
-
@LaoC said in Actually using unicode in the filepath:
Latin-1 and CP1252
Sorry, just looked up:
"On newer file systems, such as NTFS, exFAT, UDFS, and FAT32, Windows stores the long file names on disk in Unicode, which means that the original long file name is always preserved. This is true even if a long file name contains extended characters, regardless of the code page that is active during a disk read or write operation.
Files using long file names can be copied between NTFS file system partitions and Windows FAT file system partitions without losing any file name information. This may not be true for the older MS-DOS FAT and some types of CDFS (CD-ROM) file systems, depending on the actual file name. In this case, the short file name is substituted if possible."
For some reason I thought that only NTFS supported Unicode.
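For anyone who wants to check the code page claim themselves, here is a minimal Python sketch. It only shows how '®' encodes in a few code pages; the choice of UTF-16LE as the concrete "Unicode" form for long names is my assumption, the quoted documentation just says "Unicode". The helper name `encodings_for` is made up for illustration.

```python
def encodings_for(ch):
    """Map each codec name to the bytes it uses for ch, or None if it has none."""
    result = {}
    for codec in ("latin-1", "cp1252", "cp437", "utf-16-le"):
        try:
            result[codec] = ch.encode(codec)
        except UnicodeEncodeError:
            # This code page has no slot for the character at all.
            result[codec] = None
    return result


table = encodings_for("®")
for codec, raw in table.items():
    print(codec, raw)
```

So '®' is a single byte in Latin-1 and CP1252, has no representation in the old DOS code page CP437, and is an ordinary two-byte unit in UTF-16LE.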
-
@Helix said in Actually using unicode in the filepath:
@LaoC said in Actually using unicode in the filepath:
Latin-1 and CP1252
Sorry, just looked up:
"On newer file systems, such as NTFS, exFAT, UDFS, and FAT32, Windows stores the long file names on disk in Unicode, which means that the original long file name is always preserved. This is true even if a long file name contains extended characters, regardless of the code page that is active during a disk read or write operation.
Files using long file names can be copied between NTFS file system partitions and Windows FAT file system partitions without losing any file name information. This may not be true for the older MS-DOS FAT and some types of CDFS (CD-ROM) file systems, depending on the actual file name. In this case, the short file name is substituted if possible."
For some reason I thought that only NTFS supported Unicode.
So you are admitting that you are TRWTF?
(Also: my home machines are littered with files whose names contain French accents and similar nonsense. But they are on NTFS. Well, except for the old disks I still have from my Win9x days, which are formatted in FAT32.)
-
@LaoC said in Actually using unicode in the filepath:
@Helix said in Actually using unicode in the filepath:
I guess since they have '®' in the path they assume it is only installed on NTFS systems.
® is in Latin-1 and CP1252. Is that really a problem with FAT? Holy fuck.
No, see the notes about long filenames being in Unicode on FAT systems. But the short names (created automagically by the operating system, and don't fucking mess with them if you know what's good for you) are sharply limited: 8.3 format, no spaces, only a few punctuation characters allowed, not case-preserving (upper case only), no non-ASCII characters allowed.
For whatever reason, the two sorts of name are usually called "long" and "short", but in some unusual cases, the long name can be shorter than the short name.
Notably: the six-character long name a.html will become the seven-character short name A~1.HTM on Windows 9x/FAT32, and if you have a collection of names from a.htma up to a.htmz created in alphabetical order of the last character of the extension, the short name of a.htmz will be A~26.HTM
...
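The numeric-tail behaviour described above can be sketched roughly like this in Python. This is a simplified model, not the real Windows algorithm: the function name `short_name`, the `taken` set, and the exact allowed-character list are my assumptions, it does not handle names that start with a dot, and NT-family systems are known to behave differently once several names collide.

```python
import re

# Anything outside this set becomes "_" in this sketch (assumed character list).
_NOT_ALLOWED = re.compile(r"[^A-Z0-9!#$%&'()@^_{}~-]")


def short_name(long_name, taken):
    """Pick an 8.3 alias for long_name that is not already in `taken`."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no dot at all: the whole name is the stem
        stem, ext = long_name, ""
    clean_stem = _NOT_ALLOWED.sub("_", stem.upper().replace(" ", "").replace(".", ""))
    clean_ext = _NOT_ALLOWED.sub("_", ext.upper().replace(" ", ""))[:3]

    def join(s):
        return s + "." + clean_ext if clean_ext else s

    # A name that already fits 8.3 needs no numeric tail.
    candidate = join(clean_stem)
    if (candidate == long_name.upper() and 1 <= len(clean_stem) <= 8
            and candidate not in taken):
        return candidate

    # Otherwise append ~1, ~2, ... and shorten the stem so it still fits in 8.
    n = 1
    while True:
        tail = "~" + str(n)
        candidate = join(clean_stem[: 8 - len(tail)] + tail)
        if candidate not in taken:
            return candidate
        n += 1


# Create a.htma .. a.htmz in alphabetical order, as in the post.
taken = set()
aliases = {}
for letter in "abcdefghijklmnopqrstuvwxyz":
    long_name = "a.htm" + letter
    aliases[long_name] = short_name(long_name, taken)
    taken.add(aliases[long_name])
```

With this model, a.html on its own maps to A~1.HTM, and the last of the 26 colliding names ends up with the ~26 tail, so the "short" name is longer than the long one.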
-
I just tried experimenting a bit with short names on my machine. I made a file .a which has the short name A67ED~1.
-
@hungrier huh. maybe i could use short path shenanigans to solve the problem that i have two folders on my SAMBA share named documents and Documents and windows clients cannot seem to access Documents because they always end up inside documents
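A quick way to find this kind of clash before a Windows client trips over it is to group a directory listing by a case-insensitive key. The Python sketch below uses `str.casefold()` as a stand-in for whatever comparison the client really does, which is an assumption; the function name `case_collisions` and the sample listing are hypothetical.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that a case-insensitive client would treat as the same."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Only groups with more than one spelling are a problem.
    return [sorted(group) for group in groups.values() if len(group) > 1]


clashes = case_collisions(["documents", "Documents", "music"])
print(clashes)
```

On the real share you could feed it `os.listdir(path)` instead of the hard-coded list.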
-
@accalia What about the far more sensible solution of not having two folders that differ only in capitalization?
-
@hungrier said in Actually using unicode in the filepath:
@accalia What about the far more sensible solution of not having two folders that differ only in capitalization?
they were created on Linux and i'm too lazy to fix them.
-
@accalia said in Actually using unicode in the filepath:
@hungrier huh. maybe i could use short path shenanigans to solve the problem that i have two folders on my SAMBA share named documents and Documents and windows clients cannot seem to access Documents because they always end up inside documents
to fix even more .
-
@anotherusername said in Actually using unicode in the filepath:
@accalia said in Actually using unicode in the filepath:
@hungrier huh. maybe i could use short path shenanigans to solve the problem that i have two folders on my SAMBA share named documents and Documents and windows clients cannot seem to access Documents because they always end up inside documents
to fix even more .
because it's fun to do when it's for personal.
just not for personal pleasure as @Perverted_Vixen found out.
-
@accalia well, if dir /x shows short filenames for them, you could always try.
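If you would rather ask for the short name from a script than read it off dir /x, a rough Python sketch using the Win32 GetShortPathNameW call via ctypes might look like this. The wrapper name `short_path` is made up, it returns None anywhere that is not Windows, and on volumes where short-name generation is turned off the call may simply hand back the long path.

```python
import ctypes
import sys


def short_path(long_path):
    """Return the 8.3 form of an existing path, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the API only exists on Windows
    get_short = ctypes.windll.kernel32.GetShortPathNameW
    size = 260
    while True:
        buf = ctypes.create_unicode_buffer(size)
        needed = get_short(long_path, buf, size)
        if needed == 0:
            return None  # path missing, or the call failed
        if needed < size:
            return buf.value
        size = needed + 1  # buffer too small: retry with more room


result = short_path(".")
print(result)
```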
-
@accalia said in Actually using unicode in the filepath:
just not for personal pleasure as @Perverted_Vixen found out.
You mean you two engage in "bad software design" as a BDSM kink?
-
@anonymous234 said in Actually using unicode in the filepath:
@accalia said in Actually using unicode in the filepath:
just not for personal pleasure as @Perverted_Vixen found out.
You mean you two engage in "bad software design" as a BDSM kink?
It certainly satisfies the S&M part
-
@accalia Have you tried running a "check disk for errors"? Sometimes it fixes those invalid name problems.
-
@anonymous234 said in Actually using unicode in the filepath:
@accalia Have you tried running a "check disk for errors"? Sometimes it fixes those invalid name problems.
you mean i should run fdisk on my linux based NAS server that is exporting the directories under a SAMBA share?
it comes up clean because that's valid for EXT4 file systems.
-
@accalia I think the short name thing doesn't exist on NTFS, but I'm not sure
-
@wharrgarbl It exists, hence this article about turning it off:
…and the onebox of the URL displays… the URL
-
@wharrgarbl based on the quick research that I did, it looked like Samba actually has to create short filenames for backward compatibility purposes.
-
@accalia said in Actually using unicode in the filepath:
@hungrier said in Actually using unicode in the filepath:
@accalia What about the far more sensible solution of not having two folders that differ only in capitalization?
they were created on Linux and i'm too lazy to fix them.
On Linux:
mv documents docs
done. :P
-
CP437 should be enough characters for any purpose.
Look, here's a dead grasshopper emoji: ²
-
@ben_lubar said in Actually using unicode in the filepath:
Look, here's a dead grasshopper emoji: ²
He is probably serious and this is used in Dwarf Fortress
-
@wharrgarbl said in Actually using unicode in the filepath:
@ben_lubar said in Actually using unicode in the filepath:
Look, here's a dead grasshopper emoji: ²
He is probably serious and this is used in Dwarf Fortress
Or possibly ZZT.
-
@Arantor said in Actually using unicode in the filepath:
Or possibly ZZT.
Nah, in ZZT things disappear when you kill them.
-
@Scarlet_Manuka said in Actually using unicode in the filepath:
@Arantor said in Actually using unicode in the filepath:
Or possibly ZZT.
Nah, in ZZT things disappear when you kill them.
What if it's a dead grasshopper that's left there for the purposes of being part of a puzzle?
-
@Arantor ZZT puzzles aren't that sophisticated.
-
@Scarlet_Manuka said in Actually using unicode in the filepath:
@Arantor ZZT puzzles aren't that sophisticated.
In the base game? Maybe not. In a user made level? Why not?
-
@hungrier said in Actually using unicode in the filepath:
I just tried experimenting a bit with short names on my machine. I made a file .a which has the short name A67ED~1.
Yeah, that's the other unusual case that I know of, a filename that (in effect) has no name, only an extension, in the long name, becomes a weird name-only file in the short case(1). (Yeah, I know, on UNIX it means something else.) But I didn't have an NTFS / FAT32 system to hand to see what it would do.
(1) YMMV if you create for example .a.b, of course.
-
@Steve_The_Cynic said in Actually using unicode in the filepath:
YMMV if you create for example .a.b, of course.
Live life on the edge! ;)
-
I made a file named A67ED~1 and then I made a file named .a. The short name for the latter ended up as A67ED~2, and the former has no short name listed.