WTF Bites


  • Notification Spam Recipient

    @Arantor said in WTF Bites:

    The argument came down to 'well if you're moving binary data and making assumptions about files with or without an extension, you're doing it wrong'.

    🔧



  • @Tsaukpaetra yes but that’s not what the FileZilla guy meant, and he was very much doing it wrong.


  • Discourse touched me in a no-no place

    @Arantor said in WTF Bites:

    @topspin case folding and Unicode normalisation are wildly different topics.

    Interestingly this is a behaviour that changed in Apple land: HFS+ would normalise (to form D if memory serves; not that it explicitly matters, the key point is that it normalises), while APFS doesn’t.

    Macs want you to use NFD whereas both Windows and Linux really prefer NFC. None of them totally enforce it in the filesystem layer because the normalization algorithm is large and grubby; it's better to keep that size of thing in user space.

    That's right. It is the job of the file transfer client to fix things. That an FTP client doesn't... is not very surprising. (Text vs binary mode for FTP mattered a lot when dealing with transfers to and from IBM mainframes. Not many new FTP clients worth a damn have been written since then.)
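
    A transfer client that wanted to fix things could, in principle, just renormalize names to whatever the destination prefers. A rough Python sketch (the target detection and the form choice are my assumptions, not anything FileZilla actually does):

    import unicodedata

    # Assumed mapping: macOS historically wants decomposed names,
    # everything else generally prefers composed ones.
    PREFERRED_FORM = {"macos": "NFD", "windows": "NFC", "linux": "NFC"}

    def name_for_target(filename, target_os):
        """Renormalize a filename to the form the destination platform prefers."""
        return unicodedata.normalize(PREFERRED_FORM[target_os], filename)

    # "naïve" coming off a Mac arrives decomposed; a client could compose it
    # before creating the file on a Linux or Windows server.
    decomposed = "nai\u0308ve"                   # 'i' + U+0308 COMBINING DIAERESIS
    print(name_for_target(decomposed, "linux"))  # looks the same, but is NFC now
    print(len(decomposed), len(name_for_target(decomposed, "linux")))  # 6 vs 5 code points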



  • @dkf You're telling me that FileZilla is shit after all? Colour me surprised...


  • Discourse touched me in a no-no place

    @Arantor said in WTF Bites:

    @dkf You're telling me that FileZilla is shit after all? Colour me surprised...

    Yes. FTP itself is also shit. A match made in ~~Hell~~ Dagenham.



  • @topspin said in WTF Bites:

    I used FileZilla to scp some files from my Mac to a remote machine with NFS storage. Instead of overwriting / merging an existing folder, it managed to create two folders with identical names. Because apparently Unicode normalization isn't a thing. Or something... Who knows.

    Mac normalizes Unicode the opposite way from how basically everybody else usually encodes it.

    1. Most software, UI libraries and locale implementations encode characters in the composed form, because it's easier (the key combination only produces one character, not two) and backward compatible (it used to be one character in the legacy encodings too).
    2. Linux treats filenames as strings of bytes, not in any particular encoding, so it just keeps what it got. Linux motto is “policy does not belong in the kernel” and encoding of filenames is a policy.
    3. So do most network protocols like http (including webdav), ftp and sftp. The protocols predate unicode, and consider the meaning of the filenames none of their business.
    4. While Windows does treat filenames as Unicode, it implements neither normalization nor normalization-insensitive comparison (as demonstrated on this site multiple times already). Given that it compares filenames case-insensitively, but that doesn't work on accented characters, I'm calling it a :wtf:
    5. But MacOS insists on normalizing everything to the decomposed form. If it just implemented normalization-insensitive comparison but stored whatever it got, I'd call that a sane approach, but I'm definitely calling what they did :trwtf:.

    The result is that when you store a file named with accented characters on a Mac, the name almost always changes. And then you copy it back to Windows or Linux and of course it will be a different file (or directory).
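
    To make the "different file" part concrete, a minimal Python sketch (just illustrating the point, not any particular tool): the composed and decomposed spellings of the same name are simply different strings, and therefore different byte sequences on disk, until somebody normalizes them.

    import unicodedata

    name = "naïve"
    nfc = unicodedata.normalize("NFC", name)   # 'ï' as a single code point (U+00EF)
    nfd = unicodedata.normalize("NFD", name)   # 'i' followed by U+0308 COMBINING DIAERESIS

    print(nfc == nfd)                          # False: not equal as plain strings
    print(nfc.encode("utf-8"))                 # b'na\xc3\xafve'
    print(nfd.encode("utf-8"))                 # b'nai\xcc\x88ve'

    # Only after normalizing both sides do they compare equal.
    print(unicodedata.normalize("NFC", nfd) == nfc)   # True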



  • @topspin said in WTF Bites:

    Also, I’m no expert on this, but shouldn’t the file system driver take care of normalizing file names?

    No, it shouldn't. The reason is that you've got a lot of legacy apps that don't understand normalization and may be broken if what they get back is different from what they put in, so the filesystem driver should always store what it got. It would make sense for the filesystem driver to compare the names normalization-insensitively though.

    Unlike what I said about passwords ⬆, file names shouldn’t be treated as binary blobs. Ideally, the file system should be case-insensitive, case-preserving. Surprisingly, Windows/NTFS gets this right.

    Except it does not. Windows stores what it gets, but the case insensitivity was never properly updated for Unicode, so it does not work correctly for non-English names, and it is still normalization-sensitive, which is just plain wrong once you say the names are Unicode—because while case sensitivity is a matter of taste, differently normalized strings should always be considered equal in Unicode, since the user can't even distinguish them and usually can't type the other form at all.
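
    For the record, a comparison that handles both problems is short; a sketch of what "case-insensitive and normalization-insensitive" could look like in Python (a real filesystem would bake this into its lookup, which is the large and grubby table-driven machinery @dkf mentioned):

    import unicodedata

    def names_equivalent(a, b):
        """Compare two filenames ignoring normalization form and case."""
        def fold(s):
            return unicodedata.normalize("NFC", s).casefold()
        return fold(a) == fold(b)

    print(names_equivalent("naïve", "nai\u0308ve"))   # True: NFC vs NFD spelling
    print(names_equivalent("NAÏVE", "naïve"))         # True: case folding handles 'Ï'
    print(names_equivalent("STRASSE", "straße"))      # True: 'ß' case-folds to 'ss'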



  • @dkf said in WTF Bites:

    @Arantor said in WTF Bites:

    @dkf You're telling me that FileZilla is shit after all? Colour me surprised...

    Yes. FTP itself is also shit. A match made in ~~Hell~~ Dagenham.

    Note that no FTP is involved though, because SFTP has absolutely no relation to FTP (and always does a binary transfer—which includes filenames).

    IMNSHO the correct approach would be if everybody implemented normalization-insensitive comparison and otherwise stored what they've got, but every system did something else, and now it's up to the users to deal with the mess, because no matter what the software does, someone will complain.
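
    What "compare normalization-insensitively, store what you got" boils down to is a lookup keyed on the normalized name while the directory keeps the exact bytes it was given. A toy in-memory sketch (obviously not a real filesystem):

    import unicodedata

    class Directory:
        """Toy directory: preserves names as given, but treats
        differently-normalized spellings as the same entry."""

        def __init__(self):
            self._entries = {}   # NFC key -> name exactly as it was stored

        def create(self, name):
            key = unicodedata.normalize("NFC", name)
            # First writer wins; later spellings resolve to the existing entry.
            return self._entries.setdefault(key, name)

        def lookup(self, name):
            return self._entries.get(unicodedata.normalize("NFC", name))

    d = Directory()
    d.create("naïve")                  # composed form, stored verbatim
    print(d.lookup("nai\u0308ve"))     # decomposed lookup still finds it: naïve
    print(len(d._entries))             # 1 entry, not two "identical" folders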


  • BINNED

    @Bulb said in WTF Bites:

    @topspin said in WTF Bites:

    Also, I’m no expert on this, but shouldn’t the file system driver take care of normalizing file names?

    No, it shouldn't. The reason is that you've got a lot of legacy apps that don't understand normalization and may be broken if what they get back is different from what they put in, so the filesystem driver should always store what it got. It would make sense for the filesystem driver to compare the names normalization-insensitively though.

    Maybe so, but it definitely shouldn't allow identical file names that only differ in encoding.



  • @Bulb said in WTF Bites:

    Linux motto is “policy does not belong in the kernel” :kneeling_warthog:

    🔧


  • Considered Harmful

    @dkf said in WTF Bites:

    Macs want you to use NFD whereas both Windows and Linux really prefer NFC. None of them totally enforce it in the filesystem layer because the normalization algorithm is large and grubby; it's better to keep that size of thing in user space.

    Linux doesn't enforce anything at all, not even with the ntfs3g driver (no idea whether the original behaves the same, and :kneeling_warthog:):

    > perl -MEncode -mutf8 -MUnicode::Normalize \
    -E'$s=decode("utf8","naïve");
    open $f, ">", $_ or die "$!" for NFD($s), NFC($s)'
    > ls na*
    naïve  naïve
    > ls na*|xxd
    00000000: 6e61 69cc 8876 650a 6e61 c3af 7665 0a    nai..ve.na..ve.
    

  • Considered Harmful

    @topspin said in WTF Bites:

    @Bulb said in WTF Bites:

    @topspin said in WTF Bites:

    Also, I’m no expert on this, but shouldn’t the file system driver take care of normalizing file names?

    No, it shouldn't. The reason is that you've got a lot of legacy apps that don't understand normalization and may be broken if what they get back is different from what they put in, so the filesystem driver should always store what it got. It would make sense for the filesystem driver to compare the names normalization-insensitively though.

    Maybe so, but it definitely shouldn't allow identical file names that only differ in encoding.

    Much :fun: to be had with subtle incompatibilities depending on which Unicode version your FS was built for. E.g. the Cyrillic Extended-A block didn't exist before 2008 so older normalization algorithms wouldn't know what to do with its combining marks (user software may well have used the characters long before they were known to the OS!) and would just leave them alone, leading to file systems that would suddenly be in a "forbidden" state because you used a newer FS module. You'd basically have to scan your disk every time you upgraded your kernel and be prepared to manually decide which of the files to keep that are no longer allowed in the same directory together.
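
    The scan itself is at least easy to write; a hedged sketch of the "find names that now collide under the currently installed Unicode tables" pass (the mount point is a placeholder):

    import os
    import unicodedata
    from collections import defaultdict

    def find_normalization_collisions(root="/some/mount/point"):   # placeholder path
        """Report names in each directory that become identical once
        normalized with the Unicode tables this Python was built with."""
        for dirpath, dirnames, filenames in os.walk(root):
            groups = defaultdict(list)
            for name in dirnames + filenames:
                groups[unicodedata.normalize("NFC", name)].append(name)
            for normalized, originals in groups.items():
                if len(originals) > 1:
                    print(f"{dirpath}: {originals!r} collide as {normalized!r}")

    # Run after an upgrade to see which names you'd have to merge or rename
    # by hand (table version in use: unicodedata.unidata_version).
    find_normalization_collisions()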



  • @LaoC said in WTF Bites:

    @dkf said in WTF Bites:

    Macs want you to use NFD whereas both Windows and Linux really prefer NFC. None of them totally enforce it in the filesystem layer because the normalization algorithm is large and grubby; it's better to keep that size of thing in user space.

    Linux doesn't enforce anything at all,

    Of course not.

    @Bulb said in WTF Bites:

    Linux treats filenames as strings of bytes, not in any particular encoding, so it just keeps what it got. Linux motto is “policy does not belong in the kernel” and encoding of filenames is a policy.

    @LaoC said in WTF Bites:

    not even with the ntfs3g driver (no idea whether the original behaves the same, and :kneeling_warthog:):

    Windows is normalization-sensitive. It doesn't enforce anything either, though in its case it is a :wtf:, because, unlike Linux, Windows does declare its filenames to be Unicode strings.


    @topspin said in WTF Bites:

    @Bulb said in WTF Bites:

    @topspin said in WTF Bites:

    Also, I’m no expert on this, but shouldn’t the file system driver take care of normalizing file names?

    No, it shouldn't. The reason is that you've got a lot of legacy apps that don't understand normalization and may be broken if what they get back is different from what they put in, so the filesystem driver should always store what it got. It would make sense for the filesystem driver to compare the names normalization-insensitively though.

    Maybe so, but it definitely shouldn't allow identical file names that only differ in encoding.

    Maybe it shouldn't. But legacy systems are a fact of life, and if MacOS had just implemented normalization-insensitive comparison but kept the original form around, it wouldn't be causing you any problems with them. Instead it chose to force normalization, which does cause problems with legacy systems, and that makes MacOS :trwtf:.


  • Considered Harmful

    @Bulb said in WTF Bites:

    @LaoC said in WTF Bites:

    @dkf said in WTF Bites:

    Macs want you to use NFD whereas both Windows and Linux really prefer NFC. None of them totally enforce it in the filesystem layer because the normalization algorithm is large and grubby; it's better to keep that size of thing in user space.

    Linux doesn't enforce anything at all,

    Of course not.

    @Bulb said in WTF Bites:

    Linux treats filenames as strings of bytes, not in any particular encoding, so it just keeps what it got. Linux motto is “policy does not belong in the kernel” and encoding of filenames is a policy.

    Yeah, :hanzo:. I know the usual “native” FS don't, but they could decide to do it, so I was curious whether ntfs3g would.


  • Discourse touched me in a no-no place

    @Bulb said in WTF Bites:

    Note that no FTP is involved though, because SFTP has absolutely no relation to FTP (and always does a binary transfer—which includes filenames).

    I took him at his word. SFTP is an entirely different beast, with nothing in common with FTP except some letters in its name and a tendency to LARP as an FTP client.

    There was also FTPS. Which was FTP plus SSL, at least for the command channel; I can't remember if the data channel was also encrypted. Yes, it was a real shitshow. I remember some people even did a parallel extension to that... which was great until someone exhibited the same sort of tech working for HTTPS downloads with an unmodified webserver... and got better bandwidth too. FTP was terrible, and that is dumb because it is so simple... to the point where it can't do any multiplexing and so bears network synch costs for lots of things.

    SFTP multiplexes over SSH.



  • @dkf The original post said FileZilla with SCP. I then lamented other stupid things FileZilla does.


  • BINNED

    Some helpful directions:

    append [blah] at the top of the document



  • @dkf said in WTF Bites:

    There was also FTPS. Which was FTP plus SSL, at least for the command channel; I can't remember if the data channel was also encrypted. Yes, it was a real shitshow.

    A colleague implemented uploading of releases over that to some customer file-sharing abomination of a server. He ended up using lftp, because it was the only headless client that even supported that protocol (the point was doing the upload from the release build on the build server), and he had to add retries and split the archive into multiple volumes, because a single archive weighing a couple of gigabytes was virtually guaranteed to fail to upload.

    He implemented it some time around 2019 and I believe it's still in use. So it isn't really old tech, just an old protocol being used for all the wrong reasons.
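
    For the record, the data-channel encryption @dkf couldn't remember is a separate negotiation in FTPS: AUTH TLS only covers the command channel, and the client has to ask for PROT P to get the data connections encrypted too. A minimal sketch with Python's standard ftplib (host, credentials and file name are placeholders, not our actual setup):

    from ftplib import FTP_TLS

    ftps = FTP_TLS("ftp.example.com")   # placeholder host
    ftps.login("user", "password")      # AUTH TLS protects the command channel
    ftps.prot_p()                       # ...but data connections stay plaintext until PROT P

    with open("release.zip", "rb") as f:
        ftps.storbinary("STOR release.zip", f)   # now the upload is encrypted too

    ftps.retrlines("LIST")   # and the "listing" is still whatever ls prints on the server
    ftps.quit()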

    I remember some people even did a parallel extension to that... which was great until someone exhibited the same sort of tech working for HTTPS downloads with an unmodified webserver... and got better bandwidth too.

    And yet a lot of people still use ftp instead of webdav.

    FTP was terrible, and that is dumb because it is so simple

    FTP is way too complicated for what it does:

    • For some reason that I don't believe was any more sensible back then than it is now, the command channel does not work over a plain TCP connection; it works over a telnet connection. Of course some obscure ftp servers insist on sending random telnet commands at stupid times, which throws some ftp clients off. Or the other way around.
    • It is a protocol from a more fragmented age when EBCDIC still roamed the world and Unix was an office package, so it has this text mode that of course usually ends up doing the wrong thing these days.
    • It is also a protocol from a more naive age when IP addresses were plenty and masquerades were a pipe dream in the wicked minds of security officers so it had to get this passive mode bolted on, but even that is a lot of pain in the donkey for a typical modern deployment where everything is hidden behind a reverse proxy.
    • And it can't be used from the web because it ain't got no origin.

    Filed under: did anybody define and implement ftp-over-websocket yet?


  • Discourse touched me in a no-no place

    @Bulb The worst problem with FTP (apart from how it uses sockets) is that directory listings just call ls (or DIR on a Windows server). Delimiters? Pah!



  • @dkf … which means the touted benefit of FTP—that it includes directory listing in the specification, while HTTP does not—does not actually exist, because the Apache de facto standard for generating directory listings in html or json is actually easier to parse than the de jure standard in FTP…
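
    To be fair, FTP did eventually grow a machine-readable listing (MLSD, RFC 3659); it's just that you can't count on servers supporting it, so clients still end up scraping ls output. A quick sketch with Python's ftplib (the host is a placeholder):

    from ftplib import FTP

    ftp = FTP("ftp.example.com")   # placeholder host
    ftp.login()                    # anonymous

    ftp.retrlines("LIST")          # whatever the server's ls/DIR prints; format unspecified

    # MLSD returns structured (name, facts) pairs -- if the server supports it at all.
    for name, facts in ftp.mlsd():
        print(name, facts.get("type"), facts.get("size"))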



  • @dkf … which reminds me, :wtf: do people keep coming up with their own ❄ http extensions for file access like

    that is used in

    … in addition to, yes, FTP

    :wtf:

    And all that while even Windows Explorer does support WebDAV and uses it for accessing sharepoint and onedrive.


  • Banned

    @Bulb this isn't really an HTTP extension as much as a protocol built on top of HTTP. The difference is that some hosted/sandboxed environments (most notably web browsers) don't allow arbitrary network connections and everything must go through the HTTP layer, which means you cannot use WebDAV unless the platform specifically made WebDAV accessible, which it usually doesn't. But if your protocol is entirely within bounds of HTTP, then it works just fine.

    Of course, none of this has ever been considered by the author, and the actual reason they made it is a combination of NIH, CADT, and "when all you have is Express.JS..."


  • Discourse touched me in a no-no place

    @Gustav said in WTF Bites:

    everything must go through the HTTP layer, which means you cannot use WebDAV unless the platform specifically made WebDAV accessible, which it usually doesn't

    WebDAV is an HTTP extension. It's a bunch of extra verbs and some document types.


  • Banned

    @dkf And the browser API doesn't let you use those extra verbs. But it does let you use POST to various endpoints for various purposes.


  • Discourse touched me in a no-no place

    @Gustav said in WTF Bites:

    And the browser API doesn't let you use those extra verbs.

    As far as I can see from reading the spec, it does support it (this describes CHICKEN as a valid method verb) but the server probably needs to whitelist them.



  • Why are you expecting Node hipster devs to do anything as weird as following a published spec when they can reinvent it badly as they see fit?


  • Discourse touched me in a no-no place

    @Arantor See also 200 ERROR and 500 OK.



  • @dkf I’ve also seen 400 OK and 502 OK in the wild, not to mention seeing folks send 401 when they mean 429…


  • Considered Harmful

    @Arantor said in WTF Bites:

    @dkf I’ve also seen 400 OK and 502 OK in the wild, not to mention seeing folks send 401 when they mean 429…

    420 WebDAVe's not here, man!



  • @LaoC said in WTF Bites:

    420 WebDAVe's not here, man!



  • @Bulb needs more John Spartan.


  • Discourse touched me in a no-no place

    @Arantor said in WTF Bites:

    @Bulb needs more John Spartan.

    403 Seashells


  • Considered Harmful

    Status: Disappointed. Not sure what I expected. Serves me right for expecting anything at all, I guess.

    So I tried out this newfangled Azure Data Studio thing, which apparently has very little to do with actual Azure, except marketing insisted. I wanted something with MSSQL and Postgres support. Seemed alright. Until I actually selected the default 1000 rows from the request diagnostic table, which took 25 seconds to run. Meanwhile, in SSMS, 2000 rows from the same table take 3 seconds and scroll without white patches that are sluggishly filled in afterwards.
    On the other hand, on cold start SSMS takes 15 seconds to show the splash screen and... hold on, this thing takes 15 seconds to connect to the server for some arcane reason. What the fuck is it doing?

    WHY DOES ELECTROON SUCK SO HARD? :angry:


  • BINNED

    @Applied-Mediocrity Spectate Swamp Media Search? :sideways_owl:


  • Considered Harmful

    @topspin Sure, why not. Can't expect you FOSS-tards (:trollface:) to know Microsoft terminology. It isn't cool anymore, anyways. It's all Azhoor now.


  • Discourse touched me in a no-no place

    @Applied-Mediocrity said in WTF Bites:

    What the fuck is it doing?

    Did the DB really exist before that or did a VM have to be launched to contain it? Was the server address in DNS? Was the database driver downloaded (and inspected by Windows Defender) before it was used? Did SSMS have to ask an AI for permission to let you connect to the DB?


  • Considered Harmful

    @dkf All our databases are right where we left them :mlp_smug:



  • @Applied-Mediocrity said in WTF Bites:

    So I tried out this newfangled Azure Data Studio thing, which apparently has very little to do with actual Azure, except marketing insisted.

    Only 1 of 5 pikachus is surprised.

    WHY DOES ELECTROON SUCK SO HARD? :angry:

    It's very easy to fall into some performance trap when writing shit in html+js. It is usually possible to avoid them, but that requires thought and understanding, the very things that Microsoft hoped Electron would allow it to dispense with. So of course they just :shipit:.



  • @Applied-Mediocrity said in WTF Bites:

    which apparently has very little to do with actual Azure, except marketing insisted

    Side note: I occasionally have to remind people to distinguish between Azure (the cloud) and Azure DevOps (formerly known as Visual Studio Team Services), because while the latter does run on Azure the cloud, it is a separate technology and knowing one does not imply knowing the other.



  • @Bulb said in WTF Bites:

    It's very easy to fall into some performance trap when writing shit in html+js

    You mean like getting 1000 rows from one table, then iterating over them in order to query the database for a row from another table for each single one of them?
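
    The classic N+1 query pattern, for anyone who hasn't had the pleasure. A sketch with made-up table names (requests/diagnostics are stand-ins, not actual Azure Data Studio internals):

    import sqlite3

    conn = sqlite3.connect(":memory:")   # stand-in database with made-up tables
    conn.executescript("""
        CREATE TABLE requests (id INTEGER PRIMARY KEY, url TEXT);
        CREATE TABLE diagnostics (request_id INTEGER, detail TEXT);
    """)

    # The :wtf: way: one query for the rows, then one more query per row.
    rows = conn.execute("SELECT id, url FROM requests LIMIT 1000").fetchall()
    for request_id, url in rows:
        conn.execute(
            "SELECT detail FROM diagnostics WHERE request_id = ?", (request_id,)
        ).fetchone()   # up to 1000 extra round trips

    # The boring way: one query, one round trip.
    conn.execute("""
        SELECT r.id, r.url, d.detail
        FROM requests AS r
        LEFT JOIN diagnostics AS d ON d.request_id = r.id
        LIMIT 1000
    """).fetchall()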


  • Considered Harmful

    ssms.splash.png

    (yes, there is ssms.splash.png which I've now edited for my own amusement)



  • @BernieTheBernie You can fall into that trap in any framework. I mean traps like dynamically rebuilding the widget tree, i.e. the DOM, on various whims and such, because instead of one standard table view like classic GUI frameworks have, there are dozens of them for html and 97.8% of them are broken.


  • Discourse touched me in a no-no place

    @Bulb said in WTF Bites:

    one standard table view




  • @Bulb I feel like you are rounding downwards.


  • Grade A Premium Asshole

    A couple of weeks ago my wife said that she thought we should get a new mattress. Fine, whatever, go ahead and order one.

    This leads into her asking me lots of questions to which my only real reply was: "I don't really care, but whatever you order should be a firm or semi-firm model."

    One of the other questions that I really should have more adequately clarified was whether or not I thought that we needed to also order box springs to go with the mattress. I told her that no we did not need them because of the design of our bed frame. Which is true. But what she heard was, "No, because of our bed frame we don't really need them, but it might be more comfortable if we had them." Because spousal hearing. (This goes for both sides and depends on the discussion at the time.)

    So she orders the box springs. I should have clarified that not only do we not need them because of the design of our bed frame, but we also cannot use them, because stack height. To compound this issue she ordered what appears to be the thickest mattress currently available. It looks like something a stunt man might use to break his fall when jumping from a 12 story building.

    So they deliver all of this and take away our old mattress and I go into the bedroom to check out the new mattress and no bullshit the top of the mattress is over 4' from the floor. It looks ridiculous.

    Okay, so we need to return the $300 "box springs", which are not box springs, they are just fabric covered plywood boxes. No springs.

    This is where things took a sharp turn to :wtf: territory. We cannot return the box springs. Or, more correctly, we cannot return just the box springs. Because of the way their ordering system works, or maybe how they enter products into it, the mattress and box springs are essentially one item once the order is placed. So they are going to have to return the mattress and box springs and deliver a new mattress on its own. Since each item is tracked individually with its own barcode, but still becomes one item, or whatever, they cannot just trick the system into allowing the delivery people to have an empty order with special instructions to just pickup the box springs. I even offered to bring the box springs to them for a refund, but they cannot do that either.

    They are literally going to come and get a brand new mattress and box springs and replace them with another brand new mattress that is just days old and then return those two items to the warehouse where I presume they will be sold as used or something. Which is a colossal waste of money and resources on their part.

    Oh, and of course I have to buy the new mattress and then wait on the refund from the previous purchase. Now that I type that I am wondering if they do things like this so people just won't return things?


  • Notification Spam Recipient

    @Polygeekery said in WTF Bites:

    So they deliver all of this and take away our old mattress and I go into the bedroom to check out the new mattress and no bullshit the top of the mattress is over 4' from the floor. It looks ridiculous.

    Aww man, you really should have taken a picture!



  • @Polygeekery but in the meantime is it comfortable?


  • Notification Spam Recipient

    @Polygeekery said in WTF Bites:

    $300 "box springs"

    Wait, WHAT?

    They're sub-$200 on Amazon!

    Granted that's still exorbitantly expensive for a few pieces of wood stapled together, but holy shit!


  • Grade A Premium Asshole

    @Arantor said in WTF Bites:

    @Polygeekery but in the meantime is it comfortable?

    Extremely.


  • Grade A Premium Asshole

    @Tsaukpaetra said in WTF Bites:

    exorbitantly expensive for a few pieces of wood stapled together

    Apparently you haven't seen the prices of lumber lately. Around here the prices are 3-4X what they were before the toilet paper apocalypse. Or that is roughly the increase in price for dimensional construction lumber. Hardwoods and exotics are only about 2X. I used to think that paying $8/bdft for black walnut was expensive. I just looked at my hardwood dealer's website and it is currently going for $15.95/bdft.

