PowerShell Out-File WTF
-
I needed to modify some files with PowerShell, so I figured it'd be pretty simple:
https://technet.microsoft.com/en-us/library/Hh849882.aspx
Great, there's a cmdlet that handles exactly what I need.
Then I received errors when passing the resulting file to the application that was trying to read it. Errors about null characters.
WTF 1: It wrote the file in UCS-2. Isn't UTF-8 standard for everything these days?
So I can just change it to UTF-8, right? Well...
WTF 2: You can't, with native PowerShell anyway. Only with clunky .NET calling syntax.
If you assume that, since this is dated 2011, they must have fixed it by now, guess again.
WTF 3: https://connect.microsoft.com/PowerShell/feedbackdetail/view/1137121/add-nobom-flag-to-out-file - Dated 2015
WTFs 4-99: The reason I'm doing this in the first place. I cba to write it up, though.
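For anyone wondering where the null characters in WTF 1 come from, here is a minimal sketch in Python (not PowerShell, it's only there to look at the bytes). It assumes the file came out as UTF-16 LE with a BOM, and "key=value" is a made-up sample line:

```python
import codecs

text = "key=value"  # hypothetical sample line, not from the real file

# Assumption: the file was written as UTF-16 LE with a byte order mark.
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
utf8_bytes = text.encode("utf-8")

# Every ASCII character takes two bytes in UTF-16; the second one is 0x00.
print(utf16_bytes)
print(utf8_bytes)
print(utf16_bytes.count(b"\x00"), "null bytes in the UTF-16 version")
```

A reader that expects one byte per ASCII character sees a 0x00 after every letter, which would explain the complaints.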
-
Microsoft has an inexplicable UTF-16/UCS-2 boner. Mostly because they did the Unicode conversion for Windows early enough that UTF-8 wasn't the clear winner yet, and now they have this whole OS built around the idea of 2 byte characters.
-
And they can't change it without breaking hundreds of programs relied on by thousands of enterprise customers.
-
Well, even if the whole OS is built around it, you could get with the times when creating a brand new product like PowerShell...
Thankfully it seems that the UTF-8 BOM doesn't cause issues this time. Had some trouble with it in the past.
-
That, and the ALL-CAPS invented types meant to save on keystrokes, like TCHAR, LPCTSTR, ... Ugly design that helps no one.
-
How's that different from any other shell language? You get basic features out of the box, and when you want to do something more clever, you call out to the externals - executables in GNU/Linux, .NET libraries in PS.
Sure, PowerShell packs a little more heat than bash, but you're still kinda expected to call out to .NET when the shell isn't enough. And the syntax is really not rocket surgery, if a little uglier than just calling an executable.
Plus I'm pretty sure PowerShell can be extended with custom cmdlets, though AFAIR that's much more of a hassle than necessary.
-
you could get with the times when creating a brand new product like PowerShell
Which is built on .NET, which uses UTF-16/UCS-2 internally. And that is built on top of Win32, which uses UTF-16/UCS-2 internally.
Basically, however you look at it, MS inadvertently screwed themselves over.
-
Isn't it still valid UTF-8 with the BOM though?
-
Yes, but quite a few programs don't handle it very well.
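A quick way to see the problem, sketched in Python with a made-up payload: a plain UTF-8 decoder accepts the BOM but hands it back as an invisible U+FEFF at the start of the text, and only a BOM-aware decoder (Python calls it utf-8-sig) strips it.

```python
import codecs

payload = "hello"  # hypothetical file content
with_bom = codecs.BOM_UTF8 + payload.encode("utf-8")

# Plain UTF-8 decoding succeeds, but the BOM stays as a leading U+FEFF.
plain = with_bom.decode("utf-8")

# The BOM-aware codec drops it.
stripped = with_bom.decode("utf-8-sig")

print(repr(plain))
print(repr(stripped))
print(plain == payload, stripped == payload)
```

Anything that compares the first line against an expected string, or parses the file from byte 0, trips over that extra character unless it strips it on purpose.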
-
How's that different from any other shell language?
I'm pretty sure that most shell languages capable of writing strings to files support UTF-8 without byte order marks.
I mean, even BAT supports it, if you allow the use of echo.
-
Which is built on .NET, which uses UTF-16/UCS-2 internally. And that is built on top of Win32, which uses UTF-16/UCS-2 internally.
They're hardly unique in this. The JVM does the same thing internally. And I don't trust the Unicode people to manage to keep characters in the range up to U+10FFFF either. They've broken that sort of promise before…
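To put the "2 byte characters" idea and the U+10FFFF ceiling side by side, here is a small Python sketch. The two sample characters are arbitrary picks; the point is only how many 16-bit units each one needs in UTF-16.

```python
import struct

bmp = "\u20ac"         # EURO SIGN, inside the Basic Multilingual Plane
astral = "\U0001F600"  # an emoji, outside the BMP

# Number of 16-bit code units each character needs in UTF-16.
bmp_units = len(bmp.encode("utf-16-le")) // 2
astral_units = len(astral.encode("utf-16-le")) // 2

# The astral character is stored as a surrogate pair.
high, low = struct.unpack("<2H", astral.encode("utf-16-le"))

# U+10FFFF is the last code point a surrogate pair can express.
last_units = len(chr(0x10FFFF).encode("utf-16-le")) // 2

print(bmp_units, astral_units, last_units)
print(hex(high), hex(low))
```

So "one character is two bytes" only holds inside the BMP, and anything past U+10FFFF would not fit the surrogate scheme at all.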