Low level programming using system() calls



  •  WTF http://forums.somethingawful.com/showthread.php?threadid=2912907

    I'm writing a program for the lab I work at, and it works fine on PC, but our lab uses Apple computers. When I try to run it on any of the Apples in the lab, it doesn't open the input file. I'm at a loss as to what to do. I'm pretty much a beginner when it comes to programming, so any help would be greatly appreciated. The segment of code causing the problem is as follows:

    output.open(outname,ios::out);
    input.open(inname,ios::in); //Causes the input file pointer to be directed to the inname file, and opens it as an input stream.
    #ifdef UNIX
    // Mac Only */////////////////////////
    rename(inname,"Nexus.txt"); //Renames input file to Nexus.txt
    input.close();
    system("cat Nexus.txt | tr '\r' '\n' | tr -s '\n' > unixInputFile.txt"); //Converts input file to unix format
    rename("unixInputFile.txt",inname); //Renames the file back to innname
    input.open(inname,ios::in); //Reopens file
    #endif
    //**********************************///

    while (!input.eof())
    {
    getline(input,temp);
    .
    .
    .
    }


  • I don't get it - the code is atrocious, but why shouldn't the clumsy workaround for Mac do exactly what it is supposed to do?

    (I'm not a C programmer.)



  • @Arancaytar said:

    the code is atrocious, but...

    Methinks, *that's* the WTF

    On the other hand, TRWTF is that Apple decided to use different text file line endings than Unix. Not to mention that PC line endings are different, too. Three different line endings for no good reason. Uck!



  • So where's the WTF? The guy openly admits that he's a beginner, and acknowledges that his code is sloppy.



  • @TheRider said:

    TRWTF is that Apple decided to use different text file line endings than Unix.

    OS X, which the OP is talking about, is Unix-like and uses Unix line endings. Still, the solution is obviously a WTF; the proper one, I suppose, would be to either use a CRLF-delimited input on Windows and LF-delimited on Unix in the first place, or use LF-delimited files on both systems and open the file in binary mode.



  • @Spectre said:

    Still, the solution is obviously a WTF; the proper one, I suppose, would be to either use a CRLF-delimited input on Windows and LF-delimited on Unix in the first place, or use LF-delimited files on both systems and open the file in binary mode.

    Or, you can open the file in binary mode and strip carriage returns while reading it. If you do that, files created on either system will still read correctly on either system as well.
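
    A minimal sketch of that idea in C++, for illustration only. The function name readLinesStrippingCR is made up here and is not from the original program. It assumes the stream was opened with std::ios::binary and that lines end in LF or CRLF; a CR-only file would still come back as one long line.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every line from `in` and drops a trailing '\r',
// so CRLF-delimited and LF-delimited input produce identical strings.
// Open the file with std::ios::binary so the runtime does no translation.
std::vector<std::string> readLinesStrippingCR(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    // Test the read itself rather than looping on !in.eof().
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // strip the CR left over from a CRLF pair
        }
        lines.push_back(line);
    }
    return lines;
}
```

    Usage would look something like `std::ifstream input(inname, std::ios::in | std::ios::binary);` followed by `readLinesStrippingCR(input)`, with no temporary file and no call to system().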



  • @Spectre said:

    @TheRider said:
    TRWTF is that Apple decided to use different text file line endings than Unix.

    OS X, which the OP is talking about, is Unix-like and uses Unix line endings. Still, the solution is obviously a WTF; the proper one, I suppose, would be to either use a CRLF-delimited input on Windows and LF-delimited on Unix in the first place, or use LF-delimited files on both systems and open the file in binary mode.

    I am working mostly with CRLF (Windows) based text files. I am also quite accustomed to CR-delimited text files from the Unix world. But just one or two months ago I stumbled over files that were LF-delimited, supposedly produced by a Mac. I can't tell whether it was OSX or MacOS or whatever generation of Apple operating systems. This is why I am talking about three types of text files.



  • @TheRider said:

    I am working mostly with CRLF (Windows) based text files. I am also quite accustomed to CR-delimited text files from the Unix world. But just one or two months ago I stumbled over files that were LF-delimited, supposedly produced by a Mac. I can't tell whether it was OSX or MacOS or whatever generation of Apple operating systems. This is why I am talking about three types of text files.

    Unix is LF, Mac OS is CR. 
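
    If it is unclear which convention a given file uses, counting the terminators settles it. The sketch below is only an illustration; LineEndingCounts and countLineEndings are names invented for this example, and it assumes the whole file has already been read into a string in binary mode.

```cpp
#include <cstddef>
#include <string>

// Hypothetical tally of the three conventions discussed in this thread.
struct LineEndingCounts {
    std::size_t crlf = 0;  // "\r\n", DOS/Windows
    std::size_t lf = 0;    // "\n" alone, Unix and Mac OS X
    std::size_t cr = 0;    // "\r" alone, classic Mac OS
};

// Counts each kind of terminator; a CR immediately followed by LF is
// counted once as CRLF, not as one CR plus one LF.
LineEndingCounts countLineEndings(const std::string& data) {
    LineEndingCounts counts;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\r') {
            if (i + 1 < data.size() && data[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the LF half of the pair
            } else {
                ++counts.cr;
            }
        } else if (data[i] == '\n') {
            ++counts.lf;
        }
    }
    return counts;
}
```

    A file that reports only `cr` hits most likely came from a classic Mac OS program or an API that still writes CR; a mix of kinds usually means the file was edited on more than one system.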



  • @morbiuswilters said:

    Unix is LF, Mac OS is CR. 

    True, and yet not true. The "Classic" Mac OS (i.e. Mac OS 9 and earlier) used CR. Mac OS X has a bunch of different APIs at different levels, and they behave in different ways. All the command-line and/or POSIX stuff has used LF all along, and Cocoa (i.e. the Nextstep-descended APIs, which Apple wants developers to use but hasn't really pushed hard enough) uses LF for text unless you tell it otherwise. The other APIs generally have behavior defined by how backwards-compatible the calls have to be and how high-level they are. (I think that the really high level stuff, where it is presumed that the programmer just doesn't care but wants to read text, will read either CR or LF but write LF by default. I may be mistaken.)

    Just to confuse things even more, some of the APIs don't default to Unicode, either. (Most of them do, and most of them also allow you to specify an encoding.) Programs which are either really old or written without consideration in non-Unicode APIs may end up using the old Mac OS Roman character encoding, which is the same as ASCII in low-order bytes but almost completely unique in high-order mappings. (It coincides with Windows Latin in exactly 5 high-order mappings, all of them uncommon, which is why text written on Windows and viewed on old/new-but-poorly-written Mac programs -- or vice versa -- will display curly quotes and so on incorrectly.)

    This, ladies and gentlemen, has been an example of "stuff I know despite never wanting to learn it". (Is there a word for that?) Thank you.



  • @The Vicar said:

    (I think that the really high level stuff, where it is presumed that the programmer just doesn't care but wants to read text, will read either CR or LF but write LF by default. I may be mistaken.)

    Last I checked, the really high level stuff is a grab-bag. Some of them write LF, but most write CR.

    The vast majority of the software will recognize both CR and LF as a line end, but there is a certain amount of the unix command-line that doesn't recognize CR endings.

    Disclaimer: I last checked around 10.4 or 10.3.
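
    Accepting every convention on input is not much code. The following is a sketch under the assumption that the stream is opened in binary mode; getlineAnyEnding is a hypothetical name, not a standard library function.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical replacement for std::getline that treats LF, CR, and CRLF
// all as a single line terminator. Returns false once nothing is left.
bool getlineAnyEnding(std::istream& in, std::string& line) {
    line.clear();
    char ch;
    bool sawAnything = false;
    while (in.get(ch)) {
        sawAnything = true;
        if (ch == '\n') {
            return true;               // Unix-style ending
        }
        if (ch == '\r') {
            if (in.peek() == '\n') {
                in.get(ch);            // swallow the LF of a CRLF pair
            }
            return true;               // CR alone or CRLF
        }
        line.push_back(ch);
    }
    // End of input: report a final line only if it had any content.
    return sawAnything;
}
```

    It drops in where the original loop calls getline: `while (getlineAnyEnding(input, temp)) { ... }`.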



  • The real way to handle that in C, last I checked, was to open the file in binary and parse the line endings oneself. Now, that was eons ago, and I would seriously hope that someone's come up with a standard library call to handle a change in line endings by now.

    That having been said,

    system("tr -s '\015' '\012' < Nexus.txt > unixInputFile.txt"); //Converts input file to unix format

    would save a couple fork/execve pairs. For all of the unix systems I have been on that have a working strace, that's over 200 system calls, if the system's using shared libraries. Now, I know MacOS optimizes that a bit, so it may just be 60-80 calls, but it's still fairly intense. Since cat and tr use about the same number of system calls to work on those systems I've been able to strace them on, I'd guess this would save about 2/3 of the time required for the whole operation.

    Also, depending on how old those Macs are, and how old their software is... IIRC, MacOS prior to 10.x did a particularly 'clever' stupidism of flipping the meaning of '\r' and '\n', as an attempt to make porting C applications easier. I know they fixed that eventually, but I don't know if they fixed it in 10.0.0, or later.

    Disclaimer: I know *somebody* did that stupidism, but I'm not 100% certain it was Apple. I can't think of any other candidates, however.
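
    For comparison, the same translation can be done with no child process at all. This is a rough sketch of what `tr -s '\015' '\012'` does, written as a hypothetical function translateAndSqueeze that works on a string already held in memory: each CR becomes LF, then runs of LF collapse to one.

```cpp
#include <string>

// Hypothetical in-process stand-in for: tr -s '\015' '\012'
// Maps every CR to LF, then squeezes consecutive LFs into a single LF.
// Like the tr command, this also removes blank lines from the input.
std::string translateAndSqueeze(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '\r') {
            c = '\n';                  // translate step
        }
        if (c == '\n' && !out.empty() && out.back() == '\n') {
            continue;                  // squeeze step: drop repeated LF
        }
        out.push_back(c);
    }
    return out;
}
```

    This trades the fork/execve cost for holding the file in memory, which is likely fine for lab-sized text files but is an assumption about the input.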



  • @tgape said:

    Also, depending on how old those Macs are, and how old their software is... IIRC, MacOS prior to 10.x did a particularly 'clever' stupidism of flipping the meaning of '\r' and '\n', as an attempt to make porting C applications easier. I know they fixed that eventually, but I don't know if they fixed it in 10.0.0, or later.

    Disclaimer: I know *somebody* did that stupidism, but I'm not 100% certain it was Apple. I can't think of any other candidates, however.

    Macs just used CR to mean "end of paragraph" and replaced LF with a rectangle, or ignored it, depending on the typeface being used.

    The Mac OS, until Mac OS X, had no internal text-based scheme for dealing with the OS, and thus no cursor control system at all. (If you never need to manipulate a cursor, there is no need for differentiation between CR and LF.) With Mac OS X, Apple completely abandoned the old Mac OS innards -- Mac OS X looks vaguely like the old Mac OS, but underneath, it's Nextstep's BSD Mach kernel. Really, the only reason 10.0 was the Mac OS is because Apple said it was. About 75% of what you knew about the "Classic" Mac OS is wrong on Mac OS X, and the remaining 25% is usually because Apple winched in a compatibility layer to keep existing Mac owners from rebelling.

    The irony of all this is that the most logical system of dealing with cursor control is (though it galls me to say this) the one in DOS and Windows. "Carriage Return" and "Line Feed" come from typewriters and teletypes: LF meant move down one line, CR meant move to the first position in the line without moving vertically. To start a new line would require both a CR and an LF. (The oldest typewriters had a lever you had to push to advance a line every time you pressed return. You might argue that since later typewriters changed the "return" key to do both at once, the Windows/DOS system is not valid. But in that case, the Mac had the right model by using just CR, and it's the Unix-y LF that is in the wrong.)

    Theoretically, this is all moot, because we're all supposed to be using Unicode for text these days, and Unicode has both a line separator and a paragraph separator character (U+2028 and U+2029) which ought to settle the question by making the old incompatible system obsolete. But almost nobody uses the Unicode versions, and many Unicode implementations can't interpret them properly anyway, so we all get to argue over CR and LF, apparently for all eternity.
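
    For the curious, here is roughly what honoring those two code points looks like when the text is UTF-8. In UTF-8 the line separator is the bytes E2 80 A8 and the paragraph separator is E2 80 A9. The function name splitOnUnicodeSeparators is invented for this sketch, and it deliberately ignores CR and LF.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical splitter: breaks UTF-8 text at U+2028 (line separator)
// and U+2029 (paragraph separator), leaving CR and LF untouched.
std::vector<std::string> splitOnUnicodeSeparators(const std::string& utf8) {
    static const std::string kLineSep = "\xE2\x80\xA8";  // U+2028 in UTF-8
    static const std::string kParaSep = "\xE2\x80\xA9";  // U+2029 in UTF-8
    std::vector<std::string> parts;
    std::string current;
    std::size_t i = 0;
    while (i < utf8.size()) {
        if (utf8.compare(i, 3, kLineSep) == 0 ||
            utf8.compare(i, 3, kParaSep) == 0) {
            parts.push_back(current);  // close the piece before the separator
            current.clear();
            i += 3;                    // both separators are three bytes long
        } else {
            current.push_back(utf8[i]);
            ++i;
        }
    }
    parts.push_back(current);          // final piece, possibly empty
    return parts;
}
```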

