PL/I and C



  • Ancient history again, but there's a lesson for all times.

     One team designed PL/I.  They wanted to define a character sequence for comments.  It had to be a character sequence that would not otherwise be used.  How about:  slash asterisk!   /* This is a comment in PL/I  */

    The other team designed the operating system.  They wanted to define a character sequence to mark end-of-file in a card deck.  It had to be a character sequence that would not otherwise be used.  How about:  slash asterisk!

    // DATA
    1
    2
    3
    /*    This card marks end of data.

    Oops.

    Slash-asterisk only marks end of data in columns 1 and 2.  OK, how about we say that EVERY PL/I statement must start in or after column 2?  Problem solved!

    Then we can use column 1 for carriage control!

      /* This will be on the first line.  */
      /* This will be on the next line.  */
    1/* This line will appear on the top of a new page. */
    +/*   This line will overwrite the previous line.  Really useful, huh? */

    Years pass.  Multics is written in PL/I.  Unix is inspired by Multics.  A language is designed for Unix.  Wow, it looks a little like  ... PL/I?

    Only now we don't have to worry about slash-asterisk meaning end-of-file.  We can use slash-asterisk for comments because it doesn't have any other meaning in the language or the operating system.

     

    int a = 3;
    int b = 9;
    int *pa = &a;
    int c = b /*pa
        /* divide 9 by 3 and add 1 */
        + 1;

    Hey, why did I get 10 and not 4?

     



  • And I guess you'd expect:

    i+ +;

    to auto-increment?

    Although, I'd have to admit, that'd be one head-slammer to find in your code...



  • Know the power of syntax highlighting!



  • Hence why white space is important in your code.

    If the following comment were not there you would get either 9 or a compiler error as there would be no terminating semicolon.

    This should more properly be written as:

    int c = b / *pa

    Note the space between the / and the *, this is no longer a comment.
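
A minimal sketch of the difference, for anyone who wants to try it. The function names `divide_as_posted` and `divide_with_space` are made up for illustration; the declarations are the ones from the original post. GCC reports the comment opener inside a comment under `-Wcomment` (enabled by `-Wall`), which is one way to catch this.

```c
/* The declaration exactly as posted. The slash-star before pa opens a
   comment that runs to the first star-slash, so only "b + 1" is left. */
int divide_as_posted(void)
{
    int a = 3;
    int b = 9;
    int *pa = &a;
    int c = b /*pa
        /* divide 9 by 3 and add 1 */
        + 1;
    (void)pa; /* pa is never read in this version */
    return c;
}

/* Same declarations with a space between the slash and the star,
   so the star is a pointer dereference. */
int divide_with_space(void)
{
    int a = 3;
    int b = 9;
    int *pa = &a;
    int c = b / *pa
        /* divide 9 by 3 and add 1 */
        + 1;
    return c;
}
```

Called from a `main`, the first should return 10 and the second 4.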



  • And here I forgot my own terminating semicolon.  It must be the holidays.



  • [quote user="KattMan"]

    Hence why white space is important in your code.

    If the following comment were not there you would get either 9 or a compiler error as there would be no terminating semicolon.

    This should more properly be written as:

    int c = b / *pa

    Note the space between the / and the *, this is no longer a comment.

    [/quote]

    I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.



  • @m0ffx said:

    I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.

    intmain(intargc,char**argv){return0;}//bin/true/



    Good luck finding a language that doesn't care about whitespace at all (Turing tarpits notwithstanding). I think most use whitespace to separate tokens in one way or another; it just happens that C has /* as a token.

    The point about C "not caring about whitespace" is that, as long as you have whitespace where it's compulsory, you can use as much of it as you want. Unlike, say, Python.



    Digressing slightly, I think you could also say int c = b//
    /*pa in the above program if you didn't want any whitespace. As long as you're in C89, anyway.



  • [quote user="m0ffx"]I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.
    [/quote]

    The C compiler doesn't.  That wack-o C pre-processor, on the other hand....



  • Let's not forget C++ and templates:

    std::map<int, std::stack<std::deque<std::string>>,  less<int>> myMap;

     Syntax error?  Where?

     



  • [quote user="Irrelevant"][quote user="m0ffx"]I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.[/quote]
    intmain(intargc,char**argv){return0;}//bin/true/



    Good luck finding a language that doesn't care about whitespace at all (turing tarpits notwithstanding). I think most use whitespace to separate tokens in one way or another, it just happens that C has /* as a token.[/quote]


    Perl.



  • @Carnildo said:

    [quote user="Irrelevant"][quote user="m0ffx"]I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.

    intmain(intargc,char**argv){return0;}//bin/true/



    Good luck finding a language that doesn't care about whitespace at all (turing tarpits notwithstanding). I think most use whitespace to separate tokens in one way or another, it just happens that C has /* as a token.[/quote]

    Perl.
    [/quote]

    usestrict;usewarnings;useconstantHEADS=>"heads\n";useconstantTAILS=>"tails\n";printSTDOUTHEADSandexitifintrand2;printSTDOUTTAILS;


  • [quote user="newfweiler"]

    Let's not forget C++ and templates:

    std::map<int, std::stack<std::deque<std::string>>,  less<int>> myMap;

     Syntax error?  Where?

    [/quote]

    This is a bug in your compiler. Admittedly, it's a bug in just about every compiler, and it's a bug which the C++ specification explicitly notes and excuses - but nonetheless, it's a bug. There is a known fix for the bug (it's actually possible to write a compiler that doesn't have this problem), but it's very hard to implement and slows down the compiler - and so, like the 'export' keyword for C++ templates, most compilers do not implement it.

    Unlike the C comment problem, this case is not ambiguous - there is precisely one valid way to parse any such statement. It's just that the major compilers can't see it, because they lex the language using a giant regular expression, instead of something more context-sensitive.

    It's not really uncommon for a C++ compiler to reject all kinds of valid code. Compilers are usually full of many subtle bugs, which don't appear until you start to make them really work hard. This bug's just unusual in that it's been around for so long and in so many compilers.



  • [quote user="AssimilatedByBorg"]

    [quote user="m0ffx"]I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.
    [/quote]

    The C compiler doesn't.  That wack-o C pre-processor, on the other hand....

    [/quote]

    Both the C compiler and preprocessor have the same behaviour. They do not care about whitespace between tokens. They care about whitespace inside a token. A token is any keyword, operator, { } ( ) ;, quoted character, number, quoted string, or comment (most of which cannot contain any whitespace at all). What you are seeing here is a quirk in the definition of what the tokens are - specifically, in case of ambiguity, the longest possible token is chosen at each point, beginning at the start of the file and working forwards. That is not necessarily the token which is implied by the context.
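
The longest-token rule can be seen without any comments involved. A small sketch (the function names are hypothetical): `a+++b` is split as `a ++ + b`, while moving in one space gives a different expression.

```c
/* a+++b lexes as a ++ + b: after "a", the longest token available is "++". */
int sum_without_space(int *a_after)
{
    int a = 1;
    int b = 2;
    int c = a+++b;      /* (a++) + b */
    *a_after = a;       /* a was incremented after its old value was used */
    return c;
}

/* a+ ++b lexes as a + ++b: the space ends the first "+" token early. */
int sum_with_space(int *b_after)
{
    int a = 1;
    int b = 2;
    int c = a+ ++b;     /* a + (++b) */
    *b_after = b;       /* b was incremented before it was added */
    return c;
}
```

With these values the first should return 3 and leave a at 2; the second should return 4 and leave b at 3. The `i+ +;` joke earlier in the thread fails for the same reason: the space makes it two separate `+` tokens.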

    The C preprocessor seems really weird precisely because it duplicates this same behaviour - it lexes the source into C tokens (but does not attempt to parse these tokens). While this makes a certain amount of sense on one level, it's extremely counter-intuitive behaviour, because you don't normally think about C source in those terms and keep expecting the preprocessor to operate on the raw text instead. Most cases of apparently strange macro behaviour can be traced back to this. 
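
One way to see that macro expansion works on tokens rather than raw text. This is a sketch with a made-up macro `MINUS` and a made-up function `negate_twice`: if expansion were plain text substitution, `-MINUS x` would turn into `--x` and decrement x.

```c
#define MINUS -

int negate_twice(int x)
{
    /* Expands to the tokens "-", "-", "x", i.e. -(-x).
       The two minus tokens are not glued into a "--" operator. */
    return -MINUS x;
}
```

`negate_twice(5)` should return 5; a textual `--x` would have given 4.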



  • [quote user="Carnildo"][quote user="Irrelevant"][quote user="m0ffx"]I always thought C wasn't supposed to care about whitespace...but I guess this shows I'm wrong.[/quote]
    intmain(intargc,char**argv){return0;}//bin/true/



    Good luck finding a language that doesn't care about whitespace at all (turing tarpits notwithstanding). I think most use whitespace to separate tokens in one way or another, it just happens that C has /* as a token.[/quote]


    Perl.

    [/quote]

    Perl's very close - Larry hates whitespace sensitivity, as do most Perl programmers that count - but it does have a handful of cases in which whitespace matters. For example, these are two different things:

    q#foo#
    q #foo#
    

    There's a few more, but I can't remember them off the top of my head - they're mostly rather obscure.



  • [quote user="KattMan"]

    Hence why white space is important in your code.

    If the following comment were not there you would get either 9 or a compiler error as there would be no terminating semicolon.

    This should more properly be written as:

    int c = b / *pa

    Note the space between the / and the *, this is no longer a comment.

    [/quote]

    Parenthesis paranoia also helps:

    int c = b / (*pa);
    

    Alternatively (but not quite as obvious):

    int c = b / pa[0]; // Yay pointer/array equivalence!
    


  • Completely unnecessary whitespace sensitivity can of course be found in E4X, JavaScript/ECMAScript with "native XML support".

    ECMAScript had a pretty nice syntax, similar to C++. Most productions use either a few short textual keywords or special characters (brackets), leaving you free in 99% of cases to insert or omit whitespace where you want.

    Now enter E4X with its statement to set a default XML namespace. For some reason, all of a sudden this should be verbose and "self-explanatory", contrary to the whole rest of the language. So the statement is: "default XML namespace = <expression>". Yup, that's right, "default XML namespace =" is ONE token...



  • Speaking of C++ templates ...

    can someone explain something to me?  Is the STL binary-compatible?

    The STL is SOURCE-compatible, which means that I can buy the source code for your product, compile it with the same STL implementation I use, and link it with my code.

    But what if you provide your product only as compiled object code and it uses STL objects as parameters?  If I use a different implementation of the STL, things won't necessarily work, will they?

     



  • [quote user="newfweiler"]

    can someone explain something to me?  Is the STL binary-compatible?

    The STL is SOURCE-compatible, which means that I can buy the source code for your product, compile it with the same STL implementation I use, and link it with my code.

    But what if you provide your product only as compiled object code and it uses STL objects as parameters?  If I use a different implementation of the STL, things won't necessarily work, will they?

     

    [/quote]

    I think for the most part any two different implementations of the C/C++ library (let alone STL) will never be binary-compatible. Keeping two different library implementations binary compatible means that at least one must know the innards of the other, and keep up-to-date with it. How many people would really be willing to do this, when it's so much easier to just recompile?



  • [quote user="newfweiler"]

    can someone explain something to me?  Is the STL binary-compatible?

    The STL is SOURCE-compatible, which means that I can buy the source code for your product, compile it with the same STL implementation I use, and link it with my code.

    But what if you provide your product only as compiled object code and it uses STL objects as parameters?  If I use a different implementation of the STL, things won't necessarily work, will they?

    [/quote]

    It depends.

    The C++ language spec makes no provision for this at all - regardless of whether it's the standard library or your own code. However, there's another spec, commonly known as the C++ multivendor ABI, which provides for binary compatibility between compiled C++ code. In theory, two platforms which both implement the multivendor ABI correctly should be binary-compatible, even with STL objects as parameters.

    In practice, no two such platforms exist yet. It is questionable as to whether they ever will - this isn't the primary objective of the multivendor ABI, it's just something that's supposed to happen as a result of it.


