Help Bites--necro edition



  • So I have a small question that doesn't deserve its own complete thread and the Help Bites thread has gone the way of all the earth. So I'm restarting it afresh.

    Here's the question:

    What's the best way of stuffing simple tabular data into a CSV and sending it to a browser for download using PHP? Can I just trust the built-in function (fputcsv)?


  • Notification Spam Recipient


  • Discourse touched me in a no-no place

    @benjamin-hall said in Help Bites--necro edition:

    CSV

    browser

    What could possibly go wrong? (Answer: Excel. That's what will be the bane of your life if you do this.)

    PHP

    Run for the hills!


  • Fake News

    @dkf said in Help Bites--necro edition:

    PHP

    Run for the hills!

    Meh. I remember phpExcel as being reasonably easy to use and has the download functionality built-in while using native spreadsheets and not anything that could be misinterpreted.

    Seems to have been succeeded by PhpSpreadsheet.



  • @dkf said in Help Bites--necro edition:

    What could possibly go wrong? (Answer: Excel. That's what will be the bane of your life if you do this.)

    This.

    If you happen to have the fortune of inhabiting a parallel universe where users don't automatically load csv files into Excel then you may be fine with fputcsv

    Even then, if there's a risk that fields may contain linefeeds then it may be safer to pre-process the data to remove them (replace with spaces or something). The csv spec. does support line-breaks within fields but too many csv-consumers baulk at them. I would also be inclined to strip nulls and all other control codes.

    If you need Excel compatibility then watch out for fields beginning with '='. Whether quoted or not, Excel will try to interpret the contents as a formula. Excel will also mangle anything that looks even vaguely like a date.

    In general, if you need to support Excel then avoid CSV like the plague and use phpExcel or PhpSpreadsheet to output in xlsx format. That will give you scope to circumvent many of the gotchas.

    EDIT PHP's fputcsv and fgetcsv escape_char parameter worries me. The concept of a csv escape character is fundamentally wrong - csv uses quoted strings and repeats "" to escape quote characters. fgetcsv is definitely broken (a bug that php devs refuse to fix) and I wonder if fputcsv has similar issues.



  • Here's the thing. First, this is download only. It's basically a bulk reporting tool--the data lives elsewhere and I'm just putting it into a form I can slice and dice offline. I control the whole process start to finish.

    A record consists of the following:

    • A string that I control (an assignment title). No line-breaks or anything funny except apostrophes.
    • An email address. All of these are guaranteed to be of the form $s.$s@<fixed domain>, where the $s parts are alphabetic strings. The input code will reject anything else.
    • A time-stamp (generated by the database) as a string.
    • A float between 0.0 and 100.0 (% completed)

    And that's it. I'd rather not have to install any extra libraries onto this server, as it's shared hosting.



  • @benjamin-hall If there are no double quotes in your entire data set, you can use fputcsv safely.


  • Discourse touched me in a no-no place

    @benjamin-hall said in Help Bites--necro edition:

    No line-breaks or anything funny except apostrophes.

    If every field has no newlines, commas, tabs, double quotes, leading equalities or anything outside the printable subset of ASCII (plus space), you'll be fine. You'll avoid the multitude of bugs, as they're all in those (honking great) edge cases.

    Back when I was doing a lot more with this sort of thing, we had two key problems: we had to deal with names of people and names of things (specifically titles of academic papers published in non-English journals and the names of their authors, some of whom were on the same project and sticklers for having their name right). All the non-ASCII characters were totally the bane of our lives. Also, the data in the database itself was corrupted, with encodings varying between rows depending on what system that particular data entry grunt had been using. (There was no record of what the encoding was, of course, and Ruby had at that point some egregious bugs with string handling that exacerbated the problems.) It felt very much like torturing the strings into submission, and then torturing them again to try to make Excel not choke on the result. Bleah.


  • Fake News

    @benjamin-hall said in Help Bites--necro edition:

    And that's it. I'd rather not have to install any extra libraries onto this server, as it's shared hosting.

    That's an attitude I don't quite get; it's PHP. You can make a lib folder anywhere and put the libraries there. You can have a local PEAR installation and mass-download libraries. It's not like you're relying on a PECL extension or anything (and even that can be dealt with); libraries are just more PHP files.

    Of course if you don't want the complexity/overhead, I get that. But I tend to wire in libraries early because I figure I'll need them eventually. (I don't have a need for this functionality now but will quickly when scope creeps -- plus I can take a little extra time, use some formatting options, and get "bonus points" for the resulting download looking really slick.)


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.