Excel I/O


  • Garbage Person

    So my platform has a generic Excel import/export "feature". Scare quotes because it is terrible.

    Our import export infrastructure is based around programmatically generating SSIS packages. For Excel, we currently use the stock Excel ODBC/OLE connector thing that ships with SSIS, which uses ODBC/OLEDB in the background.

    This sucks, because that component is 32bit only and MS has no plans to modernize it. This means that we have to have special code that starts a 32bit process anytime we touch an Excel file.

    And it requires all sorts of ODBC incantations to bubble up to the API level, which confuses our downstream developers to no end.

    And it's a constant source of quirks and bugs.

    So, bottom line I'd like to replace the whole stupid thing with a custom SSIS component built on top of some sort of third party library that doesn't suck. Must be able to deal with multiple sheets. Legacy XLS and XLSX. A .Net focused library would be ideal.

    What options do I have?



  • With the limitation that you have to install Office on the server, the Office Interop Objects in .net are quite good.

    A previous company I worked for used them to build a PPT deck from a bunch of DB queries and a template, and it worked quite well.


  • Garbage Person

    The legacy iteration of this did in fact use Office interop. It was not exactly good.

    And then there's Microsofts opinion on the matter.

    Basically, "if it works, good. If not, fuck you."


  • FoxDev

    I've used GemBox.Spreadsheet in the past. It's pretty lightweight, doesn't require Office Interop or such shit, and supports XLS and XLSX. Not 100% sure if they have a 64-bit build, but it's .NET, so it might be an Any CPU build; that should work for 32- and 64-bit.

    They have a trial version, but that's limited to 150 rows IIRC.



  • @Weng said:

    The legacy iteration of this did in fact use Office interop. It was not exactly good.

    Well then write-out a CSV file that'll open in Excel by default. I'm not sure what else you're looking for, if you've already ruled-out the one and only sensible way of doing this.


  • Garbage Person

    Whiny dirtbag clients are the reason we support Excel at all (we support CSV separately).

    My suggestion that we not service whiney dirtbag clients who are incapable of using a real data interchange format has been repeatedly denied.

    Also, MS basically says in the above KB that the correct solution is either to use the mechanism we currently are, or to interact with the files directly, rather than using Interop.

    I would entertain XLSX only solution as well, because I stand a chance at being allowed to treat XLS as a third class citizen or deprecate it altogether.



  • If you used open formats you wouldn't have these problems... [/stallman]


  • FoxDev

    @Weng said:

    Whiny dirtbag clients are the reason we support Excel at all (we support CSV separately).

    GemBox.Spreadsheet will do all of those, and you only have to use one library and one piece of code ;)



  • @blakeyrat said:

    With the limitation that you have to install Office on the server, the Office Interop Objects in .net are quite good.

    Do the Office Interop Objects in .NET have some means of keeping Office from giving you a modal dialog wedgie? (I.e. the "Excel threw up its hands and popped up a dialog box, and now our process is stuck because the Excel dialog modal loop isn't listening to anything the automation classes send it" syndrome that, AFAIK, eventually dooms any attempt to automate Office in a server environment...)



  • Based on the link Wang posted above, apparently not.



  • Looking through the only project where I did excel import thing...

    	public class LinqToExcelProvider
    	{
    		/// <summary>
    		///     Template connectionstring for Excel connections
    		/// </summary>
    		//private const string ConnectionStringTemplate = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=Excel 8.0;";
    		private const string ConnectionStringTemplate = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml;HDR=YES\";";
    

    Sorry.



  • I use EPPlus. Much easier than Office Interop, but I'm pretty sure it only does XLSX...



  • I vote for EPPlus, too. I'd rather tell my users that they need to use a version of Excel less than 12 years old than support Excel on the server and deal with the licensing challenges associated with it.



  • @Weng said:

    Whiny dirtbag clients are the reason we support Excel at all (we support CSV separately).

    It's gonna take a while for any whiny dirtbag client to notice that if you write out a CSV file and name it something.xls, Excel just opens it without complaint.


  • Garbage Person

    Negative. Modern versions complain.

    Also, the exceptional irony here is that many of these are going into customer side import routines, but we aren't allowed to talk to their IT teams to hammer out a non idiotic data interchange format



  • Shitty bugger cock bum pubic hair!



  • @blakeyrat said:

    With the limitation that you have to install Office on the server, the Office Interop Objects in .net are quite good.

    From here:

    Besides the technical problems, you must also consider licensing issues. Current licensing guidelines prevent Office applications from being used on a server to service client requests, unless those clients themselves have licensed copies of Office. Using server-side Automation to provide Office functionality to unlicensed workstations is not covered by the End User License Agreement (EULA).
    If *they* don't have a license, *your* server is out of compliance.


  • Buy a second Office license. Install it on a second server. Make the clients talk only to the second server and it relay requests to the first one. Profit.



  • @anonymous234 said:

    Buy a second Office license. Install it on a second server. Make the clients talk only to the second server and it relay requests to the first one. Profit.

    Microsoft has that angle covered. From [here][1]:

    Multiple or pooled connections. You may not use hardware or software to multiplex or pool connections, or otherwise allow multiple users or multiple computers or devices to access or use the software indirectly through the licensed computer.
    [1]: http://products.office.com/en-us/microsoft-software-license-agreement


  • M$ wants their money - even if it means making an otherwise workable solution unworkable.

    Only chance here is a restricted server running Office and ensuring every customer that accesses your server has only been given credentials when they have proven to you that they legally own a license. Even then, you're still on the hook.

    *shivers*



  • Curses, foiled again!

    New attack plan: dazzle the enemy. Build an incredible machine of at least 10 computers, 5 with Office and 5 without. Make them exchange the Excel files at random, discard them, recreate them, encrypt them, send half the files to each other, send the file hash and compare them, ROT13 them, upload them to various cloud services, and build two robots to transfer all the data in and out of the server by playing a set of drums (8 of them, with 8 arms on the robot so you can play an entire byte at once). Their lawyers should quickly figure out that it would take far too long to explain the situation to the judge and drop the case.



  • You could always use PHPExcel...

    *ducks for cover*



  • @anonymous234 said:

    Their lawyers should quickly figure out that it would take far too long to explain the situation to the judge and drop the caseonce they've managed to explain the case to the judge, they can retire and live happily ever after.

    Also, this post can't be empty.



  • @redwizard said:

    Only chance here is a restricted server running Office and ensuring every customer that accesses your server has only been given credentials when they have proven to you that they legally own a license.

    Not quite good enough. The customer might be accessing your site from their company owned iPad. If a device is not personally-owned, and is not granted one of the five Office 365 entitlements, then it probably requires an additional license.

    Essentially, Office on the server makes it nearly impossible to maintain license compliance. The only sane way to do it is to make every customer sign a statement attesting that they are properly licensed for Office and that they accept responsibility for responding to software audits. I've had Microsoft's auditors accept that one for other products.



  • @Weng said:

    Negative. Modern versions complain.

    Write them out as an HTML table. That's gonna work...

    Anyway, we used https://npoi.codeplex.com/ for Excel import/export - it does .xls and .xlsx pretty fine, with the caveat that it's somewhat close to the metal - reading files in works great, writing in values works fine-ish, formatting might or might not end up a nightmare you'll have to abstract away



  • You could try Microsoft's Open XML SDK, I'm pretty sure it's independent of Office itself.

    I used it to read a specific Word document a while back and it worked fairly well, but I haven't really messed with the spreadsheet functionality.



  • It's a pain in the ass to work with, but it works.


  • FoxDev

    @Maciejasjmj said:

    Write them out as an HTML table. That's gonna work...

    fun fact. for a single worksheet solution simply saving a bare html <table> to disk and changing the extension to .xlsx will allow excel to open it just fine.

    excel will complain about the file being corrupt, but it'll open fine and when the user saves it it'll write out fine.

    yes i do abuse that for generating excel exports using nothing but client side javascript for one of our reporting apps at work.


  • FoxDev

    @accalia said:

    yes i do abuse that for generating excel exports using nothing but client side javascript for one of our reporting apps at work.

    I assume that's an internal reporting system?


  • FoxDev

    yes it is an internal reporting system.

    spits out lists of customers to market to based on query information, including links to the customer record in our CRM and fancy little links to have the salesperson's desk phone automatically call the contact at the other company. works a treat (and works on IE6+ (although the formatting is a little fucked in IE<8))

    there's no reason that it needs to be exported to excel, but the salespeople don't trust it and want their data in excel. so the export button saves a <table> tag in a file with a .xlsx extension so they can play with it in excel.


  • BINNED

    @accalia said:

    there's no reason that it needs to be exported to excel, but the salespeople don't trust itwant to fiddle with it and want their data in excel.

    FTFY


  • FoxDev

    @Onyx said:

    FTFY

    fair enough.

    of course the fiddling they want to do with it is all supported by the web interface too (including sorting into an arbitrary order for rasins via drag and drop of rows) i made damn sure that the interface could do anything they wanted it to, and that they not only had the training to use it but the documentation (written by an honest to goodness technical writer not a developer) to refer to later.

    i swear 9/10th of the functionality they insisted they absolutely needed or the project would be a failure has never been used after the first week it was deployed.


  • FoxDev

    @accalia said:

    including sorting into an arbitrary order for raisins

    I can see why people like to blame raisins for everything; they're just so demanding!



  • Fun fact: if you save it as .xls instead of .xlsx, it seems to suppress the warning.

    Another fun fact: if you use this method and get past the 1M row mark, the resulting half-gig file will cause Excel to work flawlessly half of the time, and shit itself in various violent ways the other half.

    Yet another fun fact: if you pull all this shit from an Oracle stored procedure, a squad of trained assassins will hunt you down until your rather untimely demise.


  • FoxDev

    @Maciejasjmj said:

    Fun fact: if you save it as .xls instead of .xlsx, it seems to suppress the warning.

    huh.

    /me makes a note to change that next time i get a chance

    @Maciejasjmj said:

    Another fun fact: if you use this method and get past the 1M row mark, the resulting half-gig file will cause Excel to work flawlessly half of the time, and shit itself in various violent ways the other half.

    not that there are many office applications that like muti-hundred meg files, but good to know.

    @Maciejasjmj said:

    Yet another fun fact: if you pull all this shit from an Oracle stored procedure, a squad of trained assassins will hunt you down until your rather untimely demise.

    if i did that in and DB, let alone oracle then you'll have to storm the insane assylum to get to me ;-) assuming i havent come to my senses and taken potassium cyanide.



  • @accalia said:

    me makes note to change it next time I get a chance

    Haven't tested it too thoroughly, so for all I know it might be another symptom of Excel's explosive diarrhoea.

    At least the DB sproc just builds a string. On the C# side, they do the same thing, except they built a fucking framework for it.


    Filed under: public static const string BeginTableRowTag = "<tr>"


  • FoxDev

    @Maciejasjmj said:

    On the C# side, they do the same thing, except they built a fucking framework for it.

    The stupid thing is there's plenty of libraries out there that will generate XLSX files; GemBox is one I've used before



  • We rewrote some of this thing into NPOI - mostly COM interop bits, since it was easier to sell to the client as useful. Proper refactoring would have to include:

    • dissecting notoriously hard to version Oracle packages
    • porting most of the business logic to the app
    • updating jQuery from 1.3
    • fixing a gajillion of bugs caused by the aforementioned update
    • turning IE's Motherfucking Modals* into sensible jQUI analogues
    • making the site work in anything that's not IE in compat mode with IE7
    • dissecting 3k-lines-long codebehind files
    • clearing up 500-lines-long dynamic SQL queries built by concatenating WHERE clauses taken from DB tables
    • finding all instances of showing message boxes by throwing unhandled exceptions
    • replacing the homegrown abomination of a gridview which poops itself when you look at it wrong with Datatables

    Among other things.


    *Fun fact: IE, at least in compat mode, doesn't just have some pansy modal dialogs. No, IE has Motherfucking Modals. You get a real window, not some fancy-schmancy dark background on page. You ain't right-clicking this thing. You ain't F12ing this thing. You ain't even closing the browser until you deal with the Motherfucking Modal, that's how cool it is.


  • ♿ (Parody)

    @accalia said:

    fun fact. for a single worksheet solution simply saving a bare html <table> to disk and changing the extension to .xlsx will allow excel to open it just fine.

    Yeah, our reporting software does this in order to serve up "Excel" reports. :wtf: Though I think it uses the old .xls extension.


  • FoxDev

    @boomzilla said:

    Yeah, our reporting software does this in order to serve up "Excel" reports.

    it's a nice trick to getting salesdroids their excel without dragging office interop or any of the third party workbook generators into the mix.

    won't work if you want formula fields and stuff but for raw data it works well.


  • ♿ (Parody)

    It's not so nice when group policy crap throws up warnings or whatever. I forget the details...it got fixed a couple of years ago and I never witnessed one first hand.

    It's also a pain because LibreOffice doesn't convert the same way. I'd be much happier with a genuine spreadsheet.


Log in to reply