*sigh* I guess if that's what you want...
-
Good day all. I had to dust off my old account to bring you this.
I just got a request from a manager asking if the following was possible:
Take an Excel file.
Pull some data out and throw it into an XML file.
Pull some data out and throw it into a ~*~ delimited file. Yes, that's the delimiter. No, you can't ask why. I told him, sure, that's fine. Then he said something I'll never forget.
"The XML format will be very flat and just a bunch of tags at the same level with CSV data."
From there, I believe they're going to pull the data into an Oracle table, presumably with CSV data in fields.
I said probably the nerdiest thing I'll say for a while. "None of this is technically infeasible, but it's an incredible abuse of both XML and CSV."
The reason for all this BS is that we're working with a huge vendor product, which means all bets that anything will make sense are out the window.
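To picture just how flat: something like this, presumably (tag names and data are my invention, not the vendor's):

    <rows>
      <row>1,Alice,100.00</row>
      <row>2,Bob,200.00</row>
      <row>3,"Carol, the third",300.00</row>
    </rows>

Every tag at the same level, every value a comma-separated string you get to re-parse on the other end.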
-
You need to throw some JSONx in there for good measure.
-
abuse of both XML and CSV
And an abuse of Oracle tables. But I'm sure at least Oracle did something to deserve it.
-
~*~ delimited file = "we're going to have commas and other free-form data but hopefully we won't have this" format.
Makes me wonder how many 15yr old girls are responsible for destroying data loads from comments sections..... xoxox~*~hugz!1!!11
-
"None of this is technically infeasible, but it's an incredible abuse of both [ ] and [ ]."
I think I say this exact thing in 75% of the meetings I have to attend.
-
Oracle is an abuse of ~~Oracle tables~~ anything they touch.
Not empty any more :P
-
~*~ delimited file = "we're going to have commas and other free-form data but hopefully we won't have this" format.
I've posted before about the files in which the delimiter is "SPLITCHAR." This WTF goes beyond the obvious stupid delimiter.
These files are intermediate, temporary files in a process that starts with an Excel spreadsheet. Which could, you know, save the data in a standard CSV format with proper escaping of problem characters.
The spreadsheet is read by a Perl script (no, Perl is not TRWTF, not this time, anyway) using the CPAN library intended for that purpose. So it has the spreadsheet data in memory, in a nice structure representing the document, the individual sheets, the rows, and the cells in the rows. So far, so good.
The script writes the data to "CSV" files, one for each sheet in the document, then discards the in-memory data.
It then reads each "CSV" file and splits the data into fields. To get the data it just had. And discarded.
At least it no longer reopens the "CSV" file for each cell and each delimiter it writes. It used to take an hour and a half to process a large spreadsheet; eliminating the unnecessary file opens and closes cut that down to 30 seconds, or something like that.
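For contrast, here's roughly what the sane version of that pipeline looks like: read the workbook once, write each sheet straight to a properly escaped CSV, no intermediate fake-CSV files. This is my sketch, assuming Spreadsheet::ParseXLSX and Text::CSV, not the actual script:

    use strict;
    use warnings;
    use Spreadsheet::ParseXLSX;
    use Text::CSV;

    my $workbook = Spreadsheet::ParseXLSX->new->parse('input.xlsx')
        or die "can't parse spreadsheet";
    my $csv = Text::CSV->new({ binary => 1, eol => "\n" });

    for my $sheet ($workbook->worksheets) {
        open my $out, '>', $sheet->get_name . '.csv' or die $!;
        my ($rmin, $rmax) = $sheet->row_range;
        my ($cmin, $cmax) = $sheet->col_range;
        for my $row ($rmin .. $rmax) {
            my @fields = map {
                my $cell = $sheet->get_cell($row, $_);
                defined $cell ? $cell->value : '';
            } $cmin .. $cmax;
            $csv->print($out, \@fields);    # quotes embedded commas properly
        }
        close $out;
    }

The in-memory structure goes straight to output, and Text::CSV handles the problem characters the "CSV" files didn't.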
-
~*~ delimited file = "we're going to have commas and other free-form data but hopefully we won't have this" format.
Some people "solve" that problem with characters not in the set of "printable ASCII characters". There are even some specifically for that: [FGRU]S.
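For anyone who hasn't met them: those are the ASCII File, Group, Record, and Unit Separator control characters (0x1C through 0x1F). A quick Perl sketch with made-up data:

    my $US = chr 0x1F;    # Unit Separator: between fields
    my $RS = chr 0x1E;    # Record Separator: between records
    my @records = ( [ 'free, form', 'data ~*~ here' ], [ 'more', 'stuff' ] );

    # Encode: no quoting or escaping needed, nobody types these
    my $blob = join $RS, map { join $US, @$_ } @records;

    # Decode round-trips cleanly (-1 keeps trailing empty fields)
    my @parsed = map { [ split /\x1F/, $_, -1 ] } split /\x1E/, $blob, -1;

The catch, of course, is that nothing stops a sufficiently creative upstream system from putting them in your data anyway.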
-
It then reads each "CSV" file and splits the data into fields. To get the data it just had. And discarded.
Fixing that bullshit is great for your reputation as a miracle worker.
"Frost, can you fix this program that takes half an hour to run?"
Frost looks at the program, sees it's repeatedly opening, writing one line to, and closing seven different text files, all interleaved. He rewrites it to batch up everything it needs to write, then open each file one at a time, write out everything that file needs in one go, and move on to the next (after closing the file[1]). The program now takes like 4 minutes.
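The shape of that fix, as a minimal sketch; next_item(), file_for(), and format_line() are hypothetical stand-ins for whatever the real program did:

    # Buffer lines per file, then write each file in a single pass,
    # instead of open-write-close for every line.
    my %buffer;    # filename => array of lines
    while ( my $item = next_item() ) {
        push @{ $buffer{ file_for($item) } }, format_line($item);
    }
    for my $file ( sort keys %buffer ) {
        open my $fh, '>', $file or die "$file: $!";
        print {$fh} "$_\n" for @{ $buffer{$file} };
        close $fh or die "$file: $!";
    }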
Then the guy who asked me to do it just said "meh, I'll just move the job to a faster server". Ok, dumbass.
-
Fixing that bullshit is great for your reputation as a miracle worker.
I fixed the unnecessary open problem (it wasn't even closing them, most of the time, just opening a file it already had open), but not the stupid "CSV" file problem. Fixing that would have required significant structural changes, and that definitely falls on the "don't bother" side of the cost/benefit trade-off. For a program that gets used, maybe, once a month (by a given project), it's plenty fast enough, and the stupid doesn't cause any problems except for the maintainer (and anyone who looks at the intermediate files, but nobody really needs to do that; if the program cleaned up its temp files, nobody would even know).
-
Yeah, you always gotta consider ROI. The example I gave was something that ran at least daily, so saving close to half an hour per run was totally worth it.
-
which is then ROT13 encoded for security?
-
JASONx
I wanted to make a joke, but it transpired that
http://beyondthemarquee.com/wp-content/uploads/2012/07/jason-x10.png
made it already.
-
Of course!
-
why?
I think I'm still a moderator on the forums. Once I figure it out, I'm gonna ban you, then find your mom's account and ban her, then pre-emptively ban all your current and future children. Have a nice day.
But to be honest, it's probably for the reasons mentioned in replies above.
-
Doesn't look like it.
Is this the point in the thread at which we have a conversation about facts, jokes and barriers?
-
Some people "solve" that problem with characters not in the set of "printable ASCII characters". There are even some specifically for that: [FGRU]S.
But I can read all of those characters...
-
How do you know?
I bet you can't read , , and .
-
I recognise the first two from CSV, but I've never seen period separated values before.
-