So my company decided that they would like to display news and economic information on the company web site. Somebody in Marketing (I know, doomed from the start) found a vendor for each type of feed and *blam* the project was under way.
After meeting with the vendors (via teleconference), my team and I came up with some pretty standard design documents that modeled the process, based on the specifications given to us by Marketing and on sample XML documents from the vendors (XML being the way they would deliver information to us). Everything was pretty standard, only a few XML tags, and this seemed like a very simple assignment.
After about a day or so my team had managed to write the code that would read in the XML files and process them. Since we use Java and XMLBeans, this was a trivial task. Now, in order to receive actual data from the vendors we had to wait two weeks before they would be able to provide us with files to parse. WAIT UP!! So, you sold us a product that isn't finished? These vendors told us time and time again how "our other customers love the feeds, they have no problems parsing the XML files... yada yada". So, if your other customers "LOVE" the feeds: where are the feeds?
Nevertheless, three weeks later we finally started to get files. Now, we had agreed upon certain standards for indicating countries, e.g. United Kingdom (not UK, Britain, Great Britain, etc.) and United States (not USA or United States of America), in order to ensure that we were on the same page, especially since both of us wanted to use the country names as lookup keys for referencing articles from the DB. What did we start getting? UK, US of A, Portuguals... etc. Some were just typos and others were flat-out wrong and against the specs. This confusion took two weeks for them to clear up in their feed (that, remember, EVERYBODY loved...).
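For anyone stuck with a feed like this, here is a rough sketch of how the country names could be normalized on the receiving side before they are used as lookup keys. The class name `CountryNames` and the alias table are made up for illustration; they are not what we actually ran, and a real table would have to be built from whatever variants the vendor really sends.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class CountryNames {
    // Hypothetical alias table: keys are lower-cased variants seen in a feed,
    // values are the agreed canonical names used as lookup keys.
    private static final Map<String, String> ALIASES = Map.of(
        "united kingdom", "United Kingdom",
        "uk", "United Kingdom",
        "great britain", "United Kingdom",
        "united states", "United States",
        "usa", "United States",
        "us of a", "United States",
        "united states of america", "United States"
    );

    // Returns the canonical name, or empty if the variant is unknown
    // (so bad values like "Portuguals" get flagged instead of guessed at).
    static Optional<String> normalize(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String key = raw.trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(ALIASES.get(key));
    }

    public static void main(String[] args) {
        System.out.println(normalize("UK"));
        System.out.println(normalize("Portuguals"));
    }
}
```

Returning an empty `Optional` for unknown names is a deliberate choice in this sketch: an unrecognized country gets rejected and reported rather than silently stored under a key nothing else will match.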
So the names were agreed upon and we were back in business parsing XML when all of a sudden we got complete crap in parts of the files. Some files had junk data and we could not figure out why. Finally I decided to see if it was an encoding issue. I changed the encoding in the XML header from ISO to UTF-8 and *blam* the crap data started showing up as valid 1/2 symbols and many other neat characters. In other words, the files were actually UTF-8 but declared themselves as ISO. I then informed the vendor of this slight error in their feed, to which they responded that it would take them a week to make sure that things were encoded in ISO instead of UTF-8. All they had to do was change the XML header; in retrospect maybe I should have just changed it myself....
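To show what a mismatched header does, here is a small sketch that decodes the same bytes two ways. It assumes "ISO" means ISO-8859-1 (the vendor's exact encoding isn't stated above), and the class name `HeaderMismatchDemo` is just for illustration.

```java
import java.nio.charset.StandardCharsets;

final class HeaderMismatchDemo {
    // Decode bytes the way a parser would if the header claimed ISO-8859-1.
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decode bytes the way a parser would if the header claimed UTF-8.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00BD is the one-half symbol; in UTF-8 it is the two bytes C2 BD.
        byte[] utf8Bytes = "\u00BD".getBytes(StandardCharsets.UTF_8);

        // Wrong charset: each byte becomes its own character (two junk characters).
        System.out.println(decodeAsLatin1(utf8Bytes).length());
        // Right charset: the two bytes come back as the single original character.
        System.out.println(decodeAsUtf8(utf8Bytes).length());
    }
}
```

The bytes on disk never change here; only the charset used to read them does, which is why fixing the declared encoding was enough to make the junk go away.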
A week later we were getting ISO headers and ISO-encoded XML, and everybody was happy, and there was much rejoicing! Wait... the Euro symbol and many others were not showing up correctly. Could it be that these characters can be represented in UTF-8 but not in ISO? Yes, it appears so. Another week went by while they re-encoded things as UTF-8 and gave the files a UTF-8 header. (Hey, didn't I mention just changing the header a month ago?????)
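A quick way to check whether a character survives a given encoding is to ask the charset's encoder. This sketch again assumes "ISO" means ISO-8859-1, which has no Euro sign; the class name `EuroEncodingCheck` is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EuroEncodingCheck {
    // True if every character in the text can be represented in the charset.
    static boolean canRepresent(String text, Charset charset) {
        // A fresh encoder per call, since encoders are not thread-safe.
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // Euro sign
        String half = "\u00BD"; // one-half symbol

        System.out.println(canRepresent(half, StandardCharsets.ISO_8859_1));
        System.out.println(canRepresent(euro, StandardCharsets.ISO_8859_1));
        System.out.println(canRepresent(euro, StandardCharsets.UTF_8));
    }
}
```

Running a check like this over sample content before picking an encoding would have shown up front that the articles needed UTF-8.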
Well, we finally got our nifty UTF-8 files but now they wouldn't process!!! We were getting a neat little CDATA error. Now, keep in mind the guys on my team are Java programmers. Not XML experts, not web programmers. Both of us are junior programmers (I myself just graduated in December). A Google search finally found the problem: they were now encoding these files for Windows, which adds a character to the beginning of the file! YAY! Is it too much to ask for a vendor that knows what they are doing?
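The extra character at the start of the file is presumably the UTF-8 byte order mark (the bytes EF BB BF) that some Windows tools write; that is an assumption, since the error itself never named it. Under that assumption, a defensive workaround is to drop those three bytes before handing the data to the parser. The class name `BomStripper` is made up for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BomStripper {
    // The UTF-8 encoding of U+FEFF, the byte order mark.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns the data without a leading UTF-8 BOM; data without one is returned unchanged.
    static byte[] stripUtf8Bom(byte[] data) {
        boolean hasBom = data.length >= UTF8_BOM.length
            && data[0] == UTF8_BOM[0]
            && data[1] == UTF8_BOM[1]
            && data[2] == UTF8_BOM[2];
        return hasBom ? Arrays.copyOfRange(data, UTF8_BOM.length, data.length) : data;
    }

    public static void main(String[] args) {
        byte[] xml = "<feed/>".getBytes(StandardCharsets.UTF_8);

        // Simulate a file saved with a BOM in front of the XML.
        byte[] withBom = new byte[UTF8_BOM.length + xml.length];
        System.arraycopy(UTF8_BOM, 0, withBom, 0, UTF8_BOM.length);
        System.arraycopy(xml, 0, withBom, UTF8_BOM.length, xml.length);

        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8));
    }
}
```

Working on the raw bytes, before any decoding or parsing, keeps the fix independent of whichever XML library ends up reading the file.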
This whole project has had me just wondering: WTF????
New to this site and I LOVE it... I hope my story didn't bore you too much, sorry if I am not that great of a storyteller :)