@mallard said:
I suppose the commission info wasn't in predictable locations?
If it was, I would convert to a bitmap and draw white boxes over the "secret" data...
The statement was formatted such that there were a series of transactions for a given day, with the commission under each. The last transaction for the day had a daily total. Then there was some space, and another day's transactions. At the bottom of the month, there was a daily subtotal and a monthly total. Depending upon how many transactions the person did, the line(s) with the commission data moved vertically on the page. You had to first figure out the transaction count in order to calculate where the subtotals might be. Add to that pagination, with headers and with footers that could vary in height (footnotes), and it was extremely difficult to do.
The pages had a watermark logo, so you couldn't just draw over them (of course, you would have no way of knowing that).
Also, 2 more WTFs... when they converted the original data to WMF, they must have scanned the bitmaps and done OCR, as there were infrequent, random period, comma, dash, [reverse] single quote and asterisk characters in what a human would see as open-space areas on the page. I could only surmise that there were imperfections in the paper of the original printed document, and these scanned/OCR'd as characters. I had to find and delete them, making sure not to accidentally delete any financial information.
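For anyone curious what that cleanup might look like, here is a rough sketch, not the code I actually used. It assumes each OCR'd character is available as an (x, y, ch) tuple, and the function name `drop_isolated_specks` and the `radius` value are made up for illustration. The idea: a punctuation speck is only deleted if there is no "real" (non-speck) character anywhere near it, so the period in 1,234.56 survives.

```python
from math import hypot

# Characters that showed up as scanning noise (per the description above).
SPECK_CHARS = set(".,-`'*")

def drop_isolated_specks(glyphs, radius=30):
    """Remove speck characters that have no non-speck character nearby.

    glyphs: list of (x, y, ch) tuples (hypothetical representation).
    radius: assumed distance in pixels within which a neighbor counts.
    """
    kept = []
    for x, y, ch in glyphs:
        if ch in SPECK_CHARS:
            # Only non-speck characters count as neighbors; two bits of
            # noise next to each other should not protect one another.
            near_real_text = any(
                och not in SPECK_CHARS and hypot(x - ox, y - oy) <= radius
                for ox, oy, och in glyphs
            )
            if not near_real_text:
                continue  # isolated speck in open space: drop it
        kept.append((x, y, ch))
    return kept
```

The radius would have to be tuned against real pages; too large and noise near a number is kept, too small and a legitimate decimal point could be lost.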
The other, more interesting WTF was that the OCR didn't put all the characters on the same line at the same y-coordinate (I'm not talking about descenders here), nor did it place monospace characters at even x-coordinate intervals. There was a plus/minus 5 pixel vertical (random) variation, and a plus/minus 7 pixel horizontal (random) variation. Considering that it was a monospace font, it drove me nuts, as you had to do a mathematical 'squint' to see if something was next to something else. I affectionately named it the jiggle-factor.
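To give a feel for the 'squint', here is a minimal sketch of the vertical half of it, again assuming (x, y, ch) tuples; `group_into_rows` and `y_tol` are illustrative names, not anything from the original code. Since each character can be off by plus/minus 5 pixels, two characters on the same printed line can be up to 10 pixels apart vertically, so the tolerance is 10. It also assumes the lines themselves are spaced well over 10 pixels apart.

```python
def group_into_rows(glyphs, y_tol=10):
    """Cluster jittery glyphs into text rows and return each row as a string.

    glyphs: list of (x, y, ch) tuples (hypothetical representation).
    y_tol:  max vertical distance from the topmost glyph of a row;
            10 = twice the +/-5 pixel jiggle described above.
    """
    rows = []  # each entry: [anchor_y, [glyphs in this row]]
    for g in sorted(glyphs, key=lambda g: g[1]):  # top to bottom
        if rows and abs(g[1] - rows[-1][0]) <= y_tol:
            rows[-1][1].append(g)      # close enough: same printed line
        else:
            rows.append([g[1], [g]])   # too far down: start a new line
    # Within a row, read the characters left to right by x.
    return ["".join(ch for _, _, ch in sorted(members)) for _, members in rows]

jittery = [(10, 103, "A"), (30, 98, "B"), (52, 101, "C"),
           (12, 140, "D"), (29, 137, "E")]
print(group_into_rows(jittery))  # ['ABC', 'DE']
```

The horizontal jiggle needs the same kind of treatment with a tolerance based on the plus/minus 7 pixels, which is what you need to decide whether two characters are really neighbors.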