What the Daily WTF?

rmr

I get emails once a month from a certain discount hotel reservations website that begin:

Dear, NULL

rmr

Programmers who haven't read this before really ought to: http://steve.yegge.googlepages.com/being-the-averagest

It's long, but worth it. Key quote:

Last year it finally dawned on me, after 16 or 17 years of this, that I just might possibly still be clueless about something important that I really ought to know, something that would make me a much better programmer. . .

So now I have no idea how far along the programmer-proficiency curve I am, but I can at least see that I'm nowhere near the high end; I'm not even to the halfway point. My ego still assures me I'm past the 25% mark, but realistically, I doubt it. I'm probably flush with the y-axis.

rmr

It has been a long time since I've written anything in Haskell, but could you do something like this:

howManyEqual :: Int -> Int -> Int -> Int

howManyEqual a a a = 3

howManyEqual a a b = 2

howManyEqual a b b = 2

howManyEqual a b a = 2

howManyEqual a b c = 0

rmr

Not only is this completely normal, but I think it is the most important part of our jobs.

We just need to face it: we will always have changing requirements. (I've been working on a project for about 5 months which is now in beta. Last week, my boss's boss came to us and wanted to make a fundamental change to the underlying data model that would have set the project back by at least a month. Luckily we talked him into putting it off until version 2.) Learning the mental stamina and coding skill to deal with this aspect of software development accounts for 90% of the difficulty of creating software in the first place.

It might not necessarily apply in this case, but for any programmers who are afraid of making changes to working software, check out the book Refactoring by Martin Fowler. It is pretty ubiquitous - it might be lying around your office somewhere. Really read (don't skim) the first two chapters. That book has had a larger effect on the way I program day to day than any other programming resource.

rmr

Doh, asuffield to the rescue again. Yeah, go with wget.

rmr

This doesn't seem too bad to me; I think you should stick around and go for it. Do you have to use PHP? It seems like this would be better handled by Python or Ruby.

In Python you can do something like:

urlopen('somewebpage.html').read()

to read a web page. It looks like urlopen gives you a file-like object, so I think you could use it to save off the .pdf too.

See: http://docs.python.org/lib/module-urllib.html for more on urlopen.

The program would look like a standard web crawler - it would start with your seed urls and then follow every link until you find the .pdfs that you need. Of course, if you have no idea how to do this, you should tell the company that, so they have realistic expectations about how long it will take you, but I think you should stick with it. It might even be fun!
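Here is a rough sketch of what I mean, not a finished crawler. It assumes Python 3 (where urlopen lives in urllib.request), assumes pages decode as UTF-8, and every function name below (extract_links, fetch_page, find_pdfs, save_pdf) is made up for illustration. It walks breadth-first from the seed urls, collects anything ending in .pdf, and stops after max_pages fetches.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free URLs for all links in html."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def fetch_page(url):
    """Download a page as text (assumption: UTF-8 is good enough)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def find_pdfs(seed_urls, fetch=fetch_page, max_pages=100):
    """Breadth-first crawl from the seeds; return the PDF URLs found."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pdfs = []
    pages_fetched = 0
    while queue and pages_fetched < max_pages:
        url = queue.popleft()
        pages_fetched += 1
        for link in extract_links(url, fetch(url)):
            if link in seen:
                continue
            seen.add(link)
            if urlparse(link).scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and so on
            if link.lower().endswith(".pdf"):
                pdfs.append(link)  # record it; don't parse it as HTML
            else:
                queue.append(link)
    return pdfs


def save_pdf(url, path):
    """Write the raw bytes of a remote file to disk."""
    with urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
```

You would call find_pdfs(["http://example.com/"]) with your own seed urls and then save_pdf on each result. A real version would probably also want to stay on the same host, handle fetch errors, and be polite about request rate.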

rmr

I've been reading The Mythical Man-Month, and I just came across the following passage:

"I have long enjoyed asking candidate programmers, 'Where is next November?' . . . The really good programmers have strong spatial senses; they usually have geometric models of time; and they quite often understand the first question without elaboration."

rmr

I'm perfectly willing to believe that I am completely misguided here. So let's assume he takes your advice and moves this data to a different format, say an object database. Where is the payoff? What would the simpler query look like? You certainly have enough of his schema information to show that.

I press this not because I'm some relational database evangelist, but because I think the essential difficulties of this query will remain pretty much regardless of the way you store the data. If I am incorrect about that I would very much like to know, as it could certainly make my day-to-day work easier.

rmr

Re. asuffield:

If by clumsy you mean slow, you can always add an index. If you mean something else, then I simply disagree with you. This type of operation is going to be easier with a DB than with flat files or spreadsheets or whatever other method you come up with.

rmr

Well, you're right that people in the IT industry seem to be particularly prone to religious debates over some pretty pointless material. However, I think that this perception is mostly magnified by the anonymity and hasty posting that the web encourages.

Other than that, I don't quite see what the point is. Lower quality software may have some short term payoffs, but only at the expense of pretty severe long term pain.

rmr

I'm not sure I follow. One of the advantages of a database is precisely that it allows you to do "complex queries on multiple fields that are not the key". What do you suggest he change?

There are more normalized ways to store the data, but they don't make the query any easier. Or am I missing something?

rmr

Well, every software developer I know uses UML, so I guess it depends on who you associate with.

rmr

So how much were you able to simplify it?

rmr

I'm not sure I understand the problem, but a lookup table might help simplify the logic: http://en.wikipedia.org/wiki/Lookup_table. You can replace the nested ifs with a lookup operation.
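As a rough illustration of the idea only: the shipping-cost example and all names below are invented, since I don't know what the actual nested ifs decide. The first function is the nested-if style; the second does the same job with a dictionary keyed on the combination of inputs.

```python
def shipping_cost_nested(region, express):
    """Nested-if version: every new case adds another branch."""
    if region == "domestic":
        if express:
            return 15
        else:
            return 5
    else:
        if express:
            return 40
        else:
            return 20


# Lookup-table version: the rules become data instead of control flow.
SHIPPING_COST = {
    ("domestic", False): 5,
    ("domestic", True): 15,
    ("international", False): 20,
    ("international", True): 40,
}


def shipping_cost(region, express):
    """Look the answer up; an unknown combination raises KeyError."""
    return SHIPPING_COST[(region, express)]
```

Adding a new case means adding a row to the table, not another level of nesting. One difference to watch for: the nested version silently treats any unknown region as international, while the table version fails loudly.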

rmr
