It was 1999, and our new online marketing venture was finally off the ground and making a profit using an off-the-shelf conglomeration of bits and pieces of various content management, affiliate program, and ad servers. We'd hit all
of the goals for our first funding tranche, and the next step was to use those millions of dollars to grow the staff from 12 to 50, half of whom were software developers working directly for me.
The project was an $8 million, nine-month development effort to build, from the ground up, the best 21st-century marketing/e-commerce/community/ad network/reporting system mousetrap possible. Leading a team of 20 people was a big
step up, so I buckled down, reading management theory books, re-reading The Mythical Man-Month, learning the ins and outs of MS Project, Rational Rose, and RequisitePro, investing in UML and process training, and carefully poring over
resumes to find the best candidates.
Having assembled, trained, and indoctrinated my team in the current best practices of formal software development, we went to work. We held stakeholder interviews, pored over requirements, developed use case models, charted process
flow, designed domain entity models, and built our development plan. We developed cleanly separated business logic, persistence, and user experience tiers. We followed formal test-driven development. We held weekly group code reviews. And
slowly but surely we carefully moved forward with development.
No, the WTF is not this overly formalized, non-Agile, upfront design, big architecture, strictly controlled waterfall development model. This was what was required to ensure that we succeeded, and it helped us to hit all of the
milestones along the way towards our goal. We of course had our share of back-and-forth with business goals vs. quality vs. scalability vs. time constraints, but overall I have never before or since been on a project which ran as
smoothly as this one.
Until the last week before release.
One of the key components was our custom-built ad server, which used a single unique ID for each ad placement, handling switching of creatives all on the backend. At the time this was still uncommon, with many affiliate programs and
ad servers hardcoding the creative images themselves directly into the HTML ad serving code. Being able to manage updates of creatives, optimize banner rotation, and take down unwanted ads on the back end was one of the major advantages
as seen by the business users.
"Hey Brian, quick question." It was Barry, VP of Marketing. The entire company was his brainchild, and he was CEO in all but name. "Some clients are complaining that these IDs are kind of ugly, always 'F0DB57A3C10EE7D28277' or some other unpronounceable jumble."
Now, these IDs were uniquely generated each time a new ad placement was created. With dozens of clients and hundreds of affiliates signed up representing thousands of web sites and tens of thousands of pages, we needed to generate
them automatically. To avoid possible fraud, they needed to be non-sequential but unique. And they were purely used on the backend--nobody ever needed to read them out loud, so far as I could see.
"These are only shown to users as URL parameters, no different than the session ID," I protested. More diplomatically, I asked, "What's the reason for making them easily readable out loud?"
"Well," he admitted, "one of our biggest advertisers has legacy accounting systems which their IS department can't or won't integrate with our online reports. When talking by phone with people in different offices, they have to read
the IDs to each other to be able to identify which accounts they are talking about."
After thinking a moment, I realized that this was the perfect place to apply an algorithm I had learned about recently. "Markov chains!" I blurted. "We can use statistical textual analysis to generate unique random words built up from natural phonemic combinations. They won't be real words, but they will match expected English patterns, and people will be able to pronounce and read them completely naturally."
Intrigued, Barry assented, reminding me that the release deadline at the end of the week still had to be met.
But I was already off, thinking through the design in my head. I grabbed my star developer Shipra, and over the next two days she and I built a corpus analyser to build the necessary statistical models, and the generator to randomly
string them together and output the "pronounceable IDs". It was a great success. Everyone crowded around to see the server spitting out fake words like "enspattle", "flargleblum", "unclorifical", and "macrodestic".
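For readers who want to see the idea in miniature, here is a rough sketch of a character-level Markov word generator. It is not the code Shipra and I wrote; it works on letters rather than true phonemes, and the names (`SAMPLE_CORPUS`, `build_model`, `generate_word`, `unique_ids`), the context size, and the length limits are all assumptions made up for illustration.

```python
import random
from collections import defaultdict

ORDER = 2               # letters of context per step (assumed, not from the story)
START, END = "^", "$"   # padding markers for word start and word end

# Tiny stand-in corpus; a real run would analyse a much larger body of text.
SAMPLE_CORPUS = [
    "marketing", "platform", "business", "customer", "network", "report",
    "banner", "creative", "placement", "affiliate", "commerce", "community",
    "rotation", "advertiser", "release", "server", "generator", "natural",
    "pattern", "random", "statistical", "textual", "analysis", "combination",
]

def build_model(corpus_words, order=ORDER):
    """Map each `order`-letter context to the letters observed right after it."""
    model = defaultdict(list)
    for word in corpus_words:
        padded = START * order + word.lower() + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate_word(model, rng, order=ORDER, min_len=6, max_len=12):
    """Walk the chain from the start context to END; retry if the length is off."""
    while True:
        context, letters = START * order, []
        while len(letters) <= max_len:
            nxt = rng.choice(model[context])  # frequent followers are picked more often
            if nxt == END:
                break
            letters.append(nxt)
            context = context[1:] + nxt
        if min_len <= len(letters) <= max_len:
            return "".join(letters)

def unique_ids(model, count, seed=None):
    """Collect `count` distinct generated words (loops forever if the model is too small)."""
    rng = random.Random(seed)
    seen = set()
    while len(seen) < count:
        seen.add(generate_word(model, rng))
    return sorted(seen)

if __name__ == "__main__":
    for word in unique_ids(build_model(SAMPLE_CORPUS), 5, seed=1999):
        print(word)
```

Because each letter is chosen only from letters that actually followed the same two-letter context somewhere in the corpus, the output tends to look pronounceable. Nothing in the sketch stops it from emitting a real word, or an unfortunate one.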
Barry was ecstatic too. "This is great! That client has been threatening to drop us because reading off those codes is slowing down their operations. They're our most well-known anchor client, so if they go, others will drop with
them. This is exactly what we need to keep them. Let's demo it right away."
The two of us drove up to their offices in the city, and Barry proudly told them that I, standing next to him, was the genius who came up with a way to make readable codes and increase their workers' productivity. He opened up a
browser to the demo word generator page and clicked "New random word".
"garglepussy" immediately popped up on the screen.
After a silent five seconds while the client stared in horror, Barry said, "Well, it is random after all. Brian, you can filter that, right?" "Sure, I'll put a bad-word list together," I said, groaning that I hadn't thought of it
before. We ran through it a few more times, getting nice normal-sounding words like "blutterful", "trimbolid", and "anavastic". We finally left with the client happy and satisfied.
During the drive back, Barry said, "I've been thinking about it and it's too dangerous to just have a bad-word filter. We'll never be able to think up every possible offensive-sounding combination. Can you make them sound like a
foreign language instead of English? That way if it does come up with some curse words most people won't even realize it."
It was a good idea--we already had a corpus analyser, and could plug in just about any text we felt like. "Sure, I'll show you some samples this afternoon," I told him.
Shipra and I spent the next few hours running the corpus analyser on just about any foreign text we could find. We plugged in "Lorem ipsum dolor" to get some fake Latin, she pasted in some of her personal emails transliterated from Hindi, and we added the
German libretto of "Die Zauberflöte", some Balzac novels cribbed off of the French Gutenberg Project, the text from some Italian airplane manufacturer's web site, and "Don Quijote".
Barry stopped by and we tried the samples out on him, one by one.
Latin: "Everybody pronounces it wrong, differently."
Hindi: "Too many weird vowels, and it makes me want to slip into an Apu accent."
German: "All those consonants and throaty sounds are too hard."
French: "Are you kidding? Most of the letters at the end of words are silent."
Italian: "Better, but it took me two years to learn to say 'gnocchi' right."
Spanish: "Easy vowels, simple sounds, best yet! But some of their staff are Hispanic. Too dangerous."
We all sat there for a few minutes trying to think of something else when Barry cried out, "I've got it!" and ran out of the room. He came back with a Japanese study book. "I'm planning to expand overseas after this release, and
bought this book to study with since it writes everything in the English alphabet. It's perfect--there's simple vowels, only a few consonants, and no funny sounds to trip you up. Even if people pronounce it a little differently it's still easy to figure out. And nobody knows what it means so it can't be offensive!"
The next day, one day before release, having finished typing in page after page of meaningless Japanese, we were off once more to demo to the client. Barry carefully clicked "New random word."
"koremachiko", "sabashimasu", "tobetokaga", "mitsukaremo". The client carefully read each example, and after a few minutes leaned back, chuckling, "That's perfect, Barry. They're a
little funny, but can be read distinctly, and no chance of offense. Thanks for doing this for us, we're onboard for the launch."
Barry was ecstatic.
By the next morning, everyone in the office was in a great mood, too. Launch had gone smoothly, and our realtime reporting showed all of our website activity, ad serving, and commerce transactions ramping up. Everything was working
perfectly.
We were in the middle of a celebratory all-hands wine & cheese party in the conference room when Barry got a call and stepped out of the room. Several minutes later he came storming back into the room, waving an email he had printed
out and yelling, "Brian, Brian, you useless screwup! What the hell is wrong with you? How do you explain this? Read it, OUT LOUD!" He shoved the paper at me, and I took it, apprehensively.
"fukushita", "moreshite", "fukumiharuda", "youfatsu", "tokaduki", and "fukyusuka", I read, collapsing inwardly and visibly shaking as I read down the list,
imagining the customer's staff members reading them to each other all day. Some people in the room tittered, earning sharp looks from Barry.
"They dropped our contract!" Barry shrieked. "Half our revenue is gone! You've killed our company!"
I started to protest that dropping the bad-word filter and using Japanese were both his idea, but I could see that this would do nothing to abate his fury. The next day I asked for, and was happily granted, a two-week vacation, which I
used to start looking for a new job.
Five years later at an industry mixer, I exchanged cards with a developer and saw that he worked in Barry's department at my old company. I told him I had worked there years before and was glad to see the company had recovered and
was going strong.
"You worked there too?" he asked, looking at my name tag. He suddenly got a strange look on his face. "Wait, are you THAT Brian, who first developed the business platform?"
Cautiously, I replied in the affirmative.
He broke out into a huge smile. "You're famous. You know, we're still using it. We call it The Automated Curse Generator."