@dkf said: Actually, the main reason for avoiding simple shell scripts at the large scale is that they don't log enough, which makes chasing down problems (usually due to broken input, which is highly likely on large datasets) really hard.
There are a lot of reasons not to use shell scripts for this. None of the things you'd want from a real program (data validation, logging, debugging) are easy in a shell script. Using one for a job like this is stupid. Don't use shell scripts for anything more complicated than "I need to do a very simple check or manipulation on a text file and maybe launch an executable."
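To make the validation and logging point concrete, here is a minimal sketch of what a "real program" gets you almost for free. The record format ("name,value") and the function names `convert_line` and `convert_all` are made up for illustration; they have nothing to do with the system in the original post.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("convert")


def convert_line(line_no, line):
    """Hypothetical conversion: parse 'name,value' into (name, int)."""
    parts = line.strip().split(",")
    if len(parts) != 2:
        # Validation: reject malformed records with a useful message.
        raise ValueError(f"line {line_no}: expected 2 fields, got {len(parts)}")
    name, value = parts
    try:
        return name, int(value)
    except ValueError:
        raise ValueError(f"line {line_no}: value {value!r} is not an integer")


def convert_all(lines):
    """Convert every line, logging and counting the ones that fail."""
    good, bad = [], 0
    for line_no, line in enumerate(lines, start=1):
        try:
            good.append(convert_line(line_no, line))
        except ValueError as exc:
            bad += 1
            log.warning("skipping bad record: %s", exc)
    log.info("converted %d records, rejected %d", len(good), bad)
    return good, bad


if __name__ == "__main__":
    convert_all(["a,1", "broken", "b,x", "c,3"])
```

Every rejected record is logged with its line number, so broken input in a large dataset can be traced back to the exact place it came from. Getting the same behaviour out of a shell pipeline takes a lot more effort.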
@dkf said: Using multiple processes is a very good thing though, as you farm those out across a decent-sized cluster.
No, I don't think you understand. You could do the whole conversion in a single process, and then run several of those processes in parallel. That's fine. What I was objecting to was running every step of the conversion in its own process and having the steps communicate via temp files, shared memory, sockets, etc., just so that one part of the conversion doesn't interfere with the other parts.
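A rough sketch of the arrangement I'm calling fine: each worker process runs the whole conversion for one input, and the parallelism comes from running many such workers. The steps `parse`, `transform` and `render` are placeholders I invented for illustration, and a local `multiprocessing.Pool` stands in for whatever a cluster scheduler would actually do.

```python
from multiprocessing import Pool


def parse(text):
    # Step 1 (hypothetical): turn raw text into integers.
    return [int(tok) for tok in text.split()]


def transform(values):
    # Step 2 (hypothetical): the actual "conversion" work.
    return [v * 2 for v in values]


def render(values):
    # Step 3 (hypothetical): serialise the result.
    return ",".join(str(v) for v in values)


def convert(text):
    """Whole conversion in ONE process: plain function calls, no IPC."""
    return render(transform(parse(text)))


def convert_many(inputs, workers=2):
    """Parallelism across inputs: each worker runs the full conversion."""
    with Pool(processes=workers) as pool:
        return pool.map(convert, inputs)


if __name__ == "__main__":
    print(convert_many(["1 2 3", "4 5"]))
```

The steps talk to each other through ordinary return values, so there are no temp files, sockets or shared memory to set up and debug. Scaling out just means handing more inputs to more workers.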
@dkf said: I can see why the original poster didn't like the system, but the evidence he's produced is actually sub-WTF (other than that they seem to have slapped a GUI on a shell script without thinking about what else they could do).
I consider this a WTF.
@dkf said: It could be a WTF if they are using badly written scripts, but that would just be a “bad program written by bad programmer” run-of-the-mill one.
That's 40% of the WTFs here.