[quote user="viraptor"]
Hey
After someone wrote about ":(){ :|:& };:" and "rm -rf" on main post comments, I'd like to start this thread. What did you do yourself / cleanup after others?
[/quote]Let's see...
- 'kill <pid of ssh>' while working remote (looked at wrong one in the ps tree),
Check (at least a dozen distinct ways, including killing, crashing, shutting down, hanging up, power-cycling, changing the encryption keys for, or otherwise rendering useless a wide variety of access points, firewalls, gateway servers, routers, UPS power controls, modems, VPN daemons, log daemons, Kerberos masters, watchdog heartbeat processes, and other things on the critical path between me and the root shell on a live system...).
The most spectacular was the array of four servers and four UPSes. Not only did the people who set up the room use the wrong monitoring signal cable, but they connected the power cables to one UPS and the monitoring cables to another, in a cycle. When I installed the UPS monitoring software on one of the machines, because it was using the wrong cable it immediately powered off the machine next to it...and as that machine died, it sent a signal to the UPS it was monitoring which powered off the machine next to it...and the next...and finally the first one. 20 seconds later they all powered back on, but the machine I was testing with had the monitoring software configured to start at boot, and as soon as it came up it tried to monitor again...click...click...click...click...it took another two cycles before I realized what was going on, and another for me to stop laughing long enough to fix it...
- 'shutdown'-ing / 'reboot'-ing while on remote shell by accident, thinking it was my box
Check. I never make that mistake on a test or dev machine, always a production server or some unsuspecting user's desktop.
- 'iptables -F'-lushing while INPUT was DROP by default
Check (and OUTPUT and even FORWARD once).
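For anyone who wants to avoid this one, here is a minimal sketch of a flush that opens the default policies first. The function name safe_flush and the IPTABLES override variable are my own inventions for illustration; only the -P and -F options are real iptables flags, and systems managed through nftables or a firewall front end may need a different approach.

```shell
# Sketch: flush rules without leaving a DROP policy on an empty chain.
# "safe_flush" and the IPTABLES override are hypothetical names.
safe_flush() {
    ipt="${IPTABLES:-iptables}"   # allows a dry run with a stand-in command
    # Open the default policies first, so an empty chain accepts traffic.
    for chain in INPUT OUTPUT FORWARD; do
        "$ipt" -P "$chain" ACCEPT || return 1
    done
    # Only now remove the rules themselves.
    "$ipt" -F
}
```

Setting IPTABLES to a harmless stand-in such as echo gives a dry run that shows the commands without touching the firewall.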
- mixing up partitions in software raid (noticed "rebuilding" too late)
I've tried to do RAID5 catastrophic recovery (that's where more than one disk failed and you're now just trying to pick up the pieces) and done that once. 690GB of slightly damaged but still mostly-recoverable data was irretrievably lost in a matter of seconds. Fortunately there were backups, although at the time 690GB of data represented 78 continuous hours of tape access...
- 'tar -zcf *' instead of 'tar -zcf archive ' when the first 'aaaaa' file is the most crucial one
I've avoided that one, but I did once mail myself my 600KB home directory as a series of uuencoded tarballs, one for each directory in the directory tree. Unfortunately I didn't realize that tar includes all subdirectories by default, so there were dozens or hundreds of copies of the lower-level subdirectories, a total of 60MB accumulated in the outgoing mail queue. In 1992, individual message size limits were on the order of 100KB and mailbox total size limits on the order of 1MB (which is why I had to split up the files in the first place). At the time 60MB of mail could--and did--take out a large university's campus mail server.
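On the original 'tar -zcf *' mistake: the shell expands the glob before tar sees it, so the first matching file name ends up as the archive name and gets overwritten. A rough sketch of a guard, assuming a hypothetical wrapper called safe_tar (not a standard tool), is to refuse any archive name that already exists:

```shell
# Sketch: refuse to create an archive on top of an existing file.
# "safe_tar" is a hypothetical helper name.
safe_tar() {
    archive="$1"
    if [ "$#" -lt 2 ]; then
        echo "usage: safe_tar ARCHIVE FILE..." >&2
        return 1
    fi
    shift
    if [ -e "$archive" ]; then
        # With "safe_tar *" the first expanded name lands here and is protected.
        echo "refusing to overwrite existing file: $archive" >&2
        return 1
    fi
    tar -zcf "$archive" "$@"
}
```

With this in place, 'safe_tar *' in a directory containing 'aaaaa' fails instead of clobbering it, while 'safe_tar archive.tgz *' works as intended.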
- not noticing >> instead of << when running a command from history
I once had a cat walk across my keyboard while I was in another room. She held down the up-arrow for a while, then hit Enter on a machine that had open root shells on remote servers. Very bad things happened. I now have every computer in my house automatically lock down after five minutes of inactivity, and I now take care to phrase commands in a way that makes them harmless when entered out of context (e.g. always "rm -rf /full/path/" instead of "cd /full/path" followed by "rm -rf *").
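The "full path" habit can be sketched as a small helper. The name purge_dir is hypothetical, and this is only one way to enforce the idea: the command names its target explicitly, so replaying it from history in some other directory does the same thing or nothing.

```shell
# Sketch: delete a directory only when it is named by absolute path.
# "purge_dir" is a hypothetical helper name.
purge_dir() {
    case "$1" in
        /)  echo "refusing to remove /" >&2; return 1 ;;
        /*) ;;   # absolute path, carry on
        *)  echo "need an absolute path, got: $1" >&2; return 1 ;;
    esac
    [ -d "$1" ] || { echo "not a directory: $1" >&2; return 1; }
    # The target does not depend on the current working directory.
    rm -rf -- "$1"
}
```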
- setting too aggressive brute-force logins blocking rules and missing password 3 times in a row next time when connecting (ip block for a day)
I set up one of those but I insisted on implementing a white list first so I wasn't locked out; however, the very first entry in the blocking logs was me, and I was not testing the white list at the time...
[quote user="viraptor"]I cleaned up after others, that were hit by:[/quote]
- forcing bash update before libc package (lame update scripts on slackware - beware updating by more than 1 version - swaret will not warn you)
I do a lot of work with embedded custom boot images of various kinds (from firewall-on-a-bootable-CD to encrypted-root-filesystem-on-a-laptop), so things like forgetting a critical shared library in /lib (or more often upgrading the binaries in /lib and finding they need new shared libs) are a daily occurrence. Also, a surprising number of disk read errors occur in the critical path for things like /usr/lib and /lib when disks start to go bad. And if you're on the experimental branches of Debian or Gentoo, this kind of thing happens once or twice a year...it's part of the risk of doing front-line QA on these distros.
I have done a lot of moving around filesystems on live servers. Each server process needs to be hot backed up, shut down, recovered, and restarted on the new filesystem. It's a complex process with a lot of critical steps, one of which is often to kill and restart all of the processes running on the machine except for two: the login shell, and the ssh daemon that is hosting it. It turns out that's surprisingly hard to get right (wait, don't kill the screen daemon either! D'oh...).
Another step involves atomically changing the root filesystem for all processes on the machine with pivot_root. Now imagine that you do these steps on a machine 6000 miles away, and then discover that the temporary root filesystem that you created to keep the machine running during the transition was just slightly too small and doesn't contain all the data it was supposed to--and that while libc cut off after the first megabyte is actually usable for a surprising number of programs, that number does not include chroot, pivot_root, mount, cp, ln, rsync, tar, cpio, cat, chmod, dd, reboot, shutdown, or the parts of bash I used in my chroot sanity check before I ran pivot_root...
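One check that would have caught the truncated libc is a byte-for-byte comparison of the temporary root against its source before switching over. This is only a sketch under my own assumptions: verify_tree is a hypothetical helper, it uses nothing beyond find and cmp, and it does not cope with file names containing newlines.

```shell
# Sketch: confirm every regular file under SRC exists unchanged under DST.
# "verify_tree" is a hypothetical helper name.
verify_tree() {
    src="${1%/}"   # drop one trailing slash so prefix stripping works
    dst="${2%/}"
    # Collect the relative paths of files that are missing or differ.
    bad=$(find "$src" -type f | while IFS= read -r f; do
        rel="${f#"$src"/}"
        cmp -s "$f" "$dst/$rel" 2>/dev/null || printf '%s\n' "$rel"
    done)
    if [ -n "$bad" ]; then
        printf 'missing or damaged under %s:\n%s\n' "$dst" "$bad" >&2
        return 1
    fi
}
```

Running it on the full system first, while every tool still works, matters here because the comparison itself needs a healthy libc.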
I think I've personally discovered everything that can possibly go wrong with that process--the hard way.
- rm -rf /lib (I don't think we're in chroot anymore)
I once had a user running an IRC bot they had written. As a debugging statement, they had something that would take input from a channel and feed it to:
echo $@ >> logfile
Sooner or later somebody said "rm -rf *" in an IRC channel, and I had to restore 17GB of that user's home directory data from backups.
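Whatever the bot really ran, the general fix is to treat channel text as data and never let the shell reinterpret it. A minimal sketch, assuming the text arrives as ordinary arguments (log_line and LOGFILE are hypothetical names of mine):

```shell
# Sketch: append untrusted text to a log without shell interpretation.
# "log_line" and LOGFILE are hypothetical names.
log_line() {
    # Quoting "$*" stops globs such as * from expanding, and printf avoids
    # echo's handling of options and backslash escapes.
    printf '%s\n' "$*" >> "${LOGFILE:-logfile}"
}
```

This only helps if the caller also keeps the text quoted and never passes it through eval, backticks, or an unquoted expansion on the way in.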