Fighting a Cyber-Attack Over Sneakernet During a Data Center Migration
What were you doing this week back in 2003? I don’t even have to think about it. I’ll never forget that I was doing exactly two things: executing a major enterprise Data Center migration and fighting off the worst cyber-attack the world had ever seen, SQL Slammer. Why have one crisis when you can have two simultaneously, right?
The Data Center Migration
My company needed a new Data Center. After all, I single-handedly blew up the old one eight months prior. The migration took place before the era of multiple Data Centers and Disaster Recovery, so we really had just one Data Center and had to move everything to the new one. It was also before the days of sophisticated replication technologies, so we couldn’t do it by wire; we did it by truck. That meant all of IT (plus a whole lot of vendors) took on the risky endeavor of physically moving critical infrastructure from one Data Center to another and making everything work on the other end before the start of business on Monday morning. As I recall, we did this over three consecutive weekends.
During those weekends, the IT staff and vendor partners were divided into shifts. We worked 16 hours, slept 8, then worked another 16. That was the schedule we kept from Friday evening to Monday morning for each weekend of the Data Center migration.
The Cyber-Attack
The Data Center migration had been going pretty well that weekend. I had just arrived home from my 16-hour shift when my boss called, asking me to come back into the office. He said, “We just got hit by a major virus. Everything is down, both the old Data Center and the new.” I really couldn’t believe what I was hearing. While Denial of Service attacks are commonplace nowadays, this was really the first large-scale attack of its kind. I had never heard of such a thing.
Prior to that, my only experience with virus outbreaks was the ILOVEYOU virus and its derivatives from 2000. I was working on a Help Desk back then, fielding calls from confused users and cleaning up mailboxes overflowing from the email onslaught triggered by those who opened the attachments and executed the malicious code. That’s how viruses worked back then: they mainly impacted user PCs and spread through email attachments and Word macros. I remember a few others from that era, like Melissa, Code Red, and Nimda. They all seemed to have a similar impact and were fairly easy to clean up with antivirus software.
SQL Slammer was different. The infected machines were servers running Microsoft SQL Server in the Data Centers, not desktops. They generated so much network traffic that they had to be disconnected (either physically or by shutting down the network port on the switch) to keep them from killing the entire network. That meant remediation had to be done offline. Low on sleep but high on adrenaline, my co-workers and I had the job of installing the MS02-039 security patch by sneakernet on hundreds of servers.
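For context on why that patch mattered: Slammer spread by firing a single malformed UDP packet at port 1434, the SQL Server 2000 Resolution Service. Today that kind of exposure is easy to inventory from a script. Here is a minimal sketch, not what we ran that night; the host list, function name, and timeout are purely illustrative. It probes UDP 1434 and flags machines whose Resolution Service still answers:

```python
# Hypothetical sketch: probe hosts for the SQL Server 2000 Resolution Service on
# UDP 1434, the service SQL Slammer exploited. Machines that answer the single-byte
# instance-enumeration request are candidates for patching or for having UDP 1434
# blocked at the network edge. Host addresses below are purely illustrative.
import socket

def resolution_service_responds(host: str, timeout: float = 2.0) -> bool:
    """Return True if the host answers a SQL Server Resolution Service probe."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(b"\x02", (host, 1434))  # instance-enumeration request
            data, _ = sock.recvfrom(4096)       # instance list, if the service is up
            return len(data) > 0
        except (socket.timeout, OSError):
            return False

if __name__ == "__main__":
    for host in ["10.0.0.10", "10.0.0.11", "10.0.0.12"]:
        exposed = resolution_service_responds(host)
        print(f"{host}: {'responds on UDP 1434' if exposed else 'no response'}")
```

In 2003, of course, there was no tidy script and no time; the fix was patch media in hand and a long walk between server racks.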
Through teamwork, skill, and determination, we finished the Data Center migration and cleaned up the mission-critical systems so they could all be brought online before the start of business on Monday morning (which was late Sunday night due to our presence in the UK).
Lessons Learned
As I recall, this event is what really kicked off the discipline of Security Vulnerability Patch Management in most IT shops. MS02-039 was released by Microsoft six months before SQL Slammer hit. Just about no one, including Microsoft, had applied it. Nine months later, Microsoft instituted its famous Patch Tuesday, which continues to this day. If anyone ever pushes back on your patch management process, remember SQL Slammer: it could have been avoided entirely.
Cybersecurity is a big deal. Viruses used to be merely annoying; in 2003 they started causing real damage. SQL Slammer took out Bank of America’s ATMs, Washington State’s 911 emergency response, and the safety monitoring system at a nuclear plant in Ohio. Of course, since then it’s only gotten worse. The biggest example to date is Stuxnet, the cyberweapon widely attributed to a joint American-Israeli operation, which crossed air gaps to cause kinetic damage to Iranian nuclear centrifuges. If you are interested in learning more, check out the documentary Zero Days.
Lastly, the biggest leadership lesson in all of this is to be prepared for the unexpected. We were doing a huge Data Center migration with tons of risk, and it was planned with contingency and sleep time. That ended up being absolutely necessary to fight the cyber-attack. What if no contingency had been planned? It would have been a total catastrophe. While we had never fought a cyber-attack like this before, we were equipped to mobilize into action quickly. Team readiness needs to be baked in ahead of time. We had it, even though we had no idea if or when we’d need to exercise it.
Any cyber-warfare stories of your own? Any Data Center migrations go pear-shaped on you? Please share your stories in the comments below.