The Sparky Incident
The Friday before Memorial Day, in 2002, I blew up the data center. It happened when I lit the PDUs for a new row of equipment that I had spent a week racking and cabling. A 110v power strip didn’t like the 220v I was giving it. The surge filled the room with a haze of smoke, followed by the distinctive smell of fried capacitors and the sound of menacing alarms.
In an instant, the entire enterprise was down. I walked out of the data center and the sound of the door clicking behind me caused every head to pop out of the surrounding cube farm to see who had just walked out. We had a large monitoring screen in the Network Operations Center. It wasn’t red or blinking; it was just off. There was nothing.
I remember yelling, “Help!” Everything after that was a blur. The data center filled with every able-bodied engineer and technician around. In a matter of minutes, managers, directors, and VPs started filing onto the data center floor to see the mayhem first-hand.
We restored critical services in a matter of 3-4 hours, but cleaning up the collateral damage took weeks. I wasn’t fired for this. I honestly didn’t even fear getting fired for it. We had a culture that avoided the blame game. After the dust settled, we put in some processes and procedures to look more closely at our power equipment selections, and we also avoided powering on anything new during business hours.
This failure even helped us design a more resilient power system in the new data center that we built the following year. The only thing negative that happened to me was I earned the nickname “Sparky,” which took a while to shake. My management team understood that I was doing what I was empowered to do. It would have been easy to blame me, but they took the high road.
I don’t know how much the outage cost the company. It was probably a lot. I gained a valuable lesson on the type of leader I wanted to be. I wanted to pass on what had been modeled and pay it forward. I wanted to be the kind of leader that doesn’t play the blame game and doesn’t kill individuals for mistakes that were really caused by organizational process gaps.
This is the event that made “psychological safety” my native tongue. I expected this to be normal. Everywhere I’ve gone since, I’ve looked for this culture. If it wasn’t already present, we made it so. It’s the only way to lead.
I hope your Memorial Day weekend is smooth. Try not to blow up the data center.
11 thoughts on “The Sparky Incident”
Odd that you were able to plug 120v equipment into a 240v (208v?) circuit. Typically they are mechanically non-interchangeable…
I’m no expert in electrical stuff (obviously). I do recall there being a special adapter involved in making it connect.
I remember that day, Sparky!
Wow, there are many things packed into that story: first that it happened, then the response to it happening, and then the follow-up. The focal point appears to be the absence of the blame game. How does that really happen? I believe it is part of a culture that takes time and a team effort to build. The outcomes listed indicate that the team worked to recover service and then improve. Congrats to you and everyone involved, because I am sure there was much involved after the initial sparks flew.
Great insight Joel. I agree. Great culture is built over time and most importantly, is maintained with discipline and good character.
Great story and perfect example of the GMAC RFC culture, leadership and teamwork; swift action, collaboration, learn from it and move forward.
Thanks for the comment, Paula! Yes, we had a good thing going at GMAC RFC. Everywhere I’ve been since, it’s been my goal to recreate that culture as far as it is up to me.
Great story! I have definitely had experiences like this with good managers in my career and it has left a similar impression. In the moment, how we got into the situation is secondary. Focus first on working together to get out of it, then look for root causes in the environment, not just people.
Really good stuff, Mr. Hughes. Nailed it about those kinds of leadership qualities. They’re rare and don’t come naturally for many.
We had a solid larger team at GMAC RFC. Keep up the spirit!
I remember that day. It was one of the events in that data center that pulled everyone together.
The team was greater than the individual. It was like a family in IT at GMAC-RFC. And management helped foster that kind of feeling. If you figure out how to recreate that culture and it is in place, please let me know if you are hiring. Because I would definitely want to work there 🙂
Nice Post Sparky!
Thanks Tim! I do attempt to re-create RFC IT culture wherever I go. I achieve it (along with others) to some degree, but I agree with you. What we had was something special. Not at all common to find.