How to Migrate a Data Center in 182 Easy Steps with Agile and DevOps
Bird migrations just happen naturally. Data center migrations… don’t. Few projects in IT are as big a deal as a data center migration. This is one of the most complex, highest-risk, and potentially disruptive things IT can do to a business. I’ve had the opportunity to do several data center migrations over my career, two of them in the past two years at CHS.
Traditionally, data center migrations involve trucks. You dismantle systems in your old data center, pack them up, put them on a truck, drive to the new data center, unpack, rack, cable, and fire them up. Under the best of circumstances, with tight planning and choreographed moves, you are offering your business days of downtime, without much of a back-out plan. My first data center move was conducted over the course of four sequential marathon weekends, where we took an outage Friday night and worked like crazy to get systems back online for the start of business on Monday. We even sustained a cyber-attack in the middle of it. I chronicled that story here.
I’ve also had the opportunity to migrate data centers that were part of a long-distance dual data center replication topology. This is better for managing business downtime, but it’s still a race against the clock. You can’t let your data replication get too far behind, otherwise you’ll reach the point of no return, never catch up, and need to reseed. No one wants to let that happen, so you work like crazy in a tight window to get it done.
At CHS, we found a better way. We needed to migrate out of our primary data center on-premises in our office building into a pair of state-of-the-art colocation facilities. There were a couple of unique factors that gave us the opportunity to take a novel approach. First, we had high-bandwidth, low latency, optical wavelength connectivity between our old and new facilities. Second, the equipment in our old facility wasn’t worth moving. We were able to do a total technology refresh and build out an all-new hosting environment to receive the workloads.
These conditions gave us an idea. Because of the tech refresh, we didn’t need trucks. Because of the optical wavelength, we didn’t need to make our moves in large groups. This looked like the perfect opportunity to apply Agile and DevOps principles to our data center migration. The team broke the workload inventory into small move-groups. In my first data center migration, we had four move-groups. In this one, we had 182.
The team tracked their velocity and the burndown of workloads to be moved. They moved a workload, tested it, and rolled it back if it didn’t work. We used this as an opportunity to improve the environment. I wanted this data center migration to be an upgrade in our application hosting capability, not just a real estate transaction. We removed IP dependencies, increased availability, and improved security with each move.
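For readers who think in code, the move, test, and roll-back loop can be sketched in a few lines of Python. This is only an illustration, not the tooling we actually used: the names MoveGroup, BurndownLog, and run_move_group are made up for this example, and the migrate, verify, and rollback callables stand in for whatever your own toolchain does.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MoveGroup:
    """One small batch of workloads (hypothetical structure)."""
    name: str
    migrate: Callable[[], None]   # cut the workload over to the new site
    verify: Callable[[], bool]    # post-move test; True means healthy
    rollback: Callable[[], None]  # return the workload to the old site


@dataclass
class BurndownLog:
    """Tracks how many move-groups are left to migrate."""
    total: int
    moved: List[str] = field(default_factory=list)
    rolled_back: List[str] = field(default_factory=list)

    @property
    def remaining(self) -> int:
        # Rolled-back groups still count as remaining work.
        return self.total - len(self.moved)


def run_move_group(group: MoveGroup, log: BurndownLog) -> bool:
    """Move one group, test it, and roll it back if the test fails."""
    group.migrate()
    if group.verify():
        log.moved.append(group.name)
        return True
    group.rollback()
    log.rolled_back.append(group.name)
    return False
```

The point of the sketch is the shape of the loop: every move-group carries its own test and its own way back, so a failed move costs one small batch rather than a weekend, and the remaining count gives you the burndown.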
We had a target architecture, a timeline, a budget, and a team. The details and specifics were very fluid. Move-groups changed and re-ordered as new information came to light. New ones appeared, and others went away. The team maintained flexibility, tight collaboration, and constant communication, like a flock of migratory birds moving across the autumn sky.
This method required tight alignment and collaboration between development and operations. We fully leveraged our DevOps toolchain to move workloads from one data center to another. Many of the cut-overs were simply a DNS change for the load balancer virtual interface.
This was a year-long effort. Sure, we could have done it faster, but we had a higher goal of managing risk to the business. This single project had 182 releases to production. Something moved every few days. Now that’s Agile!
Towards the end of the project, we got into crunch-time mode. Time was running out, and we had some difficult oddities left for the end, as is the case with most projects. We got everyone in a room to bang out the last month of work and get it across the finish line. This might be typical for development efforts, but it is somewhat uncommon in the world of infrastructure projects. It was fun to see how productive a team can be when all distractions are removed.
That’s how to migrate a data center in 182 easy steps. Not everything went perfectly. There were some bumps along the way, but the impact was very contained. In the hallways of IT and the business, the data center migration project was “noticeably unnoticeable.”
I’m proud of our team. It wasn’t the effort alone that made it successful, but the novel approach. This is a shining example of how development and operations can work together to achieve great outcomes. This is our new way of working infrastructure projects. I have no interest in going back to the old way.