I received a phone call from a local Fox News reporter today wanting to do a TV interview related to the recent Minnesota Department of Natural Resources (DNR) website problems. The Minnesota DNR has a website that campers use every year to reserve spots at dozens of state parks. The prime camping spots at these parks are always in high demand — especially during the major work holidays. The DNR updated their park reservation system this year, and launched it to the delight of eager campers hoping to win the reservation lottery. Unfortunately, the online reservation system quickly buckled from the user demand, and thousands of Minnesota citizens were unable to reserve camping spots.
The Fox News reporter and cameraman arrived at our office and set up the interview shoot in our data center. The reporter asked me some basic questions: Why do website outages like this occur? How do businesses know how much power their website needs? Is 4,500 visitors an hour enough to take down a website? (This is approximately the website demand reported by the DNR.)
As I was responding to the reporter’s questions, I thought about the process of website capacity planning. Unfortunately, I’ve seen websites fail under unexpected demand several times over my 16-year career. The first time was in the late ’90s, when a customer called to let me know her product would be featured on the Oprah Winfrey TV show in a few hours. She wanted to make sure her website could handle the increased demand from TV viewers. It couldn’t, and it didn’t. The website hosting platform was never designed to handle the level of demand generated by the TV show. And even if the platform could have been redesigned to support the increased demand, there was no way to provision the new capacity in time for the show.
When companies build out their website capacity they invest in a finite amount of resources. They perform a basic risk-benefit analysis. The risk is that they will not invest in enough website resources to meet customer demand. In this case they damage their reputation and lose business. The benefit of investing in more resources is that the business can support greater demand, and create delighted customers. But greater investment also carries a risk. The business might invest in too much website capacity and spend needless dollars on non-existent demand.
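To make the trade-off concrete, here is a minimal sketch of that risk-benefit analysis as an expected-cost calculation. Every number in it is a made-up assumption for illustration: the demand scenarios, the cost per unit of capacity, and the loss per unit of unmet demand are hypothetical, and a real analysis would plug in figures from the business itself.

```python
def expected_cost(capacity, scenarios, cost_per_unit, loss_per_unmet_unit):
    """Expected total cost of provisioning `capacity` requests/sec.

    scenarios: list of (peak_demand, probability) pairs (hypothetical values).
    """
    provisioning = capacity * cost_per_unit
    # Average amount of demand the site cannot serve, weighted by probability.
    expected_shortfall = sum(
        prob * max(0, demand - capacity) for demand, prob in scenarios
    )
    return provisioning + expected_shortfall * loss_per_unmet_unit


# Hypothetical peak-demand scenarios: (requests/sec, probability).
scenarios = [(10, 0.70), (50, 0.25), (200, 0.05)]
COST_PER_UNIT = 100        # assumed dollars per request/sec of capacity
LOSS_PER_UNMET_UNIT = 500  # assumed dollars lost per unserved request/sec

candidates = [10, 50, 200]
costs = {
    c: expected_cost(c, scenarios, COST_PER_UNIT, LOSS_PER_UNMET_UNIT)
    for c in candidates
}
best = min(costs, key=costs.get)

for capacity, cost in costs.items():
    print(f"capacity {capacity:>3} req/s -> expected cost ${cost:,.0f}")
print(f"lowest expected cost at {best} req/s")
```

With these invented numbers, the middle option wins: the smallest build risks too much lost business, and the largest spends heavily on demand that rarely shows up. Change the probabilities or the loss figure and the answer moves, which is the point of doing the analysis.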
In my experience, most organizations have no clue how much demand their websites can handle. Why? One of the sad truths of IT planning is that most website resource capacity recommendations are based on gut instinct. Sure, this instinct is informed by years of experience and some relevant historical data points. But at the end of the day, website capacity planning still comes down to a best guess. Enterprise architects and consultants usually build in a large enough fudge factor in the capacity plans to cover most scenarios. Companies launch their websites into production and face an uncertain future — hoping their website is successful, but not too successful.
Scientists test the thrust generated by rockets before strapping them to the Space Shuttle. Automotive engineers perform extensive performance tests on every new car model. Apple’s quality control team carefully monitors the manufacturing of its new iPad tablets with retina displays. Why do businesses launch new websites out into the wild without rigorous testing? I’m not talking about a handful of developers clicking on random website links. I’m talking about serious website performance testing — dialing up the website requests until the website infrastructure collapses.
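The idea of dialing up requests until something breaks can be sketched as a simple ramp loop. This is only an illustration under stated assumptions: `simulated_error_rate` is a hypothetical stand-in for a real measurement (in practice a load testing tool would drive actual traffic at a test environment and report the failure rate), and the capacity of 120 requests per second, the step size, and the 1% error threshold are all invented.

```python
def simulated_error_rate(requests_per_sec, capacity=120):
    """Hypothetical stand-in for a real measurement.

    Returns the fraction of requests that fail at the given load,
    assuming the site serves up to `capacity` req/s and drops the rest.
    """
    if requests_per_sec <= capacity:
        return 0.0
    return (requests_per_sec - capacity) / requests_per_sec


def find_breaking_point(measure, start=10, step=10, limit=1000,
                        max_error_rate=0.01):
    """Raise the load step by step until the error rate passes the threshold.

    Returns the first load level that fails, or None if the site
    survives every level up to `limit`.
    """
    load = start
    while load <= limit:
        if measure(load) > max_error_rate:
            return load
        load += step
    return None


breaking_point = find_breaking_point(simulated_error_rate)
print(f"site starts failing at roughly {breaking_point} req/s")
```

Swapping the simulated function for one that measures a real staging environment turns this from a toy into the kind of test I'm describing. The structure stays the same: increase the load in steps, watch the failure rate, and record where it crosses the line you have decided is unacceptable.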
If I’m a website owner, I want to know when my website is going to fail. Then, I can go back and do the risk-benefit analysis to see how much risk is acceptable to my business. It’s not possible to invest in an infinite amount of resources, but it’s possible to manage risk when you have a good understanding of the risk level.
I can’t say for certain if the Minnesota DNR technology team engaged in performance testing before launching the updated DNR website. I’m not in a position to evaluate or critique their process. All I can say is that if they didn’t load test the website before launch, they are in good company.