Keeping Ahead of Mission Creep
Whether by design or default, the ability of a growing number of organizations to function minute-to-minute is tied to the health of a data center. One risk is that, as reliance on a data center grows, the supporting infrastructure does not. This phenomenon is known as mission creep.
Mission creep is the expansion of a project or mission beyond its original goals after initial success. This phrase was originally used in relation to the military but has recently been applied to many different fields, including data centers. Every data center at some point will no longer be able to meet a company’s business objectives without being upgraded. The question is whether the data center will jeopardize an organization’s mission. To avoid problems arising from mission creep, a well-planned data center should have the flexibility to adapt over time. In other words, data centers should be scalable.
A scalable data center can easily be expanded or upgraded on demand. Scalability is important because new computing equipment is constantly being deployed, either to replace legacy equipment or to support new missions. Both scenarios can cause problems.
New equipment is usually more compact, which allows more of it to be crammed into cabinets. This results in power and cooling issues at the rack level that can affect the function of an entire data center. Supporting a new mission often means adding a higher level of reliability to the systems.
For most corporations, the data center sits at the crossroads of information technology, corporate real estate and facilities. Independent spending in each of these three groups prevents the communication required to optimize solutions for data center operations. The IT group deploys the equipment to support a business mission. The real estate group provides the space, and the facilities group provides the mechanical and electrical infrastructure to keep the data center up and running.
It takes constant communication among IT, real estate and facilities to ensure that as new equipment is deployed, space and systems are adjusted to support the business mission. But this does not always happen. And when it does not, mission creep occurs.
Classifying data centers
Not all data centers are created equal. There are multiple levels of reliability. And the cost difference to design and construct the various levels is significant.
The tiered classification approach was defined by The Uptime Institute Inc., a research firm based in Santa Fe, N.M., dedicated to providing information and improving data center management. The institute has created a standard system for measuring data center reliability, based on a series of tiered benchmarks, that has evolved into the four-tiered classification system used throughout the industry.
These tiers reflect varying levels of reliability that are built into the systems and expressed in “need,” or “N.” N represents the quantity of components necessary to support the mission. For example, a car needs four tires but has a fifth tire, the spare. Therefore a car’s tire system can be referred to as N+1.
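The N notation reduces to simple counting. The sketch below is illustrative only: the function name and its parameters are hypothetical, and the value N = 4 is borrowed from the car example rather than from any real data center. It also covers the 2(N+1) arrangement that appears in the Tier IV description.

```python
def installed_units(needed: int, spares: int = 0, paths: int = 1) -> int:
    """Count installed components for a paths x (N + spares) layout.

    needed -- N, the quantity required to support the mission
    spares -- redundant units added on each path (1 for N+1)
    paths  -- number of independent paths (2 for 2(N+1))
    """
    return paths * (needed + spares)


# Hypothetical N = 4, taken from the car-tire example.
n_only = installed_units(needed=4)                          # N
n_plus_one = installed_units(needed=4, spares=1)            # N+1
two_n_plus_one = installed_units(needed=4, spares=1, paths=2)  # 2(N+1)

print(n_only, n_plus_one, two_n_plus_one)
```

The count says nothing about how the units are wired together; it only shows how quickly the installed quantity, and therefore the cost, grows as redundancy is added.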
For data centers, the tiers are broken down as follows:
Tier I: There is a single path for power and cooling distribution, with no redundant components; all systems are N. The design consists of a single utility feed for power, a single uninterruptible power supply (UPS) and a single backup generator. The mechanical systems do not have redundant components, and maintenance of mechanical and electrical systems requires an outage. The result is 99.671 percent availability with an annual anticipated down time of 28.8 hours.
Tier II: There is a single path for power and cooling distribution, with redundant components. A Tier II electrical system is similar to Tier I, with the addition of N+1 components for UPS and generators and N+1 components for mechanical systems. Maintenance of mechanical and electrical systems requires an outage. The result is 99.749 percent availability with an annual anticipated down time of 22.0 hours.
Tier III: There are multiple power and cooling distribution paths, but only one path is active; the design has redundant components and is concurrently maintainable. This design is similar to Tier II, with the addition of a second path for power and cooling. For electrical distribution, this design can translate into electronic equipment with two power cords connected to two separate UPS systems, and two emergency generator sources. Mechanical systems have two paths for chilled water. Maintenance of mechanical and electrical systems can be accomplished without an outage. The result is 99.982 percent availability with an annual anticipated down time of 1.6 hours.
Tier IV: There are multiple active power and cooling distribution paths, with redundant components; the design is fault tolerant. This design is similar to Tier III, with the systems able to maintain operation despite at least one worst-case, unplanned failure or event. This is accomplished by having 2(N+1) systems with two active paths. The result is 99.995 percent availability with an annual anticipated down time of 0.4 hours.
It is possible to calculate the anticipated down time statistically from the design: An evaluation of the predicted failure rates of system components will yield the anticipated down time of a data center. Mission creep occurs when the data center is asked to be more reliable than it was originally designed to be.
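The link between an availability percentage and annual down time is plain arithmetic. The following is a minimal sketch, assuming an 8,760-hour year with no leap day; the function name is hypothetical, and the two availability figures are the ones quoted above for Tier I and Tier IV. It does not model component failure rates, which a real reliability study would.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def annual_downtime_hours(availability_percent: float) -> float:
    """Expected hours of down time per year at a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


# Availability figures quoted in the tier descriptions.
tier_1_hours = annual_downtime_hours(99.671)
tier_4_hours = annual_downtime_hours(99.995)

print(f"Tier I:  {tier_1_hours:.1f} hours per year")
print(f"Tier IV: {tier_4_hours:.1f} hours per year")
```

Rounded to one decimal place, the results match the 28.8 hours and 0.4 hours cited for those tiers.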
Data centers are not just about footprint. Data centers succeed or fail based on the capacity and reliability of the electrical and mechanical systems. The electrical systems in a modern data center have a functional life of about 10 years — not a long time.
Too many organizations believe that a data center is adequate to support a business just because it has a UPS and generator. That idea is a prescription for disaster.
Building a better data center starts with a process of business-based analysis and design. A cornerstone of this problem-solving methodology is to satisfy today’s mission while remaining scalable to meet tomorrow’s needs. It’s also important to understand how business need drives the use of technology, and vice versa; the result is a cycle of increasing reliance on technology — and often mission creep.
Solving this problem begins with business mission validation. What are the current and proposed business missions supported by the data center? In most cases, a single data center is required to support multiple business missions, with their own requirements and associated impact on systems. Understanding and validating the business mission is the first step toward evaluating the tier of reliability needed now, and the tier that will be needed in the future.
The second step is programming, including architecture and engineering. This process matches space and system requirements with business mission and involves the data center user and facilities staff. At the completion of programming, the design team can do space test-fits and develop system options. This is where scalability enters. Understanding current and future needs, and also understanding how mechanical and electrical systems can be designed with expansion and growth in mind, allows for better cost control.
Once systems and space options are selected, the basis of design can be established. This process documents the systems and design intent — a critical step in explaining to third parties why decisions were made.
The culmination is a schematic design. This can be called the “C-level” report, a document that the CEO, CFO and CIO can review to understand how the team arrived at its recommendations and why the plan needs to be implemented.
This process was used when Prologis, the world’s largest industrial real estate developer and fund manager, set out to develop a mission-critical data center. The business mission validation process identified a one- and three-year need for a Tier II facility, and a five- and 10-year need for a Tier III facility. The information systems function is expanding from electronic messaging, financial accounting and property management to include customer relationship management, fund management and consolidated Web hosting.
The programming phase revealed that, in addition to the need for more systems reliability, there will be a fourfold increase in anticipated power and cooling requirements over the 10-year business plan. Prologis needs a Tier II data center in 2005 to support 32 cabinets at 2 kilowatts per cabinet, and a Tier III data center in 2015 to support 64 cabinets at 4 kilowatts per cabinet. The basis of design is a mechanical and electrical solution that meets the 2005 requirements, but is expandable, without an outage, to accommodate the future needs.
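The fourfold figure follows directly from the cabinet counts and per-cabinet loads. The sketch below is a rough check only: the function name is hypothetical, and it counts cabinet load alone, on the assumption that cooling demand rises in step with the power drawn by the equipment.

```python
def cabinet_load_kw(cabinets: int, kw_per_cabinet: float) -> float:
    """Total equipment load in kilowatts for a room of identical cabinets."""
    return cabinets * kw_per_cabinet


# Figures from the Prologis program.
load_2005 = cabinet_load_kw(cabinets=32, kw_per_cabinet=2)
load_2015 = cabinet_load_kw(cabinets=64, kw_per_cabinet=4)
growth = load_2015 / load_2005

print(f"2005: {load_2005:.0f} kW, 2015: {load_2015:.0f} kW, growth: {growth:.0f}x")
```

Doubling the cabinet count and doubling the load per cabinet takes the room from 64 kilowatts to 256 kilowatts, which is why the infrastructure must be planned for expansion from the outset.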
The scalable solution provided significant benefits. “Viewing data center requirements from a business mission perspective allowed us to focus the design on critical elements over the lifetime of the facility and avoid costly and unneeded data center capabilities,” says Neville Teagarden, Prologis. “We also avoided significant future retrofit expenses by including components that allow us to cost-effectively install future required capabilities.”
Health care remedy
In the health care sector, the same fundamentals apply but with different system considerations. Every hospital has a data center of some size or shape. Key factors that dictate the size of the data center are the scale of the institution and whether computing is distributed or centralized. A large hospital with central computing might have a 10,000-square-foot or larger data center. Hospital data centers historically are designed to support payroll, billing, patient records and maybe research. A first analysis might indicate that a Tier II data center would support the mission.
A Tier II data center, however, would be vulnerable to mission creep. Picture archiving and communications systems (PACS), also known as digital imaging, generate images that need to be stored in the data center and then communicated to the clinician for diagnosis and the operating room for surgery. With the addition of digital pharmacy and clinical communications systems, a Tier III or IV facility is required. The data center has made the leap into the clinical world. The patient’s well-being is now linked to the reliability of the data center.
Ultimately the challenge is to determine if mission creep is affecting a data center. There are two ways to find out: Either do nothing, and wait for the inevitable crash, or conduct a business-based analysis of the data center.
The right solution relies on communicating with IT, understanding what mission needs to be supported, finding the right real estate, determining the best place to house the systems and facilities, and determining what mechanical and electrical systems are needed to support the mission. The result will be a data center that predicts and addresses the company’s needs — even before they happen.
R. Steven Spinazzola is the vice president in charge of the applied technology group at RTKL, an architectural and engineering firm. Based in the firm’s Baltimore office, the applied technology group provides integrated architecture and engineering services to mission-critical facilities.