A few months ago there was a much-criticized NY Times article about the impact of data centers on the planet. While the writer pursued a somewhat out-of-context thesis (“data centers are bad for the planet”), a number of reader comments suggested a valid follow-on: a fresh discussion of ALL of the ways that the huge number of existing and planned data centers could be made more efficient.
So with 2013 approaching, and with much of the data center industry still looking for ways to make their IT structures, and their data centers in particular, more efficient, I decided to compile a laundry list of the most popular approaches and projects I have recently discussed with customers.
There are clearly many things that can be done. Below is a list of relatively low-hanging fruit (for the adventurous) dealing with retrofits and optimizations that can be made TODAY to existing data centers. The end of this post talks a bit about new ways of designing new data centers that challenge conventional wisdom. (e.g. Why do you need a raised floor in a data center anyway?)
At the end of the day, everyone should look at this list and determine which pieces make sense to them. Add things to your radar screen that I didn’t list. Above all, look to 2013 to take action: specific action in one or more of the areas discussed below.
1. Blanking Panels. Really? Are we still here? Blanking panels remain one of the most overlooked optimizations in the data center: a low-tech solution to a problem that is easy to miss. Simply put, blanking panels eliminate the mixing of hot and cold air. Cold air is expensive to create, so the trivial investment to force this precious commodity to flow THROUGH active equipment (rather than around it) makes a lot of sense. In a case study by Upsite, just 400 panels installed in a high-density 10,000 sq ft data center yielded $137K in total annual operating cost savings. Blanking panels cost less than $5 for a 1U size and perhaps $40 for a 10U. They are a bargain! They come in many sizes, so there is no need to stack lots of little ones together. Just DO IT! Don’t miss this simple way to save BIG bucks on your cooling costs.
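To see just how fast the panels pay for themselves, here is a back-of-envelope payback sketch using the Upsite case-study numbers above. The $10 average panel price is an illustrative assumption (1U panels run under $5, larger sizes more).

```python
# Payback estimate for blanking panels.
# Assumption: average panel price of $10; case-study figures from the text.

def payback_days(panel_count, avg_panel_cost, annual_savings):
    """Days until the panel investment is recovered from cooling savings."""
    capex = panel_count * avg_panel_cost
    return capex / (annual_savings / 365.0)

days = payback_days(panel_count=400, avg_panel_cost=10.0, annual_savings=137_000)
print(f"Payback in roughly {days:.0f} days")
```

Even if the real panel mix doubles the price, the payback is still measured in weeks, not years.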
2. Remove hardware KVM. The days when operating system crashes and hardware failures created single points of failure in the data center are mostly gone. New software applications tend to span many servers, so even in the event of a device failure, production never ceases. Sure, there are exceptions, but in the lion’s share of situations, KVM technology at the hardware level is not needed. In-band access technologies (such as IPMI, SSH and RDP) are now commonplace, and software KVM startups that enable secured remote connection, like Tenduit, are popping up, so the need for extra out-of-band hardware and its non-trivial power consumption is all but gone. You probably forgot that those KVM ‘dongle’ devices took power, huh? How much depends on the brand, but the switch and the dongles ADD UP! Perhaps 100 watts or more per rack! Consider removing KVM hardware.
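To put that per-rack figure in dollar terms, here is a rough sketch. The rack count and electricity rate are assumptions for illustration only; the ~100 W/rack figure comes from the text above.

```python
# Annual cost of KVM switch + dongle overhead across a facility.
# Assumptions: 100 W/rack (from the text), $0.10/kWh electricity rate.

def kvm_overhead_cost(racks, watts_per_rack=100, usd_per_kwh=0.10):
    """Yearly electricity cost of hardware KVM overhead alone."""
    kwh_per_year = racks * watts_per_rack * 24 * 365 / 1000.0
    return kwh_per_year * usd_per_kwh

print(f"${kvm_overhead_cost(200):,.0f}/year for a 200-rack facility")
```

At 200 racks that is over $17,000 per year of power spent on gear that in-band tools have largely made redundant.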
3. Top of Rack (TOR) Switches. Now all the rage in high-capacity data center networking, TOR switches are very high-speed switching devices that can be deployed to aggregate traffic within a rack and then allow full connectivity to the core at rated wire speeds. Many of these boxes work with OpenFlow today, so the power-hungry intelligence previously found INSIDE switches can now be centralized and optimized elsewhere. When considering your TOR vendor, consider one criterion that might not be so apparent: the amount of power each TOR switch consumes. Remember that most of the traditional network vendors’ TOR switches use general-purpose fiber optic lasers which are intended for long-distance communications and therefore consume large amounts of power to drive those long distances, but in a TOR configuration, the distance inside a RACK is 10 feet or less. No long distance! Why burden your power budget with long-distance optics? A few startups like Plexxi took this into account when they built their low-power laser TOR switches. In any case, since TOR switches proliferate in every rack, find the most energy-efficient switch that works for your application.
4. Raise Data Center temperature. This is a very emotional topic, with lots of expert opinions, but the challenge to do something should NOT be avoided. Data centers used to run cold, really cold. I have seen areas of a data center running at 58 degrees or less. ASHRAE and the server and networking manufacturers have all written countless papers on recommended cooling approaches, but let me summarize a ‘first step’ that works almost everywhere: raise the temperature of your data center to 75F. Don’t get crazy, because too much of a good thing may create other difficulties, but taking everything written into account, a safe target temperature that will NOT generate any ridicule from the peanut gallery is 75F. No more, no less. Remember that certain staged fans within gear tend to have 76F thresholds, so going higher may force some fans to run faster, which consumes more power. Mark Monroe, while at Sun, quantified the benefit: for every degree that you raise your data center temperature, you save 4% of your cooling costs. Get to 75 degrees in 2013!
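Applying that 4%-per-degree figure, here is a rough estimate of the savings from moving a cold room up to 75F. Whether the 4% compounds per degree or adds simply is an assumption on my part; treat the result as a ballpark, not a guarantee.

```python
# Estimated cooling-cost reduction from raising the setpoint,
# compounding the ~4%-per-degree rule of thumb quoted above.

def cooling_savings(start_f, target_f, pct_per_degree=0.04):
    """Fraction of cooling cost saved by raising setpoint from start to target."""
    degrees = target_f - start_f
    return 1.0 - (1.0 - pct_per_degree) ** degrees

saved = cooling_savings(start_f=65, target_f=75)
print(f"Estimated cooling savings: {saved:.0%}")
```

A room running at 65F that moves to 75F could see on the order of a third of its cooling cost disappear.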
5. Implementation of DCIM. DCIM has lingered too long in the curiosity phase at most companies. Your data center is begging to have its physical complexion managed over long periods of time. Forget the idea that DCIM is pretty pictures showing detailed racks. DCIM is the business management of your assets over long periods of time. It is the extension of your general ledger and operations workflow systems. DCIM enables complex IT structures to be documented and evaluated for vulnerabilities or inefficiency. DCIM supports your goals to reduce MTTR and reduce human errors. It allows IT material to be quickly put into service and then quickly removed when the business value has decreased. DCIM is an enabling technology for rapid and predictable response to new situations in the data center. Add to it all of the mechanical and electrical components and you are starting to get the BIG picture. One of the not-so-apparent but high-impact results of getting your data center house of cards in order is the ability to identify servers that have been orphaned, which some studies estimate account for 20% or more of all server power drawn in the data center. You’ll also find stranded capacity where power is available but no cooling, or vice versa. Get serious about DCIM in 2013 and take a look at offerings from companies like Nlyte Software, RFcode, FieldView and NoLimits.
6. Increased Virtualization. OK, circa 2004 virtualization was just beginning: early evaluations and pilots. It didn’t break. Circa 2007, virtualization began limited production use in certain applications, with most IT pros still cautious. In 2009 the flood gates opened, every manager ran to figure out how to virtualize, and the hypervisor market got competitive with entries from Citrix and Microsoft joining VMware. Virtualization hosts today commonly run about 6 guests, with the server itself running at 20% capacity. Here’s your 2013 opportunity: more guests, more work. Remember that a server’s power draw starts at roughly 65% of its rated value just sitting idle, and climbs approximately linearly from there as load goes from 0% to 100%. Your biggest opportunity in virtualization is to make those hardware servers WORK. As hard as possible! Going into 2013, your goal is to get the utilization of hosts used to virtualize guests up to 60-70% or greater.
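The idle-power floor is why consolidation pays off so dramatically. Here is a sketch of watts-per-guest under the linear model above; the 500 W rating and the guest counts are illustrative assumptions, not measurements.

```python
# Watts-per-guest comparison under the rule of thumb in the text:
# idle draw ~65% of rated power, rising roughly linearly to 100% at full load.
# Assumptions: a hypothetical 500 W server; guest counts scale with utilization.

def server_power(rated_watts, utilization):
    """Estimated draw at a given utilization (0.0-1.0)."""
    return rated_watts * (0.65 + 0.35 * utilization)

low_density  = server_power(500, 0.20) / 6    # 6 guests at 20% utilization
high_density = server_power(500, 0.70) / 21   # 21 guests at 70% utilization
print(f"{low_density:.0f} W/guest today vs {high_density:.0f} W/guest consolidated")
```

Tripling the guest count costs only about a quarter more power, so watts per unit of work drops by nearly two-thirds.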
7. Selection of Rack-based PDUs. Every rack has one. In fact, most racks have two! These are the vertical power distribution devices that attach to upstream power and provide the required number of outlets in every rack. Typically 24, 30 or 42 outlets, and commonly deployed in redundant power pairs. There are two choices here: intelligent and non-intelligent. Two schools of thought. In the first, the idea is to use the most intelligent PDU you can find, and let it instrument the power consumed by the attached devices, extracting metrics to be fed up to DCIM and monitoring solutions. If this is the direction you plan, consider finding the most POWER efficient PDUs available. I have seen PDUs that consume as little as 1-2 Watts, and others that consume as much as 25 Watts (or more if outlet switching is included)! Remember, if you have two PDUs per rack and say 500 racks, then this power adds up noticeably. 25,000 Watts costs REAL money! The second camp is only considered by users of data centers filled with mostly new equipment, and takes the opposite approach. It eliminates the intelligence altogether in the PDU, relying on the connected equipment for its own embedded instrumentation and power control, and entirely eliminates the PDU power budget in the process. These PDUs don’t even have an Ethernet port on them, no controller, no display. Just the delivery of high-reliability power to each outlet. Consider what makes sense for EACH of your data center(s). One size does NOT have to fit all, and the energy savings can be real. For 2013, regardless of which camp you choose, choose QUALITY PDUs which will last a lifetime running in demanding situations. ServerTech comes to mind as a PDU industry quality benchmark.
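The PDU overhead arithmetic is worth running for your own facility. This sketch uses the 500-rack, two-PDU example from the text; the $0.10/kWh rate is an assumption.

```python
# Annual electricity cost of PDU controller overhead alone.
# Assumptions: $0.10/kWh; per-PDU wattages from the range cited in the text.

def annual_pdu_cost(racks, pdus_per_rack, watts_each, usd_per_kwh=0.10):
    """Yearly cost of the PDUs' own power draw (not the load they feed)."""
    kw = racks * pdus_per_rack * watts_each / 1000.0
    return kw * 24 * 365 * usd_per_kwh

heavy = annual_pdu_cost(500, 2, 25)  # high-overhead intelligent PDUs
light = annual_pdu_cost(500, 2, 2)   # efficient or non-intelligent PDUs
print(f"Difference: ${heavy - light:,.0f}/year")
```

The spread between a 25 W unit and a 2 W unit across 1,000 PDUs runs to roughly $20,000 a year, before the load itself draws a single watt.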
8. Replacement of older IT gear. For years, the door on the data center has been considered ONE-WAY… equipment went IN but never came out. It has been much easier to simply leave gear in place, happily blinking its lights and spinning its disks. All too often we find huge populations of equipment in the data center which is 3-5 years old or older. It has simply been too easy to just let things stay in place, and if it didn’t break, just let it sit. Well, this is BAD, REALLY BAD. The accountants cringe when looking at this type of aged equipment (whether purchased or leased) as it costs BIG dollars to keep it going. All of the TAX benefits have vanished, AND worse yet, the devices typically take TWICE the amount of power to do ONE-TENTH of the amount of work as a modern counterpart. For example, a 4-year-old server may consume 600 Watts and be able to handle 70,000 transactions per second, while a modern replacement server may consume just 300 Watts and process 700,000 transactions in the same application. Move out your old gear! (More toys doesn’t always win!). In 2013, ‘Go Business’ and identify gear that needs to be refreshed for technical or financial reasons. More work in less space. You WILL be a hero!
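Putting the refresh example above into a single efficiency metric makes the case plainly; transactions-per-watt is the number to show the accountants.

```python
# Transactions-per-watt comparison using the example figures in the text.

def tps_per_watt(tps, watts):
    """Work delivered per watt of power consumed."""
    return tps / watts

old = tps_per_watt(70_000, 600)    # the 4-year-old server from the example
new = tps_per_watt(700_000, 300)   # its modern replacement
print(f"Efficiency improvement: {new / old:.0f}x")
```

Ten times the work at half the power works out to a 20x improvement in work per watt, which is the real argument for opening the data center door in the OUT direction.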
9. Implementation of new approaches to data storage management. Standard storage is growing at an unbelievable rate of more than 50% PER YEAR in most organizations. Continuing to deploy more storage for the data created by your applications has become a huge cost center, and the amount of energy required to power all the spinning media is becoming awe-inspiring. New approaches need to be adopted, including the use of solid-state media for certain applications, data de-duplication in most applications, a well-conceived backup-and-restore (BaR) strategy based on user requirements, and a tiered approach to the storage hierarchy with migrations. Essentially you want information to be quickly available IF it is likely to be needed. Information less likely to be needed, or less timely, can be less available. In this context “availability” can be directly related to power consumption: the quicker the availability, the higher the power to maintain it. For 2013, look at your data carefully to optimize HOW each type of information is stored, eliminate the duplication of data (always), and profile your usage types carefully.
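A tiering policy can be as simple as routing data by how recently it was touched. This sketch is illustrative only; the tier names and the age thresholds are my assumptions, not a standard, and real migration policies weigh far more than access age.

```python
# Toy tiering policy: colder data moves to slower, lower-power media.
# Thresholds and tier names are illustrative assumptions.

def storage_tier(days_since_access):
    """Pick a storage tier based on how recently the data was read."""
    if days_since_access <= 7:
        return "ssd"       # hot: fastest access, highest power per GB
    if days_since_access <= 90:
        return "spinning"  # warm: slower, cheaper to keep online
    return "tape"          # cold: slowest, near-zero idle power

print(storage_tier(3), storage_tier(30), storage_tier(365))
```

The point of the exercise is the mapping itself: every step down the hierarchy trades access speed for a real reduction in the power needed to keep the bytes available.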
10. Granular cooling with feedback. The Building Management System (and Building Automation System) was a great start 20+ years ago. These hardwired approaches to keeping rooms cool were born in the years when rooms housed people, not technology, and the ‘climate’ was fairly consistent across the whole space (and only drew ‘complaints’ when it wasn’t). These rooms are now data centers and the number of climates can range into the hundreds. They are now “micro-climates”. Hence, the BIG opportunity when trying to save energy used for data center cooling is to try and match a specified target temperature for each micro-climate. Remember, the only temperature you REALLY care about is that of the air at the INLET of the devices. In 2013, a bunch of new options exist, and data centers can easily be adapted to accommodate them. At the core of these cooling approaches will be a granular sensing capability and then the intelligent interpretation of those sensor points into cooling rules. Most major gear vendors now internally instrument their inlet temperatures and provide this data programmatically. Startups like RFcode exist to externally instrument the data center. Finally, startups like Vigilent have formed to interpret these micro-climate temperature readings and subsequently create and execute sets of commands for cooling devices to change their operation dynamically. For 2013, consider doing anything which allows a highly granular view of micro-climates (the use of LOTS of internal and/or external sensors) and couple it with a system that TAKES ACTION based upon what is seen. Remember that actual energy savings comes from the ACTION part of the story.
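The sense-then-act loop can be reduced to a very simple feedback rule for illustration. This is a deliberately naive sketch; the 75F target echoes the setpoint discussion earlier, the deadband is my assumption, and commercial systems like Vigilent's are far more sophisticated than a single threshold check.

```python
# Simplistic feedback rule acting on the hottest device-inlet reading.
# Assumptions: 75F target, 2F deadband; real controllers model airflow.

def cooling_adjustment(inlet_temps_f, target_f=75.0, deadband_f=2.0):
    """Return a coarse action based on the worst (hottest) inlet sensor."""
    hottest = max(inlet_temps_f)
    if hottest > target_f + deadband_f:
        return "increase_cooling"
    if hottest < target_f - deadband_f:
        return "decrease_cooling"
    return "hold"

print(cooling_adjustment([72.5, 74.0, 78.3]))
```

Note that the rule keys on the hottest inlet, not the average: one starved rack is a failure even if the room mean looks fine, which is exactly why granular sensing matters.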
11. Intelligent lighting has always been misunderstood. While we tend to forget about lighting in the big picture of energy, lighting in a commercial building can account for 2%-7% of the total energy consumed. Obviously lighting does nothing useful when the data center, or any portion of it, is unoccupied. Equipment has no eyes. Imagine the analogy if you eliminated all of the switches on the walls of your house and simply hardwired everything ON. Your entire house and all of its lights ON all the time, 24×7. The data center is just like your house, but without the comfy couch. No one in their right mind would want to pay that bill every month. Data center intelligent lighting systems are now coming of age, and companies like Redwood Systems are showing how low-voltage motion sensing and control, optionally coupled with low-power LED fixtures, can make a huge difference. While these systems can be a bit pricey up front, they are simple to deploy (low-voltage, so labor costs are low) and there is a tremendous opportunity to shave the dollars. Is it right for everyone? Not always, but you owe it to yourself in 2013 to challenge your knowledge on lighting. Start with the data center and then expand to the rest of your structure(s). Remember, rebates exist for energy savings!
12. Power Stepping, Dynamic capacity. Intel and AMD have been talking about their CPU chip power strategies for a DOZEN YEARS NOW, and yet the vast majority of us really don’t know enough about it, and in many cases don’t even enable it. Disappointing, because the x86/x64 CPU chip itself is one of the biggest consumers of power inside a server. Stepping or throttling is a chip- and operating-system-coordinated effort to reduce CPU capacity based upon load. In most cases the O/S and CPU can reduce CPU capacity by 60% or more by reducing frequency and voltage. Keeping in mind that CPU power is calculated by CV^2F (where C is a constant, V is voltage and F is frequency), you can see that reductions in voltage or frequency can yield REAL savings in consumed power. With Xeon CPUs running 130 Watts or more full bore, and many industrial servers including 2 or 4 of these CPUs, it can really add up. Take a look at the throttling and power stepping technologies which are native to your operating system (yes, Linux and Windows). Make sure you have them set properly to take advantage of these features. Then go even farther and look at solutions like Power Assure’s EM4 which drive this optimization farther still. Keep in mind that at 2AM your commercial processing demand will be dramatically different than that which is required at 2PM. It just makes sense to investigate how to dynamically address this huge swing in demand. Think of 2013 as your opportunity to more closely match demand of processing to supply of processing.
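The CV^2F relationship rewards voltage reduction especially, since voltage enters squared. Here is a worked example; the voltage/frequency pairs are illustrative assumptions, not any vendor's published P-states.

```python
# Dynamic CPU power per the P = C * V^2 * F relationship in the text.
# Assumptions: illustrative voltage/frequency pairs, C normalized to 1.

def dynamic_power(c, volts, freq_ghz):
    """Dynamic power given the constant C, core voltage, and frequency."""
    return c * volts**2 * freq_ghz

full    = dynamic_power(c=1.0, volts=1.2, freq_ghz=3.0)  # full-speed state
stepped = dynamic_power(c=1.0, volts=0.9, freq_ghz=1.8)  # throttled state
print(f"Stepped-down power is {stepped / full:.0%} of full power")
```

Dropping voltage 25% and frequency 40% in this example cuts dynamic power by about two-thirds, which is consistent with the "60% or more" reduction mentioned above.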
Now, last but not least, it makes sense to have a different set of discussions when considering NEW data center builds. As a general rule of thumb, the facility and mechanical/electrical for a new data center build-out (Tier 1) will cost $1000-$1200 for each square foot of space, and then another $5000-$6000 per square foot for the active material to fill it. Want a Tier 3? That’s $3000 per square foot for the facility. Now do the math. BIG dollars! A 10,000 square foot Tier 3 might run $30-$40 million in facility, and then another $50-$60 million in gear. That’s $100 million! With investments like these, you have to think it through very carefully and look for ways to optimize. Challenge traditional data center design thinking.
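The build-cost math above is simple enough to sketch; the per-square-foot figures are the rough rules of thumb from the text, so treat the outputs as order-of-magnitude planning numbers only.

```python
# Order-of-magnitude new-build cost using the rule-of-thumb figures above.

def build_cost(sq_ft, facility_per_sqft, gear_per_sqft):
    """Total = facility shell/MEP plus the active IT gear to fill it."""
    return sq_ft * (facility_per_sqft + gear_per_sqft)

# 10,000 sq ft Tier 3 at the low and high ends of the quoted ranges.
low  = build_cost(10_000, 3_000, 5_000)
high = build_cost(10_000, 4_000, 6_000)
print(f"Tier 3 estimate: ${low:,} to ${high:,}")
```

Even at the low end the total is $80 million, which is why every efficiency question in this post is worth asking twice before pouring concrete.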
For example, consider using concrete slab construction with overhead cabling and cooling. There is simply no need for raised floor these days. Racks are too heavy and cooling demands too high. Physics 101: COLD air sinks. Why on earth did we create a raised-floor data center design 25+ years ago that forced physics to work against us? Our traditional and silly approach: create COLD air and push it UNDER the floor, then use enough energy to create enough pressure to overcome physics and force it UPWARD into the room. Makes NO SENSE! I toured a co-lo data center earlier this year where they had all of their cooling HVAC equipment mounted above the data center, and all of the COLD air was simply ‘falling’ downward to the equipment. Amazing the difference in pressure. FANS were dramatically fewer and slower. Energy use was lower. (Remember, FANS account for 47% of your cooling power budget in a traditional data center.)
You can also look at UPS approaches. For UPS systems, a tremendous number of options exist today. Keep in mind that traditional battery-based UPS approaches require BATTERY replacement every few years. Big deal? Millions of dollars in batteries every few years IS a big deal. Not to mention the HAZMAT status of battery rooms, etc. Consider flywheel approaches like those from VYCON, with 20-year lifespans and payback in as little as 3 years.
What about FREE cooling? Not where you live? Think again. Most of us immediately think of our own data center sites, and then places like California, Arizona and Dallas, and say, ‘free cooling doesn’t apply to where my data centers exist’. In fact, most people don’t realize that for 3 seasons a year, nearly everywhere in the USA can provide tons of free cooling, and even in that 4th summer season, many NIGHTS are usually cooler than 70 degrees. In a SKANSKA study, the amount of time where free cooling could be used translated to at least 65% of the year for everywhere in the USA. For 2013 designs, consider a mixed strategic approach to cooling. Like your car: the engine can be cooled at high speeds by forced air, but at lower speeds a fan kicks ON to pull air across the radiator. This is a hybrid energy management approach. For your data center, it’s about blending creative hybrid strategies for cooling. Every piece of savings for cooling is real savings versus traditional always-on approaches.
For 2013, commit to projects that will increase the energy efficiency of your existing data centers. There are lots of suggestions above to consider, and many of these can be done in parallel. If you are considering a new data center next year, begin by challenging your current firms for ideas about high-efficiency designs. Be careful about letting the mechanical and electrical vendors that you currently use be your only guides; there may be built-in conflicts of interest. Talk with an engineering firm that you can trust and in many cases partner with. Ask around and look at recently completed high-efficiency data center designs.