Measuring data center efficiency – a facilities manager’s perspective

Posted on Feb 26, 2012 in Facilities Management, Site Improvement | 2 comments

Even though we struggle with the concept of data center efficiency, there have been great attempts to evaluate this elusive, magical term. For example, I can measure the energy in and the energy rejected and thereby calculate the amount of energy used. I can even tell how much of this energy was used by the data processing equipment specifically. What I cannot tell, however, is the efficiency of the data center – the ratio of how much work was performed to the energy used. I can tell that the data center used 8 megawatt-hours in a month; but since I can’t relate that to the amount of work that was done, I simply can’t measure the data center’s efficiency.

The problem stems from how you measure data center “work.” So what does a data center do? If you ask people, you get a myriad of answers. The CFO’s perspective is that it’s the amount of money (revenue – costs) that you get from the operation of the data center. The CIO’s viewpoint is it’s the amount of data that is received, sent, processed, and stored. The engineer looks at it as the amount of total energy used for IT equipment versus that used for other functions. In a sense, they’re all correct. They each have their responsibilities and perspectives about what the data center is supposed to deliver.

So from different perspectives you have:

  • Monetary or economic efficiency (MBAs should remember this): It’s basically the ability of the money invested to make more money.
  • Data-processing efficiency: This is a rapidly changing concept which includes some of the factors mentioned above. Different people measure it in different ways. Basically, it has been broken down into work units (mostly self-defined) per unit of cost to perform them. One example is the number of processing cycles per cent of cost.
  • Engineering efficiency: This too has taken many forms and is still in flux today. You can look at electrical usage (PUE), water usage, carbon footprint, et cetera.

Each of these metrics has its use, and each could use some improvement. Each metric was designed to put the process into terms that help the developers understand the efficiency. One of the purposes of metrics is to compare data with other similar processes or before and after changes. But the issue I have with metrics is that they need to be focused on the level where the leadership has some ability to adjust the factors affecting the metric. For operations, why measure it if you can’t control it or use it to control other processes?

When I look at the economic efficiency of the data center, it is difficult for the facilities manager to look at the ratio and relate that directly with what they are responsible for. This also goes for many of the data processing efficiency metrics – again, there is no relationship to the responsibilities of the facilities manager. So what do we need to provide to facilities managers to give them a metric they can use to change processes under their control to increase resource-utilization efficiency?

The first thing we should identify is what resources are used by data centers from a facilities perspective:

  • Electrical power
  • Water
  • Natural Gas
  • Diesel Fuel
  • Solar
  • Wind
  • Geothermal
  • Other Fuels (waste oil, wood, et cetera)

But wait! There’s more…

  • Manpower
  • OPEX
  • CAPEX (Depreciation)

…and probably more

Imagine if a facilities manager could go to their computer each morning and see the efficiency ratio of each of these resources in real time and averaged over a chosen period. They could have goals tied to these ratios and could vary operations to make the best use of each resource as needed. They would know where to concentrate their efforts and priorities. What a wonderful world it would be.

Okay, back to reality. So what can we give the facilities managers that would help them?

Most data centers use massive amounts of electrical power provided by someone else such as a power company or a government entity. This massive usage has focused attention on the development of a metric that can be recognized as a standard. One metric that has received a lot of study and scrutiny in recent years is The Green Grid’s PUE or Power Usage Effectiveness metric. From the standpoint of usefulness to the facilities manager, this metric works well.

Metric #1: PUE = Total Facility Power/IT Equipment Power

There are systems today that are capable of measuring, calculating, and displaying the PUE metric in real time and averaged over any selectable time. This is a great help to the facilities manager, who can make operational adjustments to supporting systems to decrease this ratio (the closer to 1.0, the better). Every time the facilities staff finds a way to reduce the amount of power that supporting equipment uses (without reducing IT loads), the PUE goes down. Doing things like running two chillers at 90 percent versus three at 60 percent could make significant changes in the PUE.
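
As a minimal sketch of Metric #1, the snippet below computes a live PUE from the latest meter reading and an averaged PUE over a window of readings. The function names, the kilowatt units, and the sample readings are all hypothetical and chosen only for illustration; a real monitoring system would supply its own data feed.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """PUE = Total Facility Power / IT Equipment Power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def average_pue(samples: list[tuple[float, float]]) -> float:
    """Averaged PUE over a window of (total_kw, it_kw) samples.

    Assumes evenly spaced samples, so summing power readings is
    proportional to energy over the window.
    """
    total_kw = sum(total for total, _ in samples)
    it_kw = sum(it for _, it in samples)
    return pue(total_kw, it_kw)


# Hypothetical readings: (total facility kW, IT equipment kW)
readings = [(1500.0, 1000.0), (1600.0, 1000.0), (1400.0, 1000.0)]

live = pue(*readings[-1])         # latest reading: 1400 / 1000 = 1.4
averaged = average_pue(readings)  # 4500 / 3000 = 1.5
print(f"live PUE = {live:.2f}, averaged PUE = {averaged:.2f}")
```

The averaged value here divides summed facility power by summed IT power rather than averaging the individual ratios, so periods of heavier load carry more weight. That is one reasonable choice, not the only one.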

More and more, data centers are going to evaporative cooling to remove the waste heat from the facility. In most cases, this is using the process of evaporating water to cool the building. This process can use tremendous amounts of water, which is probably not a huge issue in Seattle where it falls out of the sky ten months of the year; but if you’re in Phoenix, water is a valuable resource to be conserved and it is a major concern.

I looked at The Green Grid’s Water Usage Effectiveness (WUE), but it uses annual water usage as a factor, which is not very “real time” when you’re trying to monitor and work with it. So here is what I’m recommending:

Metric #2: Water System Efficiency (WSE) = Actual water usage/Design water usage

When the system is working at maximum efficiency, the WSE should be 1.0, but this can be affected by many factors. Simply put:

Design water usage = Evaporated water + Draw-off water + Windage (drift) losses

Each of these can be calculated for the design of the tower using temperature inputs, flow rates, relative humidity, and the required cycles of concentration. With an engineering study, you can develop a curve of maximum efficiency that can be programmed into the monitoring system and compared to actual water usage to calculate the WSE.

With a metric like this, it would be a very easy thing to see that the drain valve was stuck open or that the cycles were improperly set or other issues causing excessive water loss. Theoretical models could be run to determine if using a water softener system would be a cost-saving option.
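
The WSE calculation and the kind of check described above can be sketched as follows. This is only an illustration: the function names, the flow figures, and the 10 percent tolerance are assumptions, and in practice the design values would come from the engineering study's curve for the current operating conditions.

```python
def design_water_usage(evaporation: float, draw_off: float, drift: float) -> float:
    """Design usage = evaporated water + draw-off water + windage (drift) losses."""
    return evaporation + draw_off + drift


def wse(actual_usage: float, design_usage: float) -> float:
    """Water System Efficiency = actual water usage / design water usage."""
    if design_usage <= 0:
        raise ValueError("Design water usage must be positive")
    return actual_usage / design_usage


def needs_attention(wse_value: float, tolerance: float = 0.10) -> bool:
    """Flag usage more than `tolerance` above design (hypothetical threshold)."""
    return wse_value > 1.0 + tolerance


# Hypothetical figures, all in the same flow unit (for example, gallons per minute)
design = design_water_usage(evaporation=100.0, draw_off=25.0, drift=1.0)  # 126.0
ratio = wse(actual_usage=151.2, design_usage=design)                      # about 1.2

if needs_attention(ratio):
    print(f"WSE = {ratio:.2f}: check drain valve and cycle settings")
```

A WSE well above 1.0, as in this made-up case, is the kind of reading that would point to a stuck drain valve or improperly set cycles.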

The rest of the metrics for fuels and energy sources can be developed using the idea that a real-time indication of efficiency is usable by the facilities team to modify parameters to maximize efficiency and reduce costs.

Some efficiencies cannot be calculated in real time. When you are dealing with manpower, CAPEX, or OPEX the best periodicity may be monthly, which is what I would recommend.

The manpower efficiency (ME) ratio might look like this:

Metric #3: ME = Man-hours actual for planned activities/Man-hours planned

Where,

Man-hours (mh) planned = mh(on-watch) + mh(maintenance) + mh(projects) + mh(training) + mh(required administrative tasks)

One type of hours not accounted for is unplanned man-hours for repairs. Planned equipment replacement hours are included under projects or maintenance. Unplanned repair hours take time away from planned activities, so the actual hours spent on planned activities fall short of the plan and ME drops below 1.0. An ME of less than 1.0 means that some hours are not being spent on planned activities; an ME greater than 1.0 means that you are probably working overtime to “make ends meet,” in which case you probably need additional staffing.
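
A monthly ME calculation along these lines might look like the sketch below. The function names and the hour figures are hypothetical; only the two formulas come from the definitions above.

```python
def planned_man_hours(on_watch: float, maintenance: float, projects: float,
                      training: float, admin: float) -> float:
    """mh planned = on-watch + maintenance + projects + training + required admin."""
    return on_watch + maintenance + projects + training + admin


def manpower_efficiency(actual_on_planned: float, planned: float) -> float:
    """ME = man-hours actually spent on planned activities / man-hours planned."""
    if planned <= 0:
        raise ValueError("Planned man-hours must be positive")
    return actual_on_planned / planned


# Hypothetical month for a small crew
planned = planned_man_hours(on_watch=480.0, maintenance=120.0, projects=40.0,
                            training=20.0, admin=20.0)          # 680.0 hours
me = manpower_efficiency(actual_on_planned=646.0, planned=planned)  # 0.95

print(f"ME = {me:.2f}")
```

In this made-up month, 34 planned hours were displaced (for instance, by unplanned repairs), so ME comes out below 1.0.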

For facilities managers, I have discussed and outlined some metrics that can be used to improve overall data center performance. As you can see, for data centers to become more efficient in their operations, metrics need to be developed for those who can make a difference and so they can observe the results of their changes and discover when opportunities exist. I don’t believe that there is any one metric that can be developed that can be used by everyone. We simply have too many different perspectives driven by our different responsibilities and, in actuality, that’s a good thing. Data centers are complex, and different perspectives give us a better chance to operate them at their maximum efficiency.

2 Comments

  1. Good morning Terry,

    Can you please give us more information or examples about the ME metric?

    Thanks.

    • Hi Atallah,
      Thanks for your question. Basically, any metric that measures efficiency compares the amount of a resource you use to accomplish what you intend to do with how much of that same resource you actually had to pay for. An example would be something like this:

      Work done/Manpower expended = Manpower efficiency.

      We pay for an individual to be at work from 8 am to 4 pm (8 hours’ pay), but we don’t get 8 hours of work. Time is used for breaks, lunch, training, meetings, paperwork, and other administrative tasks. Even if we just take out the breaks – 15 minutes for a morning break, 30 minutes for lunch, and 15 minutes for an afternoon break – you can see that we are paying for 8 hours and only getting 7 hours of work out of a person (in other words, 7/8, or 87.5 percent efficient). That’s if we are looking for work efficiency (WE).

      I define manpower efficiency (ME) in terms of the planned activities for an individual versus what they actually have to do. Many of the planned activities in mission-critical environments are not work, per se; but they are required nevertheless. Training produces no “work” but is required, and the same may be said about “making rounds” or completing documentation; however, these are necessary, planned activities in our business. The individual is doing what they are paid to do. What we don’t want to see is that they need to work on repairs of a non-planned nature (which means that we aren’t maintaining the plant), need to work overtime (poor planning and use of manpower), or need to do other activities that are not beneficial to the organization (sleeping, playing games, surfing the Internet, etc.).

      I hope this provides a little more clarity about manpower efficiency.

      Terry
