Enhancement of secure 24/7 operation in data centres
Fig. 1: Source: Dr. Jan Meyer, TU Dresden
Need for power quality monitoring due to mains pollution caused by non-linear consumers and decentralised feed-ins
The volume of digital data continues to grow rapidly. Among the main drivers are cryptocurrencies such as Bitcoin (XBT), Ether (ETH) and Litecoin (LTC), and blockchain technology in general, which is to a certain extent a type of database that runs on several networked servers. The growing demand for data exchange is also driving growth in the number of data centres, which are being planned, built and maintained at massive scale around the globe. However, data centres face complex challenges in the supply of electrical energy, and these can affect secure (and legally compliant) 24/7 operation.
Various studies have shown that mains power quality problems generate costs that run into the billions each year. As early as 2007, the Pan-European LPQI Power Quality Survey estimated the damage at 150 billion euros annually. Since then, the challenges have continued to grow for everyone, and especially for data centres. The reasons are as follows:
(1) Mains power noise:
- A strong increase in the number of non-linear consumers (LED lighting, computers, chargers, frequency converters, etc.) that generate harmonics
- Increase in the number of feed-in power sources (for example wind power and PV systems), which lead to instabilities in the voltage levels
(2) Effect of noise:
- The more recent types of equipment (for example servers, control systems, lifts, fire alarm systems, etc.) are more sensitive to noise and prone to failure
- Entire components can be destroyed
- Operational downtimes are financially very costly
- Helplessness when systems fail, because the causes are often undetectable
- The measurement instruments used may be unable to detect disturbances well enough
- No data is logged because the measurement instruments are also disrupted
- In the final instance, experienced, expert and expensive personnel are needed to perform a cause analysis
As described above, power quality problems can cause disturbances and system failures, which inevitably result in additional cost, time and effort. Power supply noise represents a non-negligible risk (see figure 2), especially for data centres, which are equipped with capital-intensive redundancy (for example UPS systems, generators and multiple power sources) that would normally be expected to prevent any downtime or damage. Ideally, all installed equipment complies with the harmonic distortion and noise immunity standards, so it can be expected to operate trouble-free. However, in an unfavourable operating context, for example a large number of similar consumers or asymmetrical network loading, noise levels can still become significantly excessive. Power quality monitoring is therefore essential in order to estimate the risks and limit them on a permanent basis. Depending on the structure and topology of the data centre, it can make sense to monitor several different points in the energy supply:
- At the feed-in point of the mains power supplier, known as the PCC (point of common coupling)
- In all the protected supply areas
- At the feed-in point of backup power supply systems
In addition to the analysis data shown in figure 3, power quality data measurements make it possible to detect existing or developing problems sufficiently early, before they can lead to damage.
To assess conformity, the recorded statistics are compared with the limit values defined in standards. For data centres these are:
- EN 50160 (voltage characteristics in public distribution systems), which normally serves as the basis for the contract with the energy supplier
- IEC 61000-2-4 (compatibility levels in industrial plants), particularly Class 1 (protected supplies)
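As a rough illustration of how recorded statistics might be compared with limit values, the sketch below checks one week of 10-minute mean RMS voltages against the slow-voltage-variation limits commonly cited from EN 50160 for low-voltage networks (95% of values within ±10% of nominal, all values within +10%/-15%). The function name, the 230 V nominal voltage and the sample data are assumptions made for illustration; the limits that actually apply should be taken from the relevant edition of the standard and from the supply contract.

```python
from typing import Sequence


def check_slow_voltage_variation(
    ten_min_means: Sequence[float],
    nominal: float = 230.0,  # assumed nominal voltage in volts
) -> dict:
    """Compare 10-minute mean RMS voltages with assumed EN 50160-style limits."""
    if not ten_min_means:
        raise ValueError("no measurements supplied")

    upper = 1.10 * nominal      # assumed upper limit for all values
    lower_95 = 0.90 * nominal   # assumed lower limit for 95% of values
    lower_100 = 0.85 * nominal  # assumed lower limit for all values

    within_band = sum(lower_95 <= u <= upper for u in ten_min_means)
    share = within_band / len(ten_min_means)
    all_within = all(lower_100 <= u <= upper for u in ten_min_means)

    return {
        "share_within_10_percent": share,
        "compliant": share >= 0.95 and all_within,
    }


# One week holds 7 * 24 * 6 = 1008 ten-minute intervals.
week = [230.0] * 1000 + [205.0] * 8  # eight intervals slightly below -10%
result = check_slow_voltage_variation(week)
print(result)
```

A real power quality monitor performs this aggregation internally; the sketch only shows the kind of weekly statistic that a conformity report is built from.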
The above-mentioned standards specify how the mains power should behave at the monitoring point in normal operation. Exceptional situations which could temporarily restrict the energy supply are not covered. Disturbances of this type, such as voltage dips or interruptions, must be recorded, but there is no specified limit to their number for compliance with the standard (see figure 4). The function of UPS or backup power supply systems is to compensate for any such power supply constraints. However, these backup solutions are limited to the most important resources, so other components may operate with only reduced functionality. For this reason, it is critical that operating personnel are informed quickly when a voltage event as defined in IEC 61000-4-30 occurs. This can be implemented, for example, by an automatic email notification sent to the responsible personnel.
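The automatic notification described above could look roughly like the following sketch. It classifies an RMS voltage reading as a dip, swell or interruption and prepares an email for the operating personnel. The thresholds (90%, 110% and 5% of the reference voltage), the addresses, the location label and all function names are assumptions for illustration; a real monitor would use its own configured thresholds and its own notification interface.

```python
import smtplib
from email.message import EmailMessage
from typing import Optional

# Assumed thresholds as fractions of the reference voltage.
DIP_THRESHOLD = 0.90
SWELL_THRESHOLD = 1.10
INTERRUPTION_THRESHOLD = 0.05


def classify_event(rms_voltage: float, reference: float = 230.0) -> Optional[str]:
    """Return the event type for an RMS reading, or None if it is within limits."""
    ratio = rms_voltage / reference
    if ratio < INTERRUPTION_THRESHOLD:
        return "interruption"
    if ratio < DIP_THRESHOLD:
        return "dip"
    if ratio > SWELL_THRESHOLD:
        return "swell"
    return None


def build_notification(event: str, rms_voltage: float,
                       location: str, recipient: str) -> EmailMessage:
    """Prepare the email; the sender address is a placeholder."""
    msg = EmailMessage()
    msg["Subject"] = f"Power quality event: {event} at {location}"
    msg["From"] = "pq-monitor@example.com"
    msg["To"] = recipient
    msg.set_content(
        f"A voltage {event} was detected at {location}. "
        f"Measured RMS voltage: {rms_voltage:.1f} V."
    )
    return msg


def send_notification(msg: EmailMessage, host: str = "localhost", port: int = 25) -> None:
    """Send the message via an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


event = classify_event(180.0)
if event is not None:
    message = build_notification(event, 180.0, "PCC feed-in", "operations@example.com")
    print(message["Subject"])
    # send_notification(message)  # requires a reachable SMTP server
```

Building the message separately from sending it keeps the classification logic testable without a mail server.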
Fig. 3: Source: own diagram
Schematic data structure of a power quality monitor with statistical analysis
It is worthwhile to use a standardized format to exchange power quality data, for example PQDIF (Power Quality Data Interchange Format), which is IEEE 1159.3 compliant.
As a result, the range of available software for analysing the power quality data is not limited to proprietary systems from manufacturers.
A further aspect: RCM
To prevent operations being interrupted unexpectedly, data centres avoid using residual current devices (RCDs) with direct tripping. Instead, it is compulsory to monitor residual currents on a permanent basis (see figure 5). For this purpose, residual current monitoring (RCM) is used, which, in addition to its essential function of protecting people, also protects the systems from damage and aids fire prevention. Furthermore, changes in residual currents can provide an early warning of any deterioration in the insulation and allow corrective measures to be taken. Faults in the TN-S system (for example impermissible or additional PE-N connections) can also be detected early and corrected as a result.
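The early-warning idea can be sketched as a simple trend check on logged residual currents: an alarm when the latest value exceeds a limit, and a warning when the values are rising steadily even though the limit has not yet been reached. The alarm limit, the slope limit, the daily sampling interval and the function name are hypothetical placeholders, not values from any standard or device.

```python
from typing import Sequence, Tuple


def residual_current_status(
    daily_means_ma: Sequence[float],
    alarm_ma: float = 30.0,               # hypothetical alarm limit in mA
    warn_slope_ma_per_day: float = 0.5,   # hypothetical trend limit in mA/day
) -> Tuple[str, float]:
    """Return (status, slope) for daily mean residual currents, oldest first."""
    n = len(daily_means_ma)
    if n < 2:
        raise ValueError("at least two samples are needed to estimate a trend")

    # Least-squares slope of current over day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_means_ma) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_means_ma))
    slope = sxy / sxx

    if daily_means_ma[-1] >= alarm_ma:
        return "alarm", slope
    if slope >= warn_slope_ma_per_day:
        return "warning", slope
    return "ok", slope


print(residual_current_status([10.0, 11.0, 12.0, 13.0, 14.0]))
```

A rising slope well below the alarm limit is exactly the situation in which corrective measures can still be planned rather than forced.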
Correct measurement data through metrological traceability
An old locksmith's saying goes: "centimetres are a clockmaker's measurement". Put another way, "he who measures, measures garbage". Technicians and economists take heed of this well-known and still useful warning and make sure they use the right method for each type of measurement. However, although the required specifications of a power quality device are precisely defined in terms of measurement methodology (IEC 61000-4-30), device characteristics (IEC 62586-1) and validation of compliance with the standards (IEC 62586-2), there are still differences between manufacturers. Suppliers are often unable to prove that their analysis device meets the specifications and measures correctly. Proof of a truly correct measurement can only be obtained from an independent certification lab, ideally a metrological institute. Non-certified test labs, or even the manufacturer's own statements, cannot substitute for a metrological certificate and should therefore be viewed with a critical eye. This is especially true for sensitive facilities such as data centres, which are associated with high costs and risks.
Fig. 4: Source: own diagram
Voltage events do not impact the compliance assessment
For example, Camille Bauer Metrawatt AG commissioned METAS (the Federal Institute of Metrology) in Switzerland to perform an independent certification.
The institute cannot provide the measurement standard for every recognised unit of measurement; instead, it traces its own measurement equipment verifiably back to the SI base units. This ensures that the measurement data cannot be called into question at any time.
Fig. 5: Source: own diagram
Residual current measurement for personal safety, protection of equipment and fire prevention. Monitoring the TN-S system (for example current in N-PE bridge or at the central earthing point).
The benefits of metrologically-certified power quality monitoring
The main benefit of professional, permanent power quality monitoring is the increased operational availability of data centres. Power quality is defined here as a key component of supply quality (see figure 6) and is naturally also relevant to many sensitive areas other than data centres (for example hospitals, sensitive industrial sites, transport infrastructure such as airports, and publicly accessible building complexes such as shopping centres). The benefit is obtained by analysing the recorded long-term data to observe changes and identify correlations. Compliance with the contractual supply guidelines is just one of the important aspects. Additional relevant information can be obtained from the following analyses:
- Comparison of normal operation with UPS or emergency power operation
- Evaluation of harmonics and their effect on the equipment
- Evaluation of the changes in network quality over a longer period
- Changes in network quality after modifications to the installation
- Changes in network quality after switching equipment on or off
- Evaluation of voltage events according to duration and residual voltage (ITIC curve) and their effect on the service life of the equipment
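The last item in the list, evaluating voltage events by duration and residual voltage, can be illustrated with a simplified tolerance envelope of the kind usually associated with the ITIC (formerly CBEMA) curve. The breakpoints below are an approximation of the commonly published lower envelope for dips and are stated as assumptions; the upper (swell) envelope is omitted, and the original curve should be consulted for any real assessment. The function name is hypothetical.

```python
# (maximum duration in seconds, minimum tolerated residual voltage in % of nominal)
# Approximate lower envelope; values are assumptions for illustration.
ASSUMED_DIP_ENVELOPE = [
    (0.02, 0.0),   # complete loss of voltage tolerated for up to 20 ms
    (0.5, 70.0),   # dips down to 70% tolerated for up to 0.5 s
    (10.0, 80.0),  # dips down to 80% tolerated for up to 10 s
]
STEADY_STATE_MIN = 90.0  # assumed lower steady-state limit in % of nominal


def dip_within_envelope(duration_s: float, residual_percent: float) -> bool:
    """Return True if a dip lies inside the assumed tolerance envelope."""
    for max_duration, min_residual in ASSUMED_DIP_ENVELOPE:
        if duration_s <= max_duration:
            return residual_percent >= min_residual
    return residual_percent >= STEADY_STATE_MIN


events = [(0.01, 0.0), (0.1, 60.0), (5.0, 85.0)]
for duration, residual in events:
    print(duration, residual, dip_within_envelope(duration, residual))
```

Counting how many recorded events fall outside such an envelope over months or years gives one indication of the stress to which the equipment has been exposed.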
Another specific benefit comes from permanent RCM. If residual current is monitored continuously and correctly, it may be feasible to eliminate the periodic manual testing of the insulation resistance. This removes the need to switch the installation off during testing (thus increasing availability) and saves the considerable time, effort, cost and personnel that the test requires.
Drawing the correct conclusions from metrologically certified power quality monitoring, including RCM, results in the durable protection of investments, reduced operating costs, maximum data availability and, very importantly, the satisfaction of all stakeholders. These include customers, employees, energy suppliers, operators, investors, service contractors, authorities and associations. Lastly, it helps to lower CO2 emissions because it makes it possible to operate the data centre more efficiently and securely.
When we observe the development of global data volumes, we can see that the challenges for planners and operators are likely to become even greater in future. In China alone, the 500,000+ existing data centres are forecast to increase to 1,000,000 by 2023 (an increase of 21% per year). When considering all the aspects of power quality, we have to focus more and more on the question of how this growing energy demand can be regulated in terms of PUE (Power Usage Effectiveness), because the energy infrastructure and the required building floor space may reach their limits. According to the American scientist Jonathan Koomey, data centres already account for approximately 1.1 to 1.5% of worldwide energy consumption. In the region around Frankfurt alone, data centres represent 20% of the total energy consumption.
"Server performance versus electrical power": what specific steps can be taken to improve PUE? With this in mind, the recommendation is to monitor energy usage per unit of data, assign it to the point of service in the installation, and then transfer this information directly into the billing model of the energy supply as well as into the customer side of the data exchange.
In this way, the price of the energy is defined in a real "data consumption model" and potentially heightens the awareness of suppliers of data and their users. Data might then be used more economically, because of the actual energy costs generated. On the technical side, an integrated energy monitor could be envisaged, based on a reproducible reference model (measurement standard definition) which monitors individual servers, racks, or similar data processing equipment and measures the actual energy per data unit at each point of service and then transfers valid data for invoicing.
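To make the idea of a "data consumption model" more concrete, the following sketch computes PUE in its usual form (total facility energy divided by IT equipment energy) and derives an energy figure per unit of data at a point of service. The unit (kWh per gigabyte), the allocation of facility overhead via the PUE factor, the function names and the sample figures are all assumptions for illustration; the text above does not prescribe a specific calculation.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def energy_per_data_unit(it_kwh: float, data_gb: float, pue_value: float) -> float:
    """Energy in kWh per GB at a point of service, including facility overhead.

    Assumption: overhead (cooling, UPS losses, etc.) is allocated in proportion
    to the IT energy by multiplying with the PUE.
    """
    if data_gb <= 0:
        raise ValueError("data volume must be positive")
    return it_kwh * pue_value / data_gb


site_pue = pue(total_facility_kwh=1500.0, it_equipment_kwh=1000.0)
rack_kwh_per_gb = energy_per_data_unit(it_kwh=12.0, data_gb=4000.0, pue_value=site_pue)
print(site_pue, rack_kwh_per_gb)
```

For billing purposes, the inputs to such a calculation would have to come from measurements that are traceable to a defined reference, which is the point made about the measurement standard definition.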
A further aspect to be considered is whether continuous and qualified monitoring of the mains power could also be used to prevent cyber attacks on the energy supply of data centres or other sensitive facilities. This would add a layer of redundancy to the existing monitoring systems, which are implemented as software solutions and are subject to rapid change. The purpose is to research whether a connection can be found between changes in power quality and cyber attacks on a data centre's servers and infrastructure, so that such attacks can be fended off early.
In both cases, that is, server performance versus electrical power (PUE) and network quality analysis as an extra defence against cyber attacks, the reference values (definition of the measurement standard) will be critical.
Fig. 6: Source: own diagram
Formula for electrical supply quality
When planning the energy supply of a data centre, many requirements have to be taken into account:
- Secure site location in terms of energy supply and environmental conditions
- High energy efficiency in order to minimize operating costs
- Maximum availability through the use of redundancies (UPS, generators)
- Highly secure (fire protection, access, defence against cyber attacks)
- System stability and reliability of the equipment used
- Possibility for later expansion
Power quality characteristics compliant with IEC 61000-4-30, Parts 5.1 - 5.12, Class A:
"In the context of system stability and reliability"
- Power frequency
- Magnitude of the supply voltage
- Supply voltage dips and swells
- Voltage interruptions
- Supply voltage unbalance
- Transient voltages
- Voltage harmonics
- Voltage interharmonics
- Mains signalling voltage
- Rapid voltage changes (RVC)
- Underdeviation and overdeviation
- Current (level, harmonics, interharmonics)
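As an example of how one of the characteristics listed above is derived from measured values, the sketch below computes the total harmonic distortion (THD) of the voltage from RMS harmonic magnitudes, using the common definition: the root of the sum of squared harmonic voltages from order 2 upwards, divided by the fundamental. The upper order of 40, the function name and the sample values are assumptions for illustration.

```python
import math
from typing import Dict


def voltage_thd_percent(harmonics_rms: Dict[int, float], max_order: int = 40) -> float:
    """THD in percent from RMS harmonic voltages keyed by harmonic order.

    Order 1 is the fundamental; orders above max_order are ignored (assumption).
    """
    fundamental = harmonics_rms.get(1, 0.0)
    if fundamental <= 0:
        raise ValueError("fundamental (order 1) must be present and positive")
    squared_sum = sum(
        u ** 2 for order, u in harmonics_rms.items() if 2 <= order <= max_order
    )
    return 100.0 * math.sqrt(squared_sum) / fundamental


# Fundamental 230 V with a 3% third and a 4% fifth harmonic.
sample = {1: 230.0, 3: 6.9, 5: 9.2}
print(round(voltage_thd_percent(sample), 2))
```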
More information available from www.camillebauer.com