Data-Driven Framework to Find the Physical Association between AHU and VAV Terminal Unit – Pilot Study
June Young Park
Bertrand Lasternas
Azizan Aziz
Student Member ASHRAE
ABSTRACT
With improvements in sensor technology, sensor networks, and data storage, building automation now incorporates a large number of data points into every element of building systems. The varying naming conventions and schemas that different companies and field engineers assign to these elements make it difficult to identify relationships between building systems. To address this issue, we developed a framework that uses Building Automation and Control Network (BACnet) data points to establish the physical relationships between building elements. Specifically, this research investigated the relationship between air handling units (AHU) and variable air volume (VAV) terminal units. The framework consists of two methods. First, filtering and Random Forest classification identify the semantic information of data points. This method classifies supply air duct pressure (SADP) data points in AHU and damper position (DP) data points in VAV terminal units with 94.9 percent accuracy. The second method calculates the absolute cross-correlations between the classified data points and then infers the association from the cross-correlation results of nine monthly profiles. This automated association method achieves 79.9 percent accuracy. The suggested framework will help users find AHU-VAV relationships, which can be difficult to identify because of the large number of heterogeneous sensors and sensor networks and the inconsistent sensor nomenclature in modern buildings.
INTRODUCTION
Due to the improvement of technology, the building industry has created significant opportunities for energy savings (Salsbury 2005). In particular, the transition from pneumatic to direct digital control in the 1970s made it possible to manage thousands of data points for building operations (Newman et al. 1994). These data points come from a variety of sources (e.g., sensors, actuators, meters, or control parameters). In recent decades, building researchers have tried to utilize these abundant data. However, the data points first have to be mapped to their meaning and relationships. For example, to properly analyze a data point of a VAV terminal unit, we should first recognize where and what it measures (e.g., zone temperature) and which AHU is physically associated with it. Two main characteristics of buildings make this mapping process difficult. The first is the sheer size of the modern building. For example, the Gates Hillman Center (GHC) building on the Carnegie Mellon University campus in Pittsburgh, Pennsylvania has approximately 10,000 BACnet data points. In the GHC building, six AHU supply air to 283 VAV terminal units. Without information from the equipment, users must manually infer the meaning and relationships of the data points. The second is the lack of interoperability of building automation system (BAS) equipment (Piper 2000). Different field engineers may assign different point names when they install or repair this equipment. For instance, the same VAV discharge temperature data point is often described in two different ways (e.g., “vav room 3000 class room da temp” or “VAV-1 DAT”). This heterogeneity is a major impediment to finding a generalized naming rule (Bhattacharya et al. 2015). Also, unorganized drawings (e.g., architectural, mechanical, and/or electrical) create a barrier to understanding the relationship between building systems (Korman et al. 2001). The National Institute of Standards and Technology found that the lack of building interoperability standards wastes $15.8 billion annually in the U.S. (NIST 2004).
Because of these unique building characteristics, several standards have been created. The American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) developed BACnet, a data communication protocol for buildings (ASHRAE 2004). Recently, a group from the building automation industry established Project Haystack, an open-source initiative to standardize semantics for building data (Project Haystack 2014). In addition, researchers have tried to use metadata (a set of information that describes the detailed meaning of each data point) to map building data points. By observing the sequence of characters in BAS vendor-given point names, Bhattacharya et al. (2014) developed a substring extraction language for metadata transformation. However, their algorithm requires manual input and has interoperability issues stemming from different vendors’ naming rules. Diverging from this text-mining approach, Gao et al. (2015) considered numerical sensor-reading data to construct the metadata with Project Haystack tags. Lastly, Balaji et al. (2015) and Hong et al. (2015) tried to combine both point names and sensor readings to build metadata structures. However, they found that the two kinds of input suffer from the need for manual input and from low accuracy, respectively. To determine relationships between building systems, Pritoni et al. (2015) used a perturbation technique to infer the association between AHU and VAV terminal units. While useful in some cases, this method is not applicable in many buildings because it requires artificial perturbation of the AHU supply temperature.
June Young Park is a Ph.D. student at University of Texas at Austin, Austin, Texas. Bertrand Lasternas is a research scientist and Azizan Aziz is a research assistant professor at Carnegie Mellon University, Pittsburgh, Pennsylvania.
Notably, both text and numeric data are considered in the frameworks proposed in all these studies. For the text data, every building has its own naming rules. Instead of the point name itself, we therefore use the information from the BACnet protocol, which is more generalized and is the leading protocol in the BAS industry (Piper 2000). The numerical reading data is also used to understand the behavior of mechanical equipment, which can be a clue for inferring relationships. Currently, the AHU-VAV terminal unit system comprises the primary portion of commercial building systems in the U.S. (ASHRAE 2013). Thus, we studied the physical association between the AHU and VAV terminal units in the GHC building to establish proof of concept.
Figure 1 illustrates the procedure of the framework. After the data points are extracted from the building, the first step filters them according to their BACnet properties. Once the number of data points has been reduced, daily statistical features are extracted from the numerical readings of the filtered data points. By passing these statistical features to the classification model, the user can identify the SADP-AHU and DP-VAV data points. In the third step, the framework calculates the cross-correlation between these two classified signals. Finally, the framework automatically returns the physical association between AHU and VAV terminal units.
Figure 1 The framework procedure
METHODS
Pilot study building description
The test bed for this framework, the GHC building, has a gross area of 217,000 square feet and contains 310 offices, 32 laboratories, and 11 conference rooms. The six AHU are located on the third floor (AHU3/7/8) and on the roof (AHU9/10/11). The 283 VAV terminal units serve the third through ninth floors. For the building control, the public spaces require non-stop operation, while the classroom units have fixed schedules. The office units, which comprise the largest portion of the building, are controlled based on occupancy, and each occupant can change the temperature setpoint. With the variable frequency drives, the fans control airflow in the AHU, and the VAV terminal units control both the airflow rate and the temperature by damper and reheat. Essentially, DP-VAV is transiently negatively correlated with SADP-AHU. For example, if the damper opens more to supply more air into the zone, the duct pressure will decrease and cause fan flow to increase. We developed the framework primarily by considering the characteristics of the GHC building, and we examined the mechanical drawings to determine the ground truth of the physical associations between its six AHU and 283 VAV terminal units.
Data point type identification
The aim of data point type identification is to find the needed data points (i.e., SADP-AHU and DP-VAV) among the 10,000 data points. Each BACnet data point has eight required properties (ASHRAE 2004), so a user can ascertain these eight properties for every data point. Table 1 shows the eight properties of an example data point. For the “object type” property, the inputs (analog/binary input) are data coming into the control panel from sensors and switches; the outputs (analog/binary output) are data going out from the control panel to control actuators; and the control parameters (analog/binary value) are values that are pre-assigned or calculated in the control panel. A user can filter data points by any combination of the eight BACnet properties. For example, the pressure sensor data is collected by filtering on the object type (analog input) and the units (inch WC).
Table 1. The Required Properties for BACnet data point
| Property | Example Usage |
| Object Identifier | Analog Input #1 |
| Object Name | AI 01 |
| Object Type | Analog Input |
| Present Value | 1.5 |
| Status Flag | In Alarm |
| Event State | Normal |
| Out of Service | False |
| Units | Inch WC |
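The property filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the dictionary keys, the `filter_points` helper, and the example point names are hypothetical, and a real deployment would read the properties from the BACnet network or a historian rather than from a literal list.

```python
def filter_points(points, **required):
    """Keep the points whose properties equal every required value."""
    return [
        p for p in points
        if all(p.get(key) == value for key, value in required.items())
    ]


# Hypothetical data points; only the properties used for filtering are shown.
points = [
    {"object_name": "AHU7 point A", "object_type": "Analog Input", "units": "Inch WC"},
    {"object_name": "AHU7 point B", "object_type": "Analog Input", "units": "Inch WC"},
    {"object_name": "AHU7 point C", "object_type": "Analog Input", "units": "Degrees F"},
    {"object_name": "VAV point D", "object_type": "Analog Value", "units": "Percent"},
]

# Pressure sensor candidates: analog inputs measured in inch WC.
pressure_candidates = filter_points(
    points, object_type="Analog Input", units="Inch WC"
)
print([p["object_name"] for p in pressure_candidates])
```

Two points survive the filter even though they may measure different pressures, which is why a further classification step is needed to tell them apart.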
Even though users can reduce the number of data points by filtering by property, some data points can have identical properties. For example, within a single AHU, there are four different pressure sensors that provide analog inputs and are measured in inch WC. To differentiate these sensors from one another, users should deploy a data-driven classification model. Feature extraction and model selection are the primary parts of the data-driven model. Feature extraction means selecting informative inputs for the model, and model selection involves choosing an adequate model for classifying the desired output from the chosen features. The hypothesis of this classification is that every data point has unique signal characteristics and patterns that repeat daily. For instance, although two sensors may both measure air pressure, the statistical characteristics of their signals will differ slightly because they are in different locations. Thus, we chose seven daily statistical features (i.e., median, mean, standard deviation, minimum, maximum, 25th percentile, and 75th percentile) computed from five-minute interval samples.
For model selection, the framework uses the Random Forest algorithm, which builds decision trees from random samples and evaluates the classification probability from multiple randomly generated decision trees. The Random Forest algorithm is effective when users have a few features and numerous data samples. Also, since the Random Forest algorithm bases its classifications on probabilities from multiple random decision trees, it reduces the risk of over-fitting. To evaluate the error rate of the classification algorithm, we employed out-of-bag error estimation, in which each sample is evaluated only by the trees that were not trained on it and the resulting errors are averaged.
To train the model, we randomly chose daily statistical features of filtered data points from 2015. For classifying SADP-AHU and DP-VAV, 700 and 1,747 samples with manually labeled data point types were chosen, respectively. Having drawn seven statistical features from each sample, we can build the feature tables for both (i.e., one table with seven statistical features by 700 samples, and one with seven statistical features by 1,747 samples). From the feature tables and ground truth labels, 100 decision trees were randomly generated. The classification results from the 100 decision trees enabled the calculation of the probability distribution for the desired data points. With this model trained on 2015 data, we can finally label the desired data point types (i.e., SADP-AHU and DP-VAV) in a new data set. Figures 2 and 3 represent the data point type identification step in detail. Weka, a Java-based machine learning tool developed at the University of Waikato, New Zealand, was used for the training and testing (Bouckaert 2015).
Figure 2 BACnet property filtering and feature extraction
Figure 3 Data point type classification by trained model
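The feature extraction step can be illustrated with the Python standard library. The sketch below is an assumption-laden example rather than the authors' code (the study used Weka): the function names are hypothetical, the sample standard deviation is assumed because the text does not say which estimator was used, and the "inclusive" quantile method is likewise an assumption.

```python
import statistics
from collections import defaultdict
from datetime import datetime, timedelta


def daily_features(samples):
    """Seven statistical features for one day of readings."""
    # Assumption: quartiles use the inclusive method.
    q25, _, q75 = statistics.quantiles(samples, n=4, method="inclusive")
    return {
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "std": statistics.stdev(samples),  # assumption: sample standard deviation
        "min": min(samples),
        "max": max(samples),
        "q25": q25,
        "q75": q75,
    }


def features_by_day(readings):
    """Group (timestamp, value) readings by calendar day, then summarize each day."""
    days = defaultdict(list)
    for timestamp, value in readings:
        days[timestamp.date()].append(value)
    return {day: daily_features(values) for day, values in days.items()}


# Hypothetical readings: one day of five-minute samples (288 per day).
start = datetime(2015, 3, 1)
readings = [(start + timedelta(minutes=5 * i), 1.0 + 0.001 * i) for i in range(288)]
table = features_by_day(readings)
print(len(table), sorted(table[start.date()]))
```

Each day of a data point then contributes one row of seven features to the feature table that a classifier such as Random Forest is trained on.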
Mechanical relationship inference
To quantify the correlation between SADP-AHU and DP-VAV, we implemented a cross-correlation equation (Equation 1). The two signals are represented by two vectors, and at each time lag the cross-correlation is the dot product of one vector with a shifted copy of the other. The variables x and y represent the SADP and DP signals as functions of time, m is the sample index, and n is the time lag in samples. The result indicates the correlation between the two signals at each time lag. The normalized result ranges from a minimum of -1.0 (negative correlation) to a maximum of 1.0 (positive correlation), and a result of 0 means that there is no correlation between the two signals.
\[
\varphi_{xy}[n] = \sum_{m=-\infty}^{\infty} x[m]\, y[m+n] \tag{1}
\]
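A minimal Python sketch of Equation 1 for finite records follows. The text does not state how the result was normalized, so the mean removal and the scaling by signal energy used here are assumptions, chosen so that the values lie in the range -1.0 to 1.0. The function names, the `max_lag` parameter, and the example readings are hypothetical.

```python
import math


def normalized_cross_correlation(x, y, max_lag):
    """Return {lag n: normalized phi_xy[n]} for n in [-max_lag, max_lag]."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    # Assumption: remove the mean so constant offsets do not dominate.
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    # Assumption: scale by the signal energies so results lie in [-1, 1].
    scale = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    if scale == 0.0:
        return {n: 0.0 for n in range(-max_lag, max_lag + 1)}
    result = {}
    for n in range(-max_lag, max_lag + 1):
        total = 0.0
        for m in range(len(xc)):
            k = m + n
            # Terms that fall outside the finite record are treated as zero.
            if 0 <= k < len(yc):
                total += xc[m] * yc[k]
        result[n] = total / scale
    return result


def max_abs_correlation(x, y, max_lag):
    """Largest magnitude over all lags, regardless of sign."""
    values = normalized_cross_correlation(x, y, max_lag).values()
    return max(abs(v) for v in values)


# Hypothetical readings: duct pressure drops as the damper opens.
pressure = [1.5, 1.4, 1.2, 1.3, 1.5]
damper = [40.0, 50.0, 70.0, 60.0, 40.0]
print(normalized_cross_correlation(pressure, damper, max_lag=1))
print(max_abs_correlation(pressure, damper, max_lag=1))
```

Taking the maximum of the absolute value means that a strongly negative correlation, which is the expected relationship between damper position and duct pressure, counts as strong evidence of association.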
By computing the cross-correlation between each DP-VAV and each of the six SADP-AHU for a one-month profile, the calculation generates six results. As shown in Figure 4, the method estimates the associated AHU on the basis of the maximum absolute value. Since the relationship between DP and SADP is a negative correlation, using the maximum absolute value allows the magnitudes of the correlations to be compared regardless of sign. However, variation in the profiles is important for investigating the cross-correlation of two signals, and changing weather conditions trigger such variation. Therefore, we designed the experiment to run for nine months. Figure 5 shows the extended method. First, nine monthly profiles were collected at five-minute intervals. By iterating the calculation over the nine months, we found that a single VAV terminal unit can be matched to a different AHU from month to month. Ultimately, the method chooses the most frequently selected AHU. In the figure, AHU8 is the most frequent AHU for VAV_X (selected in five of the nine months); thus, the framework associates VAV_X with AHU8. If multiple AHU are tied in the estimation result, the AHU with the higher score is used.
Figure 4 Single-month mechanical relationship inference method
Figure 5 Multi-month mechanical relationship inference method
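The single-month selection and the multi-month vote can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the monthly scores in the usage example are invented for demonstration, and the tie-break rule is interpreted here as "the tied AHU with the largest score in any month", which is one possible reading of "the AHU with the higher score".

```python
from collections import Counter


def select_ahu(scores):
    """Single month: pick the AHU with the largest max-absolute cross-correlation."""
    return max(scores, key=scores.get)


def associate(monthly_scores):
    """Multiple months: pick the AHU selected most often across the months."""
    votes = Counter(select_ahu(scores) for scores in monthly_scores)
    top = max(votes.values())
    tied = [ahu for ahu, count in votes.items() if count == top]
    if len(tied) == 1:
        return tied[0]
    # Assumed tie-break: the tied AHU with the largest score in any month.
    best = {ahu: max(scores[ahu] for scores in monthly_scores) for ahu in tied}
    return max(best, key=best.get)


# Hypothetical scores for one VAV terminal unit over nine months.
months = [{"AHU7": 0.2, "AHU8": 0.6}] * 5 + [{"AHU7": 0.7, "AHU8": 0.3}] * 4
print(associate(months))
```

Voting across months makes the result less sensitive to any single month in which an unrelated AHU happens to correlate strongly with the damper signal.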
RESULTS
Data point type identification for the supply air duct pressure
In the GHC building, a single AHU contains over 150 data points. To acquire the SADP data point, the data points were filtered by object type (analog input) and by unit (inch WC). After filtering, only four candidates remained for classifying the SADP (i.e., SADP, exhaust fan pressure, supply fan pressure, and enthalpy wheel pressure). To evaluate the model on a new dataset, 236 samples were collected from March 2016. The model classified 222 of the 236 samples correctly (94.06 percent), with a low out-of-bag error (0.0375). In Table 2, the rows give the actual data point type and the columns give the type predicted by the model, so the diagonal values are the numbers of samples that the model classified correctly. The model correctly classifies the SADP and enthalpy wheel samples, but it misclassifies some of the exhaust fan and supply fan samples. However, since the aim of this classification model is to identify the SADP, the model was adequate for the next step.
Data point type identification for the damper position
In the GHC building, a single VAV terminal unit contains around 20 data points. After filtering the data points by object type (analog value) and unit (percentage), only two candidates remained for classifying the DP (i.e., DP or reheat valve). To evaluate the trained model on a new dataset, we collected 594 samples from March 2016. The model classified 569 of the 594 samples correctly (95.79 percent), with a low out-of-bag error (0.0567). Table 3 shows that every actual DP sample was classified correctly. However, the evaluation data set was not balanced (502 DP samples and 92 reheat valve samples), because reheat valves are installed only in VAV zones that require heating. Also, the model misclassified 25 reheat valve samples as DP. Thus, the classification accuracy would be slightly lower for VAV terminal units with reheat.
Table 2. Confusion Matrix for the Supply Air Duct Pressure Classification
| Actual data point type (rows) / Classification by model (columns) | Exhaust fan | Supply fan | Supply air | Enthalpy wheel |
| Exhaust fan | 49 | 6 | 0 | 4 |
| Supply fan | 4 | 75 | 0 | 0 |
| Supply air | 0 | 0 | 70 | 0 |
| Enthalpy wheel | 0 | 0 | 0 | 28 |
Table 3. Confusion Matrix for the Damper Position Classification
| Actual data point type (rows) / Classification by model (columns) | Damper position | Reheat valve |
| Damper position | 502 | 0 |
| Reheat valve | 25 | 67 |
Single month mechanical relationship inference
The model infers the physical association between AHU and VAV terminal units by calculating the cross-correlation between the SADP-AHU and the DP-VAV. Table 4 shows the normalized cross-correlation results between the SADP in each AHU and the DP in VAV room6113 in January 2016. Figure 6 illustrates that the profiles of SADP-AHU7 and DP-VAV room6113 are transiently negatively correlated (which follows from the sequence of VAV control in this building). The maximum absolute cross-correlation for AHU7 is 0.8432, which is the largest value among the six AHU. Thus, we infer that VAV room6113 is connected to AHU7. After calculating all the combinations of AHU and VAV terminal units, the framework inferred the relationship between AHU and VAV terminal units with 40.5 percent accuracy. Since a single month cannot cover enough variation in seasonal patterns, the dataset had to be extended to multiple months.
Table 4. Normalized cross correlations between AHU and VAV room 6113
| | AHU3 | AHU7 | AHU8 | AHU9 | AHU10 | AHU11 |
| Maximum absolute values | 0.5396 | 0.8432 | 0.0802 | 0.5682 | 0.2029 | 0.1055 |
Multi-month mechanical relationship inference
Using a dataset of nine monthly profiles, the model’s cross-correlation function can take various seasonal factors into account. The seasonal characteristics of the profiles are selectively beneficial for estimating the AHU. For example, the peak profile in August 2015 is a good indicator for AHU9; on the other hand, the peak profile in January 2016 provides higher accuracy for AHU11. To selectively estimate the correct AHU from the cross-correlation results, the AHU selected most frequently over the nine-month period is taken as the final outcome. Table 5 shows the relationship inference result: while the overall accuracy increases to 79.9 percent, the relationship between AHU11 and its VAV terminal units is not as strong as the other associations. This is because the VAV terminal units served by AHU11 are located in zones with varied thermal characteristics, which have different space programs and orientations. Also, AHU11 serves the second largest number of VAV terminal units (73), and it distributes air across five different floors (from the fifth to the ninth floors). This means that the vertical and horizontal distance between the VAV terminal units and AHU11 is greater than for the other AHU.
Figure 6 Signal profiles of AHU7 and VAV_room6113 in January 2016
Table 5. AHU-VAV Association Inference Result
| Assigned AHU | Correctly matched VAV terminal units | Accuracy (%) |
| AHU3 | 3/3 | 100 |
| AHU7 | 4/5 | 80 |
| AHU8 | 45/53 | 84.9 |
| AHU9 | 56/65 | 86.1 |
| AHU10 | 72/84 | 85.7 |
| AHU11 | 46/73 | 63 |
| Total | 226/283 | 79.9 |
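As a worked check, the accuracy figures in the results can be recomputed from the counts reported in Tables 2, 3, and 5. The short Python sketch below uses only those reported counts; the variable and function names are hypothetical.

```python
def matrix_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix, in percent."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


# Rows are actual types, columns are the types assigned by the model.
sadp_matrix = [
    [49, 6, 0, 4],   # exhaust fan
    [4, 75, 0, 0],   # supply fan
    [0, 0, 70, 0],   # supply air
    [0, 0, 0, 28],   # enthalpy wheel
]
dp_matrix = [
    [502, 0],        # damper position
    [25, 67],        # reheat valve
]

# (correctly matched, total) VAV terminal units per AHU, from Table 5.
matched = {
    "AHU3": (3, 3), "AHU7": (4, 5), "AHU8": (45, 53),
    "AHU9": (56, 65), "AHU10": (72, 84), "AHU11": (46, 73),
}
total_correct = sum(c for c, _ in matched.values())
total_units = sum(n for _, n in matched.values())
overall = 100.0 * total_correct / total_units

print(matrix_accuracy(sadp_matrix), matrix_accuracy(dp_matrix), overall)
for ahu, (c, n) in matched.items():
    print(ahu, 100.0 * c / n)
```

The per-AHU loop makes the spread visible: the overall figure is dominated by the three AHU that serve the most VAV terminal units.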
DISCUSSION
The overall performance of the suggested framework indicates that it is acceptable for future applications. In particular, the BACnet property filtering reduces the number of data points, which enables deeper investigation of specific data points. Unlike other research approaches, the proposed classification models aim only to identify SADP-AHU and DP-VAV. The model achieved high accuracy in the GHC building, and any building can be tested as long as its data is stored and controlled under the BACnet protocol. For the mechanical relationship inference step, the framework achieves 79.9 percent accuracy in associating AHU and VAV terminal units. Moreover, a primary advantage of this framework over manual inspection is that the model’s running time is under five minutes. Similarly, Pritoni et al. (2015) obtained 80 percent accuracy for associating AHU and VAV terminal units, where the association was obtained by monitoring the VAV flow rate while turning off the AHU. The main difference is that the suggested framework is less intrusive, because it relies only on historical data and does not require intervening in AHU operation.
However, the framework has limitations. In the data point type identification step, the filtering relies on BACnet properties, so it cannot be applied directly to data points listed under other BAS communication protocols (e.g., LonWorks, Modbus). We trained the classification model on data from the GHC building in 2015 and then evaluated it only on the same building in 2016. This is one of the reasons that the classification model achieves 94.9 percent accuracy. To establish the generalizability of the framework, we have to evaluate the classification model using data from other buildings.
CONCLUSION
With improvements in technology, modern buildings are improving in their ability to acquire and store a large stream of data. Leveraging building data is a practical way to understand complex building systems. The proposed framework automatically infers the physical relationship between AHU and VAV terminal units, using BACnet information and a historical data set. This approach works significantly faster than the manual inspection process and serves as a backup solution when a user does not have updated mechanical drawings. Even though the suggested framework is limited to determining the relationship between AHU and VAV terminal units and has been deployed in only one building, this research stands as a stepping stone for understanding the heterogeneity of building systems and equipment by utilizing a large stream of data.
REFERENCES
ASHRAE. 2004. ASHRAE Standard 135, BACnet: a data communication protocol for building automation and control networks. Atlanta: ASHRAE.
ASHRAE. 2013. ASHRAE Handbook: Fundamentals. Atlanta: ASHRAE.
Balaji, Bharathan, Chetan Verma, Balakrishnan Narayanaswamy, and Yuvraj Agarwal. 2015. Zodiac: Organizing Large Deployment of Sensors to Create Reusable Applications for Buildings. In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments 13-22.
Bhattacharya, Arka, Joern Ploennigs, and David E. Culler. 2015. Short Paper: Analyzing Metadata Schemas for Buildings: The Good, the Bad, and the Ugly. In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments 33-34.
Bhattacharya, Arka, David E. Culler, Jorge Ortiz, Dezhi Hong, and Kamin Whitehouse. 2014. Enabling portable building applications through automated metadata transformation. EECS Department, University of California, Berkeley Technical Report UCB/EECS-2014-159.
Bouckaert, Remco R., Eibe Frank, Mark Hall, Richard Kirkby, Peter Reutemann, Alex Seewald, and David Scuse. 2015. WEKA manual for version 3-7-12. Hamilton, New Zealand: University of Waikato.
Gao, Jingkun, Joern Ploennigs, and Mario Berges. 2015. A Data-driven Meta-data Inference Framework for Building Automation Systems. In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments 23-32.
Hong, Dezhi, Hongning Wang, Jorge Ortiz, and Kamin Whitehouse. 2015. The Building Adapter: Towards Quickly Applying Building Analytics at Scale. In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments 123-32.
Korman, Thomas M., and C. B. Tatum. 2001. Development of a knowledge-based system to improve mechanical, electrical, and plumbing coordination. CIFE, Stanford University Technical Report Nr 129.
Newman, H. Michael, and Morton Dan Morris. 1994. Direct digital control of building systems: theory and practice. John Wiley & Sons.
NIST. 2004. Cost analysis of inadequate interoperability in the US capital facilities industry. Gaithersburg: NIST.
Piper, J. 2000. Finding a path to interoperability. Building Operating Management 47(8).
Pritoni, Marco, Arka A. Bhattacharya, David Culler, and Mark Modera. 2015. Short Paper: A Method for Discovering Functional Relationships Between Air Handling Units and Variable-Air-Volume Boxes From Sensor Data. In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments 133-36.
Project Haystack. 2014. Project Haystack. http://project-haystack.org/.
Salsbury, Timothy I. 2005. A survey of control technologies in the building automation industry. IFAC Proceedings 38(1):90-100.