Energy performance optimization in buildings: A review on semantic interoperability, fault detection, and predictive control

The building sector accounts for about 30% of the global final energy consumption. Most of the consumed energy originates from fossil fuels. The operation of buildings is known to suffer from various deficiencies, degrading their energy performance. An untapped potential lies, therefore, in the optimization of building operation to significantly reduce CO2 emissions and to increase the cost effectiveness and user comfort. Over the past 40 years, extensive research has been carried out to investigate and develop methods for building performance optimization based on measured data from building services, such as heating, ventilation, air conditioning, and lighting systems. The ongoing digitalization trend in the building sector offers the opportunity to easily access large amounts of high-quality measurement data and semantic building information as digital descriptions. This facilitates the development and implementation of automated routines for the continuous supervision and optimization of building operation, including reliable fault detection and diagnosis and model-predictive control. This review article is focused on three major research topics in the field of energy-efficient buildings, namely, semantic interoperability between heterogeneous and complex systems, methods for fault detection and diagnosis, and model-predictive control.

Today, buildings are responsible for about 30% of the global final energy consumption. 1 In the future, the energy consumption of the building sector is expected to increase further due to population growth and higher levels of mechanization in buildings. 2][4] A high level of mechanization in buildings and an increasing amount of sensor data from building services also offer the potential to operate buildings highly efficiently and thus limit energy wastage while maintaining indoor comfort. This is particularly relevant for large commercial buildings due to their high energy demand. The exploitation of this potential has been the subject of intensive research during the past decades.
Related research activities aim at optimizing the building energy performance by supervising operation, detecting faults, and optimizing controls of building services. The approaches investigated range from mathematical modeling of the physical relationships through statistical descriptions to methods from data mining and machine learning. Despite significant advances in these domains, partly due to major international initiatives (e.g., Annexes 25, 34, 40, and 47 within the Energy in Buildings and Communities Programme of the International Energy Agency), the adoption of advanced methods for building energy performance optimization in the architecture, engineering, and construction (AEC) industry is lagging behind the demonstrated potential. This is partly due to the heterogeneity of systems and data along with information deficits, which hinder cost-efficient implementation of these methods.
The standardization of information exchange formats and communication protocols as well as the ongoing digitization of the building sector make it feasible to automate time-consuming and fault-prone manual processes. 6][7][8] These advances can facilitate the replicability and scalability of fault detection and diagnosis (FDD) and model predictive control (MPC), enabling widespread deployment.
In this article, we review literature related to energy performance optimization in buildings. Figure 1 displays a schematic overview of the optimization process as described here. The process starts with the retrieval of data from a building automation system (BAS), providing semantic information and time series data from sensors and actuators in the field. Additionally, methods for building performance optimization have access to data from heterogeneous sources like building information modeling (BIM) or the world-wide web. Information modeling and interoperability between heterogeneous systems present in buildings are reviewed in Sec. II.
Depending on the specific application and the source of the data, different preprocessing steps are needed. Examples of simple preprocessing of sensor data are the treatment of missing data, data aggregation over certain time intervals, limit checking, or unit conversions. Semantic data might have to be checked for consistency or transformed into a given data format in a preprocessing step. Although data preprocessing represents a crucial part of the whole process, it has minor scientific content and is therefore not reviewed here in detail.
Within the methods for building performance optimization, one can distinguish between FDD and MPC. FDD implies, as a first step, a continuous supervision of the system behavior (monitoring). In the simplest case, special plots are automatically generated, which allow quick (manual) assessment of building services operation. Advanced methods for automated FDD, which provide a deeper analysis of the building data and reveal faults and suboptimal operation, are treated in Sec. III. Additional information about the cause of a fault, its impact, and recommendations for its elimination is given to the facility manager (FM), or corrective actions can be directly fed back to the system via communication protocols. Predictive maintenance is one application for FDD methods, which aims at detecting a deterioration of the system prior to the manifestation of a fault.
Given fault-free system operation, the performance can be further increased by optimizing the controls. To this end, a model of the system, measurement data, and forecasts are used in an MPC routine (see Sec. IV) to predict optimal control sequences and feed them back to the system. Uncertainties are present throughout the whole process due to missing or incorrect data as well as unknown weather conditions and user behavior; here, however, the effect and treatment of uncertainties are addressed only in the context of MPC.
The parts of the whole process which are investigated in detail in Secs. II-IV are highlighted in Fig. 1.
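The simple preprocessing steps named above (unit conversion, limit checking, treatment of missing data, and aggregation over time intervals) can be sketched as follows. This is a minimal illustration only: the function names, the plausibility limits of -10 to 50 degrees Celsius, the forward-fill strategy, and the sample values are assumptions chosen for the example and are not taken from the reviewed literature.

```python
from statistics import mean

def fahrenheit_to_celsius(values):
    """Unit conversion; missing samples (None) are passed through."""
    return [None if v is None else (v - 32.0) * 5.0 / 9.0 for v in values]

def mask_out_of_limits(values, lower, upper):
    """Limit checking: replace implausible samples with None."""
    return [v if v is not None and lower <= v <= upper else None for v in values]

def forward_fill(values):
    """Missing-data treatment: repeat the last valid sample (leading gaps stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def aggregate_mean(values, window):
    """Aggregate consecutive blocks of `window` samples by their mean."""
    out = []
    for start in range(0, len(values), window):
        block = [v for v in values[start:start + window] if v is not None]
        out.append(mean(block) if block else None)
    return out

# Hypothetical raw temperature samples in degrees Fahrenheit,
# with one gap (None) and one implausible spike (500.0).
raw_fahrenheit = [68.0, None, 69.8, 500.0, 71.6, 71.6]

celsius = fahrenheit_to_celsius(raw_fahrenheit)
plausible = mask_out_of_limits(celsius, lower=-10.0, upper=50.0)
filled = forward_fill(plausible)
aggregated = aggregate_mean(filled, window=2)
print(aggregated)  # roughly 20, 21, and 22 degrees Celsius
```

The order of the steps matters in this sketch: limit checking runs before gap filling so that the implausible spike is treated like a missing sample instead of being propagated into the aggregated values.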

II. SEMANTIC INTEROPERABILITY IN BUILDING OPERATION
There is a clear perception that the construction and FM sector can benefit from the availability of large amounts of data and of advanced analytical and optimization tools, especially if different knowledge domains like BIM and BAS are advantageously integrated. 9,10 For applications like FDD and MPC presented in Secs. III and IV, BAS are a crucial source of information, not only because they provide measurements in the form of time series data but also because they contain semantic information on the sensors and devices, allowing them to be automatically identified and their data to be processed. The efficient management of these information sources is a key enabler for the application of advanced methods. Currently, the information in BAS is structured and exchanged in a large variety of ways, for example, via the canonical protocols BACnet or KNX. Each protocol defines a data model, mainly structured according to the object-oriented paradigm, which organizes the information of building services, devices, and methods to represent and exchange it in a network-visible way.
Ideally, standardized information models, in which the data have a unified structure and meaning, should represent the basis to interoperate heterogeneous applications and to manage information over the entire life cycle of buildings. In practice, however, the structure and the semantic representation of information in buildings are very heterogeneous and suffer from several deficits: the information is patchy, is scattered over several applications and media, and is partly duplicated and inconsistently managed. Furthermore, in many cases, the information is difficult to access and can be retrieved only through specific interfaces, requiring specialized know-how and the translation of information from one model structure to another one. This situation will probably continue for many years due to the legacy of systems in place and the long lifetime of buildings. Balaji et al. described the current situation as being dominated by the reign of "cacophony of data and information." 11 In most cases, the capture of semantic information as well as the mapping process between information from legacy BAS and standardized information models is realized manually and requires extensive expertise on the analyzed building structure and services.

FIG. 1. Schematic overview of the described process of building performance optimization. The parts of the whole process which are addressed in detail in this article are highlighted.

Benndorf, Wystrcil, and Réhault, Appl. Phys. Rev. 5, 041501 (2018)
The need for semantic interoperability in buildings has risen to a widely accepted priority in recent years, also due to the emergence of connected devices like sensors, power meters, and lights supporting the Internet of Things (IoT) in modern buildings. Several initiatives and studies have been undertaken to develop solutions that aim at overcoming these issues. They range from the development of standardized and open communication protocols to the integration of different information models by more advanced approaches using the potentials of semantic web technology. In this section, we review the current state of the art in scientific literature on the topic of semantic interoperability in buildings. We propose the classification of research activities into three categories, as shown in Fig. 2: semantic interoperability within BAS (Sec. II A), between BAS protocols and BIM (Sec. II B), and approaches using semantic web technology (Sec. II C).
A. Semantic interoperability in building automation systems
BAS 12 are playing a growing role in commercial buildings, where they aim to ensure good indoor environmental conditions for the building users while maintaining high energy efficiency and low operating and maintenance costs. BAS are distributed systems designed for the computerized control and management of building services like heating, ventilation, and air conditioning (HVAC) systems and interconnect devices like sensors, actuators, programmable logic controllers (PLC), personal computers, etc., through wired and wireless field buses and networks. 13,14 This infrastructure enables the exchange of information between devices and the execution of complex control and supervision tasks. The traditional architecture of BAS is often represented as a three-layer architecture (see Fig. 2): the field layer includes sensors, actuators, and controllers interconnected via field buses like KNX, LON, or wireless networks like ZigBee or Z-Wave. The automation layer consists of PLCs covering measurement processing, control, and alarm tasks for the devices of the field layer and uses protocols of both the field and the management layer. 15,16 The management layer forms the upper tier of the architecture and is constituted of supervisory control systems (SCS), human-machine interfaces (HMI) with configuration and monitoring features, as well as databases for time series data archival (DBs). Typical protocols of the management layer are BACnet or OPC.
Information modeling in BAS mainly addresses the representation of device properties containing the meta data associated with specific information (e.g., device identification, encoding, control, signal processing, or alarm requirements). For example, BACnet models a modulating two-way valve as an analog output object with values ranging from 0% (closed) to 100% (fully open). This information is useful for third-party applications like FDD or MPC using the valve signal to detect faults or to optimize its control. However, Akin points out that BAS provide large amounts of information that are not useful for applications and that appropriate culling procedures are required. 17
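The valve example can be made concrete with a small data structure. The following is a hedged sketch, not an implementation of the BACnet standard or of any BACnet library: the class `AnalogOutput`, its field names, and the point name `AHU1_HeatingValve` are hypothetical and only loosely mirror the idea of an analog output object with a present value, a unit, and a value range.

```python
from dataclasses import dataclass

@dataclass
class AnalogOutput:
    """Simplified stand-in for an analog output object (hypothetical field names)."""
    object_name: str
    present_value: float       # commanded valve position
    units: str = "percent"
    min_value: float = 0.0     # valve closed
    max_value: float = 100.0   # valve fully open

    def in_range(self) -> bool:
        """Plausibility check a third-party application could run on the signal."""
        return self.min_value <= self.present_value <= self.max_value

    def fraction_open(self) -> float:
        """Normalize the signal to the interval [0, 1] for further processing."""
        span = self.max_value - self.min_value
        return (self.present_value - self.min_value) / span

valve = AnalogOutput(object_name="AHU1_HeatingValve", present_value=35.0)
print(valve.in_range(), valve.fraction_open())
```

Because the value range and the unit travel with the signal, an FDD or MPC application can interpret the valve command without building-specific configuration, which is the benefit the example in the text points to.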

[...] provide an explicit semantic description of the system structure (i.e., dependence between systems, relative topological position of devices, etc.), detailed technical device information or control strategies. 18 To support the parametrization of some modeling approaches or of some FDD methods as well as to provide context information to the end user in the case a fault is detected, additional information, such as location, type, manufacturer, and technical characteristics, is necessary.
For the designation of data points and devices in BAS, guidelines like ISO 16484 recommend the use of hierarchically structured naming systems that specify the building name, the floor, or the provided service as well as meta data like units and preprocessing principles. 16 Applied to BAS of complex facilities, such naming systems enable the unambiguous identification of information by humans and machines and facilitate the implementation of third-party applications like FDD. 18 BIM can be an important source for this kind of information. Nevertheless, due to the lack of appropriate standards in this domain and the fact that, today, only very few buildings hold a BIM, when analyzing BAS data of legacy systems, the first task is to identify the meaning of data point names and to map them to a unified scheme. A way to support this process is to identify and to classify BAS data automatically. 3][24][25]
Service-oriented architectures (SOA) add an additional layer to the traditional architecture of BAS that allows web-based management and integration of modular applications like weather services, analytics, or demand-side management. Available standards using web services are, e.g., BACnet/WS or the platform-independent open standard open building information exchange (oBIX). 16,26,27 These standards enable the information exchange from machine to machine by using, e.g., simple object access protocol (SOAP) or representational state transfer (REST) interfaces via hypertext transfer protocol (HTTP).
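Hierarchically structured data point names lend themselves to automated parsing. The sketch below assumes a purely hypothetical naming scheme of the form building.floor.system.point.unit; the field order, the separator, and the example names are illustrative assumptions and are not prescribed by ISO 16484 or by any of the cited works.

```python
# Hypothetical field order of a hierarchical data point name.
POINT_FIELDS = ("building", "floor", "system", "point", "unit")

def parse_point_name(name, sep="."):
    """Split a data point name into its named fields."""
    parts = name.split(sep)
    if len(parts) != len(POINT_FIELDS):
        raise ValueError(f"expected {len(POINT_FIELDS)} fields, got {len(parts)}: {name!r}")
    return dict(zip(POINT_FIELDS, parts))

def points_by_system(names):
    """Group the point labels of all parsable names by the system they belong to."""
    grouped = {}
    for name in names:
        fields = parse_point_name(name)
        grouped.setdefault(fields["system"], []).append(fields["point"])
    return grouped

names = [
    "HQ.F02.AHU01.SupplyAirTemp.degC",
    "HQ.F02.AHU01.HeatingValve.percent",
    "HQ.F03.VAV07.ZoneTemp.degC",
]
print(parse_point_name(names[0]))
print(points_by_system(names))
```

Legacy systems rarely follow a single scheme this cleanly, which is why the text describes identifying the meaning of data point names as the first task when analyzing their data.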
Semantic interoperability in BAS addresses two main challenges: to facilitate communication between different protocols used at the different layers inside BAS and to provide information on BAS devices for third-party applications.
Within BAS, dedicated routers provide gateway functions to interconnect between layers or networks using different protocols. 13 The main concern is to make the devices addressable between the different networks via specific procedures that translate the information from one protocol to another. For example, the ISO 16484 standard contains a clause describing the mapping modalities from BACnet objects and properties to corresponding KNX datapoints and functional blocks. 16 Nevertheless, although gateway solutions are widely available on the market, the translation of information between protocols is not fully standardized and scientific literature or guidelines on this topic are rare. This leads in some cases to additional programming efforts that generate additional costs and to interoperability issues that degrade the correct functioning of building services.
The evolution of BAS to a service-oriented architecture has raised the interest of the building automation community in solutions like oBIX and the OPC unified architecture (OPC UA). Neugschwandtner et al. investigated the translation of a KNX network into oBIX to enable an oBIX client to pull data from a KNX installation and to control KNX devices. 28 While they advocated its use in the gateway design, they emphasized the high implementation costs due to the complexity of the oBIX standard and recommended a benchmark with approaches using BACnet/WS or OPC UA. Later, Kastner et al. used the BACnet/WS data model to map information from a KNX network and expose it via web services. 26 Although they demonstrated the capacity of BACnet/WS to cope with non-BACnet protocols, their approach is hampered by high mapping efforts. Fernbach et al. proposed an approach to model BACnet information in OPC UA by transforming BACnet objects and their respective properties into OPC UA complex objects. 29 In a similar way, Granzer et al. presented an information modeling method that integrates a domain-specific information model implemented in communication protocols like BACnet, KNX, LonWorks, and ZigBee into OPC UA. Furthermore, they leveraged the rich set of services and the generic, comprehensive, and extensible information model of OPC UA for the use of web services. 30 Cavalieri et al. stated that the main limitations to the interoperability in buildings are present at the information level. 31 To overcome this, based on initial efforts to integrate KNX with the OPC UA information model, they proposed an approach to represent KNX functions and data structures in an OPC UA information model. In 2015, Schachinger et al. established a concept to interoperate BAS on the basis of RESTful BACnet/WS. 32 The following year, the BIG-EU and the OPC Foundation released the report "OPC UA Information Model for BACnet" that defined an OPC UA Information Model to represent the BACnet architectural models with the objective of enhancing the integration of BAS networks at the enterprise level. 33

B. Building information modeling
In the construction sector, the BIM method is being widely discussed as the solution to the problem of information management during a building's entire life-cycle. BIM enables consistent data storage and management of information in a unique model as well as seamless data exchange between different actors and software tools on the basis of standardized schemas like the industry foundation classes (IFC), green building XML (gbXML), or construction operations building information exchange (COBie). 34 Furthermore, BIM has the potential to model BAS devices and functions in a standardized way and to improve the semantic interoperability between systems by providing a common information model. Whereas the construction industry has developed manifold BIM approaches and tools since the first conceptual idea in the early 1970s, the concrete implementation of BIM in the building sector is a recent trend with different maturity levels across industrial countries. 35 The benefits of BIM in terms of improving productivity and quality in the design and construction phases of buildings have been demonstrated in several pilot projects and are now undeniable. The use of BIM as a centralized information source for numerous FM tasks offers clear advantages for commissioning and in the operation phase of buildings. 36,37 After the handover phase, facility managers can refer to the BIM, which contains all information on construction, to support operation and maintenance of the facilities including the BAS. This can improve the current situation, which is characterized by a critical loss of information between the design and commissioning phases due to non-standardized methods and processes for the exchange of information. 38 The consequences are flaws and faults that appear after BAS have been commissioned or during their service operation and that degrade indoor comfort and energy performance of buildings. Becerik-Gerber et al. identified potential application areas and data requirements for the use of BIM in FM and emphasized that resolving interoperability issues between BIM and applications like computer-aided facility management (CAFM), energy management systems (EMS), and BAS is a prerequisite to leverage synergy between these applications. 39 Furthermore, they proposed a structure for non-geometric data requirements in FM that a BIM could provide and described different use cases where BAS could benefit from the information of a BIM, like localization of equipment and of served areas, sensor information for set-point verification, and control strategies. However, less research has been dedicated to this topic than to the use of BIM in the design and construction phase.
After this brief introduction to BIM, we review different articles that aim at enabling the description of BAS network structures, building services, and control strategies in BIM with the objective of streamlining the information exchange between design, commissioning, and processes of the operation phase. Karavan et al. and Malinowsky et al. investigated the integration of BAS network structures into construction projects using BIM. The first group of researchers proposed a method that links a LonWorks Network Services (LNS) model representing the structure of a LON automation network with a BIM based on IFC. 40 The objective is to interoperate between LON and IFC and to gain added value from data of the LON protocol in building management applications. Using the same technologies, LON and IFC, Malinowsky et al. demonstrated three possibilities to map process data of LonMark standard functional profiles and standard network variable types onto an IFC model. 41,42 The next field of interest for the integration of BAS and BIM is the use of semantic information to enhance the scalability and the adaptability of third-party applications. In this matter, Provan et al. developed a generic meta model using BIM information to automatically generate FDD rules for specific building services and to define their respective thresholds. 43 In a similar way, Dong et al. tested a hybrid FDD method using information from a BIM-based infrastructure integrating an IFC model with a data acquisition system based on BACnet. 18 In an attempt to close the feedback loop between operation and design phases and to provide an assessment of the energy performance of buildings, Oti et al. developed a similar approach using a .NET framework that incorporates BAS data into a BIM environment. 10 Besides the digital modeling of BAS networks and building services in BIM, the description of control strategies for building services in BIM enables long-term information storage, management, and access in the operation phase. The current IFC version provides modeling capabilities for sensors, actuators, and controllers as well as the possibility to describe control mechanisms in IFC. 44 The latter was investigated in Benndorf et al. 45 for three simple control strategies of HVAC systems. Furthermore, to enable the information exchange with BAS over semantic web technologies, the authors converted the extended IFC file by use of the emerging ifcOWL ontology. 46,47

C. Semantic web technologies
The transformation of BAS over the last 20 years into web-based solutions sharing information with heterogeneous applications has created the urgent need for the elaboration of common vocabularies and taxonomies that provide semantic interoperability. In this context, semantic web technologies can significantly facilitate information exchange, as they can cope with heterogeneous data, support interoperability across diverse knowledge domains, integrate distributed data, and apply inference to extract new knowledge from this data. 48 To overcome the fragmentation of information in BAS, several research teams have investigated the use of ontological models that allow expressing the syntax and the semantics of objects as well as their relationships in a formal and declarative way. Ontologies conforming to the rules of the World Wide Web Consortium are described through the resource description framework schema (RDFS) that provides a vocabulary using the RDF data model. Knowledge on objects and relationships is expressed as semantic triples of the form subject, predicate, object. Based on RDFS, the web ontology language (OWL) allows the formulation of complex ontologies. In an initiative to create a recommendable OWL representation from the EXPRESS schema of IFC and thus to simplify the access to IFC information via semantic web technologies, Terkaj et al. and Pauwels et al. developed the ifcOWL ontology as well as an EXPRESS-to-OWL converter. 47,49
The Project Haystack developed a relevant meta data schema for BAS with the objective of facilitating the deployment of IoT technologies in buildings by standardizing semantic data models through the definition of tagging models, data formats, and data structures for building services like HVAC systems on the one hand and data exchange mechanisms through web services using REST APIs on the other. 50 A further modeling approach is the Smart Appliances REFerence (SAREF) ontology, which provides a common architecture and semantic interoperability to sensors and devices using different assets (e.g., EnOcean, KNX, or Z-Wave). 51 Notable approaches using ontologies to capture, structure, and provide knowledge to applications linked with BAS have been realized by Dibowski et al. and Ploennigs et al. with the development of appropriate ontology models for the design of devices in BAS. 52,53 Lee et al., 54 Tomasevic et al., 55 Ploennigs et al., 56 and Delgoshaei et al. 5 developed further approaches based on ontologies linked with BAS, EMS, and FDD applications.
Bhattacharya et al. compared the ability of the Haystack schema with the IFC and the SAREF ontology to describe data points exhaustively in BAS, to capture relationships between data points, to express meta data uncertainty, and to include emerging concepts like new sensor technologies. 51,57 They demonstrated that none of the investigated schemas is appropriate to fulfill these requirements. Based on these conclusions, they developed a concept combining expert knowledge and machine learning to identify non-standardized sensor names and to translate them into a common namespace based on the taxonomy of the Project Haystack. 50,58 They showed that their synthesis technique needs only a few examples to identify the most commonly occurring sensors required for a fault detection application and that it is robust against ambiguous and noisy tags.
Balaji et al. emphasized the fact that the scalability of third-party applications is hindered by a lack of a common data representation, as a mapping of the heterogeneous data to a common format is required for each building. 11 To overcome this, they designed a schema called Brick describing the data points contained in BAS together with their meta data and relationships. The data format and query language of Brick adhere to the RDF data model, and the authors applied the concept of tags from the Project Haystack 59 in the Brick ontology to describe building meta data by means of hierarchies, relationships, and properties. Additionally, they showed how Brick can describe a set of entities and relationships in buildings that are useful to a range of eight third-party applications including FDD and MPC. Nevertheless, although they identified the conversion of legacy meta data to Brick as a future challenge requiring automated mapping techniques, they give no indication of the communication protocols encountered in buildings or of how they realized the mapping from the existing different meta data sets to Brick. Furthermore, they acknowledge that the manual and cost-intensive capture of scattered and nondigitized information on the equipment types and relationships remains a technical and organizational barrier for the application of the schema.
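The triple form used by RDF-based schemas, and the way a building-independent query can run over it, can be sketched with plain Python tuples. This is an illustrative stand-in and not an RDF library or the Brick query language: the relationship names `feeds` and `hasPoint` and the class labels are modeled loosely on Brick, while the entity identifiers (`ahu1`, `vav1`, `zone1`, and so on) and the helper functions are hypothetical.

```python
# Each fact is a (subject, predicate, object) triple.
TRIPLES = {
    ("ahu1", "type", "AHU"),
    ("vav1", "type", "VAV"),
    ("zone1", "type", "HVAC_Zone"),
    ("ahu1", "feeds", "vav1"),
    ("vav1", "feeds", "zone1"),
    ("ahu1", "hasPoint", "ahu1_sat"),
    ("vav1", "hasPoint", "vav1_damper"),
    ("ahu1_sat", "type", "Supply_Air_Temperature_Sensor"),
    ("vav1_damper", "type", "Damper_Position_Command"),
}

def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

def upstream(triples, node):
    """All entities that feed `node`, directly or through other equipment."""
    found, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for subject, _, _ in match(triples, p="feeds", o=current):
            if subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found

def points_serving(triples, zone):
    """Data points of every piece of equipment upstream of a zone."""
    return sorted(
        obj
        for equipment in upstream(triples, zone)
        for _, _, obj in match(triples, s=equipment, p="hasPoint")
    )

print(points_serving(TRIPLES, "zone1"))  # ['ahu1_sat', 'vav1_damper']
```

The query refers only to relationship names, not to building-specific point names, so the same function would work on any building described with the same vocabulary. The costly part, as noted in the text, is producing such triples from legacy meta data in the first place.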

D. Conclusion
The presented review gives an overview of recent developments to overcome the current interoperability issues in BAS by creating links between the various knowledge domains. Despite several standardization efforts for the use of BIM in FM, the adoption of this method in the industry is still in its infancy. 10,60 Data models are missing or are insufficient and there is a lack of interoperability between software tools like BIM, BAS, and CAFM software. There is a need for additional research, development, and standardization in this domain to facilitate the adoption of solutions based on digital models for the design, the commissioning, and the operation of BAS. Furthermore, since the majority of the legacy BAS in place do not provide all the emerging and useful features like open protocols, web services capabilities, and BIM, the AEC and FM sectors have to cope with this heterogeneity if they want to harness the efficiency and carbon dioxide emission reduction potentials in existing buildings by using energy efficiency applications like FDD and MPC. To this end, they have to promote new standards and approaches that take this situation into account and enable practicable and efficient interoperability between legacy systems and innovative solutions. In this context, the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) committee in charge of the BACnet standard recently announced its intention to investigate the integration of Haystack tagging and Brick data modeling concepts into a new semantic standard for building data. 61 The results of these standardization efforts might help to tackle the interoperability issues in the AEC and FM sectors.

III. FAULT DETECTION AND DIAGNOSIS
This section provides an overview of current research concerning FDD in building operation and indicates future directions in the field. It has been widely recognized that the building sector lags behind other industries where automated FDD is critical, such as aerospace, nuclear power plants, and chemical and manufacturing processes. 62 Since early works in the 1990s, [63][64][65][66][67] there has been increasing interest in automating FDD in building applications. 68,69 Although a considerable amount of research has been conducted by now, the implementation of automated FDD in practice is still relatively sparse. 70 Comprehensive reviews cover recent advances of FDD in industrial processes [71][72][73] and in building systems. 4,70,74,75 In the following, these articles are carefully condensed and complemented by a stronger focus on machine learning methods, the quantitative performance evaluation of these methods, and their applicability to fault diagnosis.

A. Methods
The whole task of FDD can be separated into two steps, a detection step and a diagnosis step: In the detection step, a fault is identified (for a certain time interval). In the diagnosis step, the detected fault is characterized in more detail by specifying the involved sensors, components, or subsystems and possibly naming a specific reason for the fault. In the following, the two parts of FDD are described separately, as they can be treated as independent problems.

Fault detection
The methods which can be applied to detect faults in building systems are diverse.Accordingly, various classification schemes for these methods have been suggested. 4,71,76hile Isermann 76 adopts a perspective from signal processing and provides quite detailed classification, Venkatasubramanian et al. 71 and Katipamula and Brambley 4 take a more general view and subsume methods for fault detection and for fault diagnosis.
Here, a simple classification scheme is introduced, where each method is characterized by a group (A or B) and a type (i, ii, or iii).On the one hand, methods are characterized by the nature of the model on which they are based, i.e., methods which are based on prediction models (type i) are distinguished from those which are based on classification models (type ii) and those which are based on outlier detection methods (type iii). 77On the other hand, methods are characterized by the nature of information they require for setup, i.e., methods which are based on expert knowledge (group A) are distinguished from those which are based on measurement data (group B).The resulting categories are represented in Table I.
Methods of type i are based on a prediction model, which is designed to represent the nominal (fault-free) operation and typically provides a continuous value as an output.The difference between the predicted output value and the actual measurement gives a residual which is used to determine a fault condition.9][80][81][82][83] There is no clearly defined method to determine the threshold condition, but a viable solution has to be found for every specific case.As an asset, the size of the residual provides a probability estimate for a fault.
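A residual-based detector of type i can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the linear heating model with a base temperature of 18 degrees Celsius and a gain of 2 kW/K is hypothetical, and setting the threshold to three standard deviations of the fault-free residuals is only one possible choice, since, as noted above, no generally accepted rule exists.

```python
from statistics import pstdev

def predict_heating_power(outdoor_temp, base_temp=18.0, gain_kw_per_k=2.0):
    """Hypothetical nominal model: heating power rises linearly below base_temp."""
    return max(0.0, gain_kw_per_k * (base_temp - outdoor_temp))

def residuals(measured_power, outdoor_temps):
    """Residual = measurement minus prediction of the nominal model."""
    return [m - predict_heating_power(t) for m, t in zip(measured_power, outdoor_temps)]

def fit_threshold(fault_free_residuals, k=3.0):
    """Assumed rule: k standard deviations of the residuals in fault-free operation."""
    return k * pstdev(fault_free_residuals)

def detect(measured_power, outdoor_temps, threshold):
    """Flag samples whose absolute residual exceeds the threshold."""
    return [abs(r) > threshold for r in residuals(measured_power, outdoor_temps)]

# Made-up fault-free reference data (outdoor temperature in degC, power in kW).
train_outdoor = [0.0, 5.0, 10.0, 15.0]
train_power = [36.5, 25.5, 16.5, 5.5]
threshold = fit_threshold(residuals(train_power, train_outdoor))

# New observations: the second one consumes far more than the model predicts.
flags = detect([32.4, 29.0], [2.0, 8.0], threshold)
print(threshold, flags)
```

In this sketch the magnitude of the residual stays available next to the binary flag, which reflects the advantage of prediction-based methods mentioned in the text.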
Methods of type ii are based on a classification model which directly provides a binary result, corresponding to a fault-free or faulty condition.With these methods, there is no need to define threshold conditions as for methods of type i.However, of course, there is also no direct information about the probability of a fault.A method which is based on a classification model can often be extended to accommodate several fault types and thus provide diagnoses readily. 84,85A drawback is that prior knowledge about faults has to be provided.
Methods of type iii are based on outlier detection models. These models generate binary results corresponding to fault-free or faulty. In contrast to classification models, they can be set up without prior knowledge about faults, just taking fault-free operation into account. In this respect, outlier detection models fall between prediction models (type i) and classification models (type ii).
Methods from group A are solely based on expert knowledge. This means that detailed information about the considered system has to be available and processed in terms of a model which is then used for fault detection. It is clear from the nature of these methods that they are difficult and time-consuming to set up, very specific to a certain application, and hardly adaptable to changing boundary conditions. However, for well-defined problems, such as applications in air-handling units 84 or compression systems, 85 methods based on expert knowledge have been successfully tested and implemented. A major benefit of this type of method is that it can be well understood, as it is based on physical relationships and logical control mechanisms.
Methods from group B are based on measurement data from the considered system. This means that the respective model is trained on historical process data from the system and thus predicts the occurrence of a fault according to the patterns learned from the training data. These methods can be set up with little knowledge about the system under consideration, provided that large amounts of measurement data are available. For this reason, measurement-based methods are highly attractive and frequently employed, not only in the field of building performance optimization. 77 Nevertheless, there are well-known drawbacks for this type of method, namely, the need for labeled training data (i.e., data which is known to correspond either to correct operation or to faulty operation), which in turn requires expert knowledge beforehand. Furthermore, the resulting models are typically so-called black-box models, whose internal structure does not necessarily correspond to the behavior of the real system and can therefore hardly be understood. An exception in this respect is decision tree models, as they provide a set of rules which can be checked for plausibility. Finally, another well-known issue of methods from group B is the limited applicability of these models beyond the conditions for which they were set up (i.e., trained).
Naturally, there are methods which cannot be classified unambiguously into one of the six described categories. For example, so-called gray-box models 86,87 are based on information both from expert knowledge (group A) and from measurement data (group B). Furthermore, some methods from machine learning, like decision trees or neural networks, can be used either for prediction (type i) or for classification (type ii). Nevertheless, the classification scheme introduced here covers most of the cases found in the literature and highlights the most distinctive characteristics of the different methods.
In the following, examples of applications, advantages, and disadvantages are described for each of the mentioned categories.
Expert knowledge can be used to set up a mathematical model representing the real physical relationships of the considered system. Such a method falls into category A.i. 9][90][91] Before being used for FDD, these models require initial calibration, which is often hindered in current applications by a lack of high-quality measurement data. A great advantage of simulation models is, however, that faults can be artificially implemented. 5][96] Additionally, the models can easily be reused for other applications like MPC, as the model structure corresponds to the physical system.
Expert knowledge can also be used to formulate rules. Such IF-THEN rules typically define physical limits for single signals (limit checking) or correspond to known faults of a given system. A comparison of the predefined set of rules with measurement data allows faults to be detected. Depending on the formulation of the rule set, this method belongs to category A.ii or A.iii. A set of rules can be intuitively applied, which makes this method very attractive for many industrial applications. In building performance optimization, rule sets have been defined and employed for air-handling units (AHU) 84 and vapor compression systems. 64,85,97 In a broader sense, the preprocessing of measurement data for use with any model or application is often done via simple rules. Moreover, a properly implemented rule-based system can directly provide a diagnosis. Despite the deficiencies inherent to all expert systems, rule-based systems are frequently used to complement other methods in FDD routines. 63,98,99
Common methods which provide predictions based on measurement data are, for example, regression models and neural networks. These methods belong to category B.i. Applications for fault detection in buildings have employed regression models, 100 neural networks, 82,99,101,102 autoregressive models, 78,[102][103][104] or qualitative models. 92,105 The required training data must contain only data of one class, corresponding to fault-free operation. Although this makes it relatively easy to provide training data for these methods, it means that methods of category B.i are prone to producing many false positives. Because the training data set contains only fault-free data, all data differing from the previously seen type is interpreted as faulty. This makes appropriate training of these methods very difficult and ideally demands training data from all fault-free conditions. Moreover, methods based on prediction models need additional information about the measurement data, namely, which signals should be treated as inputs and which as outputs of the model. Strictly speaking, this is also expert knowledge.
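Returning to the rule-based methods of categories A.ii and A.iii, limit checks and fault-specific IF-THEN rules can be sketched in a few lines. The signal names, limits, and the two rules below are hypothetical and chosen only for illustration; they are not taken from the cited rule sets.

```python
# Each rule is a (diagnosis text, predicate) pair; the predicate returns True
# when the rule is violated. All names and limits are illustrative assumptions.
RULES = [
    # Limit check on a single signal.
    ("supply temperature out of range",
     lambda s: not (10.0 <= s["supply_temp"] <= 35.0)),
    # Rule tied to a known fault pattern.
    ("simultaneous heating and cooling",
     lambda s: s["heating_valve"] > 0.05 and s["cooling_valve"] > 0.05),
]

def check_rules(sample):
    """Return the diagnosis texts of all rules violated by one sample."""
    return [name for name, violated in RULES if violated(sample)]

ok_sample = {"supply_temp": 18.0, "heating_valve": 0.4, "cooling_valve": 0.0}
bad_sample = {"supply_temp": 18.0, "heating_valve": 0.4, "cooling_valve": 0.3}
ok_result = check_rules(ok_sample)
bad_result = check_rules(bad_sample)
```

Because every rule carries its own description, a violated rule directly yields a diagnosis, which is the property highlighted for rule-based systems.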
Common methods for classification, belonging to category B.ii, are, for example, decision trees (and their ensemble counterparts), support vector machines (SVM), or Bayesian classifiers. 80,91,106 In contrast to methods from category B.i, the training data for these methods has to contain both fault-free and faulty data. This makes the application of such methods difficult, because a fault has to occur in the system before it can be identified by the method. This also implies that no novel faults can be detected. However, if the required training data is available, this type of method performs very well 107 and can be extended to several fault types, which then allows for diagnosis.
Methods for outlier detection can be based on clustering methods or principal component analysis (PCA). These methods are typically trained on fault-free data and identify those data points as outliers (i.e., faulty) which differ significantly from the remaining ones. These methods are assigned to category B.iii. Similar to category B.i, these methods are able to detect novel faults but also tend to generate a relatively high number of false positives. Like those from category B.ii, these methods provide binary outputs and treat all measurements equally as inputs. Applications of methods from this category to fault detection in buildings include PCA, 96,108 clustering methods, 93 or statistical evaluation. 109 Zhao et al. 110 compare a PCA-based method to a method based on support vector data description (SVDD), which is similar to a density-based clustering approach.
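A PCA-based outlier detector of category B.iii can be sketched as follows. The sketch assumes NumPy, uses the squared prediction error (distance from the principal subspace) as the outlier score, and sets the threshold to an empirical quantile of the scores on fault-free data; the quantile, the synthetic signals, and all function names are assumptions made for illustration.

```python
import numpy as np

def squared_prediction_error(Z, P):
    """Squared distance of each standardized sample from the principal subspace."""
    residual = Z - Z @ P @ P.T
    return np.sum(residual ** 2, axis=1)

def fit_pca_detector(X_normal, n_components, quantile=0.99):
    """Fit PCA on fault-free data and derive an empirical score threshold."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0)
    Z = (X_normal - mean) / std
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T
    threshold = np.quantile(squared_prediction_error(Z, P), quantile)
    return {"mean": mean, "std": std, "P": P, "threshold": threshold}

def is_outlier(model, X):
    """Binary result per sample: True means the sample is treated as faulty."""
    Z = (X - model["mean"]) / model["std"]
    return squared_prediction_error(Z, model["P"]) > model["threshold"]

# Synthetic fault-free data: three strongly correlated signals.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
X_normal = np.column_stack([
    t,
    2.0 * t + 0.05 * rng.normal(size=500),
    -t + 0.05 * rng.normal(size=500),
])
model = fit_pca_detector(X_normal, n_components=1)

# First sample respects the learned correlation, second one breaks it.
samples = np.array([[1.0, 2.0, -1.0], [1.0, -2.0, -1.0]])
flags = is_outlier(model, samples).tolist()
```

No faulty data is needed for training, and all signals are treated equally as inputs. With a 0.99 quantile, roughly 1% of the fault-free training samples exceed the threshold by construction, which illustrates the false-positive tendency mentioned above.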

Fault diagnosis
The second step of the whole FDD process, the diagnosis step, comprises diverse approaches and cannot easily be subsumed in a few categories. However, a general principle is that in order to perform fault diagnosis, some kind of information has to be provided about possible faults in the system. Similar to groups A and B above, this information can either be available in the form of (1) explicit expert knowledge or (2) measurement data which is known to belong to a certain faulty operation (implicit expert knowledge).
Explicit expert knowledge can be provided as expert rules, where each rule is related to a certain fault. 84 Violation of a specific rule thus directly gives a corresponding diagnosis. A similar procedure can be applied using fuzzy rule sets, where the respective fuzzy rules are retrieved from simulated faulty operational data. 111 Other approaches employ predefined sign patterns. 63,64,98,112 These patterns associate the directions of deviations of measurements from the expected values, i.e., the signs of the residuals, to certain fault types. Residuals can hereby be generated, for example, via performance indexes from first principles, 83 statistical methods, 63 regression models, 64,112 or neural networks. 98,99
Given implicit expert knowledge about faults, i.e., measurement data that is labeled as faulty, one can train a fault model for each fault, using the provided faulty training data. Fundamentally, such fault models can be based on prediction or outlier detection (compare categories B.i and B.iii). Then, in the case of a detected fault, residuals have to be analyzed for each fault model in order to give a diagnosis. Najafi et al. conducted a study employing a Bayesian network approach and underlying fault models. 113 If a classification model (category B.ii) is generated using the available data corresponding to several fault types, the output of the model directly yields a specific fault type. This approach was reported for decision trees. 80 Often, a combination of two methods is used. For example, Du et al. used a neural network for fault detection and a clustering method for diagnosis. 101 Yan et al. compared different combinations of an autoregressive model for residual generation and SVM and neural networks for fault diagnosis. 103 Multiple simultaneous faults were addressed with an extension of SVM to multiple classes. 114
Because fault diagnosis strongly relies on expert knowledge, it is hard to automate and scale this task. A knowledge base or a database, with training data representing normal and faulty operation for typical system configurations and individual components, would be necessary to train and evaluate methods for fault diagnosis. Many studies use fault data from simulations to train and test FDD methods. There are, however, only a few examples where models trained with simulation data were successfully integrated into real-world applications. 111 This is probably due to the difficulty of calibrating simulation models with normal and faulty operational data and transferring the models designed for one system to another system. 80 Sterling et al. suggested a way out of this dilemma by using qualitative simulation models, which provide a greater degree of scalability. 115
Some approaches assign fault responsibilities to individual sensors. 93,101 This means that an indicator for each sensor used for FDD is calculated, which estimates the probability that this sensor is responsible for a detected fault. From these probabilities, the affected subsystem or even possible fault types can be inferred. Although this does not provide any detailed diagnosis, such methods can be employed without prior knowledge about faults.
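The sign-pattern approach to diagnosis mentioned above can be sketched as a simple table lookup. The fault names, the three-residual layout, the patterns, and the deadbands are purely hypothetical placeholders; real patterns would come from expert knowledge about the specific system.

```python
def sign(value, deadband):
    """Map a residual to +1, -1, or 0 (inside the deadband)."""
    if value > deadband:
        return 1
    if value < -deadband:
        return -1
    return 0

# Hypothetical sign patterns for three residuals; not taken from any cited study.
FAULT_PATTERNS = {
    "fault A": (-1, 1, 0),
    "fault B": (1, 1, -1),
}

def diagnose(residuals, deadbands):
    """Return the observed sign pattern and all fault types matching it."""
    observed = tuple(sign(r, d) for r, d in zip(residuals, deadbands))
    matches = [fault for fault, pattern in FAULT_PATTERNS.items()
               if pattern == observed]
    return observed, matches

observed, matches = diagnose([-0.8, 1.2, 0.05], deadbands=[0.3, 0.3, 0.3])
unknown_pattern, no_match = diagnose([0.8, -1.2, 0.05], deadbands=[0.3, 0.3, 0.3])
```

A pattern that matches no entry indicates a deviation that the predefined knowledge does not cover, which illustrates why diagnosis depends on information about possible faults being supplied beforehand.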

B. Performance evaluation
As described in Sec. III A, extensive literature has been published on different methods for FDD applied to building operation. Nevertheless, it is hard to tell from the reviewed articles which type of method is favorable. This is partly due to the vast diversity of different subsystems in the field. There is also a lack of comparative studies which evaluate the performance of different methods using the same sample data set. Exceptions are an early publication by House et al., 107 where several methods are employed, and a study by Peitsman and Bakker, 102 where autoregressive models and neural network models are compared. The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) launched a research project with the aim of developing tools for the evaluation of FDD methods for chillers. 116 Within this project, experimental data for normal and faulty behavior of a chiller was produced 95 and subsequently used for testing different FDD methods. 83,103,110,112,117,118 Some articles report a comparison of different methods based on the experimental chiller data. 103,110,118 However, no initiative has been taken to create an (open-access) data pool for several HVAC components or whole building systems, similar to the UCI database for machine learning, 119 to train, test, and validate FDD methods in order to make them comparable and foster performance optimizations.
Another point which makes it difficult to assess the performance of many FDD methods described above is the scarcity of articles reporting on performance evaluation measures. To evaluate the applicability of a classification method, the fraction of correctly classified data (or misclassified data) must be analyzed. Typically, this information is given in terms of a contingency table or derived quantities, such as sensitivity, specificity, or accuracy. 120 The complete contingency table is given in some articles. 98,112,114,117,118,121 House et al. provide hit rates, corresponding to the sensitivities of the described methods. 107 Cui and Wang report on the detection rate (i.e., sensitivity) and the false alarm rate (i.e., one minus specificity) of their proposed method. 83 Yan et al. calculate false alarm rates and accuracy, 103 and Du et al. provide information on false alarm rates and missing alarm rates. 82 Despite some exceptions (e.g., Liu et al. 103,121 ), when methods from machine learning are employed, information about the size and nature of training, test, and validation data sets is often missing. Furthermore, most of the reviewed studies use simulation data or data from well-controlled laboratory experiments for testing their FDD methods. Only relatively few articles report applications to real-world data. 69,83,86,104,111,121 This impedes evaluation of the practicability of the proposed methods.
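The quantities derived from a contingency table can be computed in a few lines. The following sketch uses the standard definitions with "faulty" as the positive class; the function name and the example labels are illustrative assumptions.

```python
def evaluate(y_true, y_pred):
    """Contingency-table metrics for binary fault detection (1 = faulty)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),          # detection rate, hit rate
        "specificity": tn / (tn + fp),
        "false_alarm_rate": fp / (fp + tn),     # equals 1 - specificity
        "missed_alarm_rate": fn / (tp + fn),    # equals 1 - sensitivity
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical ground truth and detector output for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
```

Reporting the full table, or at least one measure per class, matters because accuracy alone can look favorable on data sets in which faulty samples are rare.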

C. Conclusion
The overview of past and current research in the field of FDD for building applications reveals some shortcomings and indicates related directions for future research. As already stated by Venkatasubramanian et al., there are a number of key requirements for an FDD method to work properly and to facilitate its implementation in real-world applications. 71 Among these requirements are classification performance, robustness, adaptability, explanation facility, and novelty detection. Although a variety of methods exists, it seems that no approach based on a single method can meet all the requirements. Furthermore, as mentioned above, each method has its inherent drawbacks. It is therefore advisable to invest in the development of hybrid approaches, based on multiple methods which complement one another. 70,72 A relatively new topic in the field is the development of adaptive methods, which adjust to changing conditions and learn from user feedback. 118,217 The user, e.g., the facility manager, can hereby validate and improve an existing FDD method during the course of its application by identifying misclassified data. Such methods can alleviate the well-known difficulties with parametrization of black-box models, 102 labeling of training data, and high false positive rates.
Closely related to FDD is the detection of degradation or, more generally, predictive maintenance.Here, the challenge is to detect a fault before it becomes manifest.In the end, the applied methods are similar, but the difference to FDD is in the broader range where operation is defined to be suboptimal, rather than faulty.
Finally, routines for FDD can be part of a holistic framework (refer to Fig. 1), where detected faults are analyzed and, depending on their diagnoses and impacts, corrective actions are suggested to the technical staff or automatically triggered.To this end, semantic information about the analyzed system can be exploited (see Sec. II).

IV. MODEL PREDICTIVE CONTROL
Section III described methods to achieve a fault-free building operation. Based on a fault-free operation, optimization of the building controls can be addressed. Today, systems for building operation and control typically use conventional controllers such as on/off and PID controllers. During the past three decades, Model Predictive Control (MPC) has been established in industry as a powerful method for dealing with multivariable constrained control problems. 122,123 Recently, the development of MPC has also been intensively investigated for the building sector 124 and has been shown to have the potential to outperform conventional control. The basis for MPC is a simulation model that predicts the evolution of the real controlled system. Using this model together with a reference trajectory r and forecasted disturbances d*, an optimization problem can be solved for a certain prediction horizon. The resulting control sequences u are then used in the real controlled system. The optimization is performed iteratively for a system-dependent control horizon.
By taking measurements and/or estimates of the initial system states x in each iteration into account, the control circuit is closed and uncertainties with regard to the simulation model and the disturbances can be compensated. In the literature, the following benefits of MPC have been demonstrated: [125][126][127][128][129]
• Multiple objectives: Building control can be improved with regard to several pre-defined objectives, which can be taken into account simultaneously.
• Constraints: Easy integration of constraints, both for control and state variables, is possible.
• Multiple Input Multiple Output (MIMO) systems: Interacting subsystems with possibly conflicting objectives can be taken into account simultaneously.
• Inertial systems: Improved operation of systems with thermal or electrical storages can be achieved due to the integration of predictions of the system behavior and the disturbances.
• Command variables: Easy integration of additional signals that affect the control objectives (e.g., time-varying energy prices) is possible.
Nevertheless, drawbacks for the implementation of MPC in real buildings exist:
• Modelling: Effort has to be invested in the creation of suitable simulation models for quantifying the objectives of the controlled system.
• Expertise: Qualified control engineering personnel are required for setting up the MPC.
• Data: The calibration of the model requires a large amount of measurement data from the real system. Furthermore, weather and occupancy forecasts have to be provided.
• Hardware: The implementation in buildings requires additional computation capacities, sensors, controls, etc.
• Usability: The functions and control sequences of an MPC are not as transparent and clear as for conventional controls. Facility managers have to be trained with regard to operation and maintenance of MPC.
• Costs: Due to the above-mentioned points, the investment for setting up an MPC is higher than for conventional controllers.
Recent developments with regard to the availability of high-performance computing and high-quality building operational data have provided favorable conditions for the successful implementation of MPC in building control systems.
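The receding-horizon principle described in this section can be illustrated with a deliberately small sketch. It assumes a hypothetical one-state room model with made-up coefficients, three discrete heater levels, a time-varying energy price as command variable, and exhaustive enumeration in place of a real optimization solver. The plant is simulated with the same model, so model mismatch is not represented.

```python
from itertools import product

LEVELS = (0.0, 0.5, 1.0)  # admissible heater levels (constraint on u)

def step(T, u, T_out, a=0.1, b=2.0):
    """One-step room model: heat loss to outside plus heater gain (assumed)."""
    return T + a * (T_out - T) + b * u

def plan(T0, T_out_forecast, price_forecast, T_min=20.0, comfort_weight=10.0):
    """Enumerate all control sequences over the horizon and return the cheapest."""
    best_cost, best_seq = float("inf"), None
    for seq in product(LEVELS, repeat=len(T_out_forecast)):
        T, cost = T0, 0.0
        for u, T_out, price in zip(seq, T_out_forecast, price_forecast):
            T = step(T, u, T_out)
            # Energy cost plus penalty for falling below the comfort limit.
            cost += price * u + comfort_weight * max(0.0, T_min - T) ** 2
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq

def run_mpc(T0, T_out, price, horizon=4):
    """Apply only the first control of each plan, then re-plan from the new state."""
    T, applied, temps = T0, [], []
    for k in range(len(T_out) - horizon + 1):
        seq = plan(T, T_out[k:k + horizon], price[k:k + horizon])
        u = seq[0]
        T = step(T, u, T_out[k])  # "measured" state fed back to the controller
        applied.append(u)
        temps.append(T)
    return applied, temps

T_out = [5.0] * 12                       # forecasted outdoor temperature (degC)
price = [1.0, 1.0, 0.5, 0.5] * 3         # forecasted energy price (arbitrary units)
applied, temps = run_mpc(20.0, T_out, price)
```

Enumeration scales exponentially with the horizon and is used here only to keep the example free of solver dependencies; the solvers discussed in Sec. IV A would take its place in practice.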

A. Methods
In this section, we discuss specific topics concerning MPC (see Fig. 3): (1) approaches for creating the dynamic system model, (2) methods for solving the optimal control problem, (3) methods for the feedback loop of the controlled system to the controller, and (4) methods for taking the uncertainties of the disturbance predictions, the model, and the state estimation into account.

Controller model
The heart of the MPC framework is constituted by a controller model that predicts the dynamic behavior of the controlled system. In the development of MPC, the creation and calibration of a suitable model represents the most costly task, which is estimated to account for around 70% of the total project costs. 124 One distinguishes three different modeling approaches: white-, black-, and gray-box models. 130
a. White-box models. White-box models rely on physical equations such as mass and energy balances from the domain of thermal and hygrothermal engineering. Many simulation environments for building and HVAC systems exist, such as EnergyPlus, 131 TRNSYS, 132 IDA ICE, 133 or Modelica, 134 which provide physical component models, e.g., for heat pumps and boilers. Nevertheless, expert knowledge is necessary for the parametrization of the models, as specific building properties require many customized parameters. Documents from the planning phase may provide parameter values, but these are often uncertain or not available at all. In order to facilitate a stringent model creation process, recent research has focused on the development of methods for automatic model parametrization based on semantic information provided by a BIM. 135 The advantage of white-box models is that they generally provide high accuracy for a wide range of operating conditions due to the underlying physical equations. However, the white-box modeling approach often leads to a large number of state and algebraic variables as well as strong non-linearities of the model equations, which compromise the usability in an MPC due to the high computational load. Therefore, the time needed for solving the optimization problem in the MPC imposes limits on the allowable sampling time and the dimension of the optimal control problem. White-box modeling approaches have been used for offline control optimization studies [136][137][138][139] but were not intended for implementation in a real building for online optimization.
b. Black-box models. Black-box models are data-driven models in which mathematical equations for the underlying physical systems are identified solely based on available time series data. For the model identification, methods from the domain of statistics and machine learning are applied. Many different model types have been investigated for implementation in MPC: artificial neural networks (ANN), 140,141 state space models via subspace identification, 142,143 linear autoregressive models in the form of ARX (autoregressive models with exogenous inputs) 144 or ARMAX (autoregressive moving average with exogenous inputs), 145 non-linear autoregressive models NARX, 146 and fuzzy black-box models based on local linear models (LLM). 147 Black-box models can provide a high accuracy and computational speed but suffer from a lack of generalization capability, 125 i.e., the models may provide only uncertain predictions of system behavior for states and boundary conditions that the training data did not include. Therefore, the amount and the quality of the data are crucial for an effective model identification process. Furthermore, thorough pre-processing of the available data is one of the most important tasks regarding black-box modeling (see Sec. III). Two different sources for training data can be used: measurement data from real buildings 147 or synthetic data from white-box simulation tools. 143
c. Gray-box models. Gray-box models aim to take advantage of white- and black-box modeling approaches by combining their underlying principles and providing high computational speed that is suitable for implementation in MPC. In particular, thermal resistance-capacity (RC) networks have been investigated for the modeling of buildings and rooms. [148][149][150][151][152] The concept of gray-box models is to use expert knowledge to create a model structure which depends on a certain number of initially free uncertain parameters, which are identified in a subsequent parameter estimation procedure. This approach reduces the search space for model calibration and minimizes the amount of required measurement data. Furthermore, the given model structure is intended to enhance the validity of the models with regard to system states and boundary conditions that are not included in the training data set.
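The gray-box idea of fixing a physical structure and estimating its free parameters can be sketched with the simplest RC network, a single thermal resistance R and capacity C. The sketch assumes NumPy, an explicit Euler discretization, noise-free synthetic data, and made-up parameter values; the cited works use more elaborate network structures and estimation procedures.

```python
import numpy as np

def simulate_rc(T0, T_out, Q, R, C, dt):
    """1R1C room model: C dT/dt = (T_out - T)/R + Q, explicit Euler steps."""
    T = [T0]
    for To, q in zip(T_out, Q):
        T.append(T[-1] + dt / C * ((To - T[-1]) / R + q))
    return np.array(T)

def fit_rc(T, T_out, Q, dt):
    """Estimate R and C by linear least squares on the discretized model.

    T[k+1] - T[k] = a * (T_out[k] - T[k]) + b * Q[k],
    with a = dt / (R * C) and b = dt / C.
    """
    dT = T[1:] - T[:-1]
    A = np.column_stack([T_out - T[:-1], Q])
    (a, b), *_ = np.linalg.lstsq(A, dT, rcond=None)
    return b / a, dt / b  # R, C

# Synthetic "measurements" from assumed true parameters (illustrative values).
R_true, C_true, dt = 0.005, 1.0e7, 3600.0   # K/W, J/K, s
n = 200
rng = np.random.default_rng(1)
T_out = 5.0 + 5.0 * np.sin(2.0 * np.pi * np.arange(n) / 24.0)  # degC
Q = rng.uniform(0.0, 5000.0, size=n)                            # heating power, W
T = simulate_rc(20.0, T_out, Q, R_true, C_true, dt)

R_hat, C_hat = fit_rc(T, T_out, Q, dt)
```

Only two parameters have to be identified, which illustrates how the predefined structure shrinks the search space compared with a purely data-driven model. With real, noisy measurements the estimates would no longer be exact.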
d. Impact of model complexity. A crucial question for the implementation of MPC in buildings is the necessary complexity of the simulation model. The physical processes in buildings are observed to be mostly non-linear. 128 The impact of using linear models, which are computationally advantageous, instead of non-linear models in MPC was investigated in Verhelst et al. 153 for a modulating air-to-water heat pump system. A comparison of two MPC approaches based on a linear and a non-linear model for the coefficient of performance (COP) of the heat pump revealed that the linear MPC led to 7% to 16% higher total energy costs. However, the reduced controller performance was mitigated by adjusting the cost function to penalize power peaks, leading to almost identical energy costs. Another approach to mitigate the effect of using linear MPC was investigated by approximating the model's non-linearities online during controller operation, resulting in a linear time-dependent model of the building. 154 Kruppa et al. analyzed multilinear approximations of non-linear state space models that preserve the convexity properties of linear models and provide higher computational efficiency compared to the non-linear models. 155,156 The impact of the model complexity with regard to the number of state variables was investigated in Ref. 157. For the investigated building, it was shown that the total number of states in the controller model could be reduced from 250 to 30 while the MPC performance stayed the same.

Optimization
The following equations define a general optimal control problem, where E is evaluated at the final state of the system at time t_f and the integral term includes the trajectory L as a function of the system states x and inputs u:

$$
\begin{aligned}
\min_{u(t)} \quad & E(x(t_f)) + \int_{t_0}^{t_f} L(x(t), u(t), t)\, dt, && (1) \\
\text{s.t.} \quad & \dot{x}(t) = f(x, u, t), \quad \forall t \in [t_0, t_f], && (2) \\
& x(t_0) = x_0, && (3) \\
& \text{[terminal constraint on } x(t_f) \text{]}, && (4) \\
& h(x(t), u(t), t) \leq 0, \quad \forall t \in [t_0, t_f]. && (5)
\end{aligned}
$$

The dynamic system model in Eq. (2) and its initial states in Eq. (3) are constraints of the optimal control problem which have to be fulfilled in every time step. Equations (4) and (5) define terminal and path constraints. The model structure greatly affects the performance of optimization algorithms. First, non-linearities in the model may lead to non-convex optimization problems, which impede the convergence to global optima. 158 Second, a high model complexity may increase the time needed for the numerical integration and directly affects the selection of an optimization solver.
a. Optimization solvers.The white-box models prepared with building simulation programs are often highly nonlinear and non-differentiable with respect to the optimization variables.
0][161] The advantage of these approaches is that they allow the use of any simulation model for optimization. The simulation program serves as an (external) cost function evaluator and is iteratively invoked by the optimization solver. There are almost no restrictions on the differentiability of the model or compatibility with optimization solvers. However, the convergence speed of these optimization solvers is low due to the multitude of numerical integrations that are needed.
As an alternative, solvers that have direct access to the model equations and can employ symbolic model manipulations have been investigated. Equation-based languages such as Modelica 162 harness the potential of restructuring the optimal control problem, e.g., by discretization schemes like direct collocation. It has been shown that this approach can increase the convergence speed by factors of 2200 163 and 300 164 compared to simulation-based optimization with PSO and GPS. The performance improvement is mainly achieved by avoiding the need for numerical integration by an external simulation solver. The whole optimal control problem is transformed to a non-linear program (NLP) that can be solved by NLP solvers like IPOPT, 165 utilizing first and second order derivatives of all equations in the NLP.
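To make the transcription step concrete, the following is a minimal sketch of one common direct collocation scheme (trapezoidal), which is not necessarily the scheme used in the cited works. It assumes a uniform grid t_k = t_0 + kΔ with N intervals, the shorthand f_k = f(x_k, u_k, t_k) and L_k = L(x_k, u_k, t_k), a known initial state, and a path constraint written as h ≤ 0; the grid symbols Δ, N, and k are introduced here for illustration only.

```latex
\begin{aligned}
\min_{x_0,\dots,x_N,\; u_0,\dots,u_N} \quad
  & E(x_N) + \sum_{k=0}^{N-1} \frac{\Delta}{2}\,\bigl(L_k + L_{k+1}\bigr) \\
\text{s.t.} \quad
  & x_{k+1} - x_k - \frac{\Delta}{2}\,\bigl(f_k + f_{k+1}\bigr) = 0,
    \quad k = 0,\dots,N-1, \\
  & x_0 = x(t_0), \\
  & h(x_k, u_k, t_k) \le 0, \quad k = 0,\dots,N.
\end{aligned}
```

In this form the dynamics appear as algebraic equality constraints on the grid values, so the NLP solver never calls an external integrator and can exploit derivatives of all equations.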
In other publications, linear models, e.g., linear state space models or linearizations of non-linear models, were used for the MPC. To solve the optimal control problem, commercial software such as CPLEX 166 or algorithms such as sequential quadratic programming (SQP) 167 were employed. [170]
b. MPC architecture. There are different possibilities for integrating MPC into the BAS. One option is to use MPC in a cascaded control by providing optimal set-point trajectories, which are identified by a high-level MPC, to a low-level controller in the form of a conventional PID controller. 171 This can be applied to systems where it is not feasible to directly control active components in the HVAC system, for example, a mixing valve, by the MPC. By this method, the sampling time of the MPC can be increased.
Another option to tackle the high computational burden of an MPC is to use a hierarchical MPC. Here, the optimization problem is divided into two different tasks. A high-level MPC performs the optimization with regard to the slow dynamics of the system and provides its output to a low-level MPC, which takes the fast dynamics into account. ][174][175] With decentralized and distributed MPC, two further approaches to divide the global optimization problem into individual (similar) subsystems have been investigated. Individual optimization problems are solved in a decentralized MPC for local parallel subsystems. For example, Pedersen et al. 176 compared a decentralized MPC for different zones in a multi-apartment building with a centralized MPC that took all zones into account at once. They showed that the decentralized approach leads to only insignificant performance reductions as long as the interaction between the zones is weak. In distributed MPC, these interactions are also taken into account. 8][179] Both of these approaches offer the advantage of a significantly reduced computational burden for the solution of the local optimization problems compared to the centralized problem. Furthermore, the functionality of the MPC may still be secured if one of the decentralized or distributed controllers fails.
c. Offline MPC. Methods have been investigated that dispense completely with the need for dynamic optimizations during building operation. The concept is to learn optimized control logic from offline optimization studies. This decouples the computational burden from any time restrictions of the real operation. For the offline optimizations, typically detailed white-box models have been used. Furthermore, the same models have been used for both the controller and the emulator model of the controlled process. The obtained time series from the simulations for the system states and variables were then analyzed, aiming at the extraction and deduction of near-optimal control rules and guidelines. Moreover, the implementation of these rules in real building automation systems is intended to be realized with low effort. 1][182] Coffey et al. 183 identified lookup tables that provide controller setpoints with respect to boundary conditions like weather and occupancy. In the concept of explicit MPC, multi-parametric studies are precomputed offline for all initial states and boundary conditions of the building. 184 For offline optimizations, it is crucial to cover the whole state space with all possible boundary conditions and state transitions to achieve high controller performance. [182][183]

State estimation
In conventional controllers (e.g., PID controllers), feedback of the controlled variables is necessary and typically provided by direct measurements. This may be, for example, a room temperature in the case of a room temperature control. In MPC, which uses a dynamic system model for the control, the initialization of the models requires a possibly large vector of state variables. In reality, not all system states that are necessary for the model can be measured directly. Furthermore, the measured states are affected by measurement noise and uncertainties. Therefore, state estimation techniques are required that provide estimations of model states based on a limited number of measured states. The initial states greatly affect the feasibility of the optimizations performed as well as the identified optimal control trajectories. 185 The most commonly used method for the state estimation is a Kalman filter. 166,186 The original Kalman filter is limited to linear systems. For non-linear systems, extensions like the extended Kalman filter and the unscented Kalman filter exist. All Kalman filters work in two steps: in the prediction step, the states are estimated, and as soon as a new measurement of the real system is provided, a correction step is performed, taking the deviation of the prediction from the measurement into account to update the model.
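The two steps of the Kalman filter can be written out for the simplest case. The sketch below assumes a scalar linear model x[k+1] = a·x[k] + b·u[k] with process noise variance q, and a direct measurement of the state with noise variance r; the parameter values and measurements are made up for illustration.

```python
def kalman_step(x_est, p_est, u, z, a, b, q, r):
    """One prediction and one correction step of a scalar Kalman filter."""
    # Prediction step: propagate the estimate and its variance through the model.
    x_pred = a * x_est + b * u
    p_pred = a * p_est * a + q
    # Correction step: weight the deviation between measurement and prediction.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new

# Hand-checkable single step: equal trust in prediction and measurement.
x1, p1 = kalman_step(x_est=0.0, p_est=1.0, u=0.0, z=2.0, a=1.0, b=0.0, q=0.0, r=1.0)

# Hypothetical noisy measurements of a room temperature close to 21 degC,
# starting from a poor initial guess with large variance.
x, p = 18.0, 10.0
for z in [21.3, 20.8, 21.1, 20.9, 21.0]:
    x, p = kalman_step(x, p, u=0.0, z=z, a=1.0, b=0.0, q=0.01, r=0.25)
```

The estimate moves quickly toward the measurements while its variance is large and reacts more cautiously once the variance has shrunk. For a model with several states, the scalars become matrices and the gain requires a matrix inversion.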
Another concept is the use of an optimization-based estimation. 187,188 Here, a dynamic simulation model, possibly the controller model, and measurements of the states of the investigated system for a past period are used. Based on these measurements, optimization techniques are utilized to obtain the system states at the current time step. The approach is deterministic, since the dynamic model is considered to be accurate. In an extended version called moving horizon estimation (MHE), unmodeled disturbances are integrated into the objective function and the system model to take the mismatch between the real system and the model into account. The concepts involving dynamic optimization with a system model are only feasible for systems with slow dynamics, due to the required simulation time.
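A minimal Python sketch of the deterministic, optimization-based estimation is given below. It assumes a scalar linear model x[k+1] = a*x[k] + b*u[k] that is taken to be exact, and noisy measurements of the state over a past window. The initial state of the window is chosen to minimize the sum of squared deviations between simulated and measured states; for this simple model the minimizer has a closed form, so no numerical solver is needed. The function name and the model are illustrative assumptions, and the disturbance terms of MHE are deliberately left out.

```python
def estimate_current_state(measurements, inputs, a, b):
    """Least-squares estimate of the state at the latest measurement.

    measurements: y[0..N], measured states over the past window
    inputs:       u[0..N-1], inputs applied over the same window
    a, b:         assumed exact model x[k+1] = a*x[k] + b*u[k]
    """
    if len(inputs) != len(measurements) - 1:
        raise ValueError("expected one input fewer than measurements")

    # forced[k]: simulated state at step k for a window starting at x0 = 0.
    forced = [0.0]
    for u in inputs:
        forced.append(a * forced[-1] + b * u)

    # free[k] = a**k: sensitivity of the state at step k to the initial state.
    free = [a ** k for k in range(len(measurements))]

    # Minimize sum_k (y[k] - free[k]*x0 - forced[k])**2 with respect to x0.
    numerator = sum(f * (y - c) for f, y, c in zip(free, measurements, forced))
    denominator = sum(f * f for f in free)
    x0 = numerator / denominator

    # Simulate forward from the optimal initial state to the current step.
    return free[-1] * x0 + forced[-1]
```

For non-linear models or additional constraints, the closed-form step would be replaced by a numerical optimization, which is where the simulation time mentioned above becomes the limiting factor.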
Uncertainties in MPC arise from the dynamic system model, the estimation of the initial states and the predictions of disturbances. The dynamic system model suffers from inaccuracy in the prediction of the real system behavior due to simplifications that have to be made in the modeling process. In most of the reviewed publications, the model parameters are calibrated once in the development phase before the MPC is implemented in the BAS. In some publications, 186,[189][190][191] online updates of the model parameters are employed in order to take changes in the building such as seasonal variations, altered occupancy behavior or upgraded equipment into account. Maasoumy et al. 186 compared nominal MPC and so-called robust MPC (RMPC) in which the building model parameters are adapted online. They showed that the RMPC outperforms nominal MPC beyond a certain level of model uncertainty. Furthermore, there is a level of model uncertainty beyond which the performance of RMPC is even worse than that of conventional rule-based control.
The impact of the uncertainty of state estimations on the performance of MPC in buildings has not been widely studied. Antonov et al. 192 investigated the robustness of MPC for a hybrid ground-coupled heat pump system. The applied method yields a reliable estimate of the maximum admissible state estimation uncertainty beyond which the system performance decreases.
The predictions of disturbances are associated with uncertainties with regard to weather, occupancy, and other boundary conditions such as time-varying energy prices or thermal load predictions that the controller model may use. ][196][197] The results showed that stochastic MPC (SMPC) outperformed deterministic MPC with regard to thermal comfort violations. In terms of energy usage, the SMPC may lead to more conservative but more reliable results. The amount and quality of available input data for weather, occupancy and other boundary conditions greatly influence the performance of SMPC.

B. Performance evaluation
Model predictive control has been investigated for many different HVAC systems and buildings. Most publications focused on commercial buildings, while residential buildings played only a minor role. The objectives of the control optimization varied widely but mainly focused on energy consumption reductions, improved indoor comfort, and cost savings. The performance improvements that MPC could achieve depended mainly on the system type and configuration. In the following, examples for the application of MPC in specific HVAC systems for the optimal control of thermal and hydraulic actuators are presented. For concrete numbers on the specific performance quantification, the reader is referred to the original publications.
Investigations have focused on different strategies for the achievement of energy consumption reductions in buildings. Many publications aim at the optimal management of thermal and electrical storage in buildings, for example, active thermal buffer storages, 198 slow-response radiant slabs, 137,170,174 or passive building thermal mass. 199,200 Other publications focus on the improvement of the energy efficiency of individual components for the energy transformation such as AHU 145,146 or heat pumps. 153 The increase in the building energy efficiency by the integration of renewable energy sources was investigated, for example, for hybrid free cooling systems, 161,201 solar absorption chillers, 202 or photovoltaic thermal hybrid solar collectors. 143 In buildings, HVAC systems may comprise several redundant heat generation systems whose efficiencies depend on the boundary conditions. By means of MPC, De Coninck et al. 187 identified optimal schedules that operate each system when its energy efficiency is highest. Furthermore, MPC can harness information about occupancy in order to reduce the energy demand in non-occupied times. 194

Improved indoor thermal comfort is a second major objective of MPC in buildings, while the formulation of the optimal control problem varies between approaches. On the one hand, a term for the quantification of comfort violations is integrated in the objective function together with the energy consumption, each associated with a weighting factor. Then, the variation of the weighting factors enables the identification of Pareto fronts. 159 On the other hand, soft or hard constraints in the optimal control problem formulation associated with certain room temperature bounds allow accounting for the thermal comfort. 203

Cost savings via MPC are investigated by taking prices for the consumed fuels or electricity into account. Many recent publications consider not only individual buildings and their HVAC systems but also their interaction with the energy grid. This was done by using boundary signals from the grid, for example, time-of-use (TOU) tariffs or stock market prices. Then, demand-side management used active and passive storage in the building to shift load from high-price to low-price periods. 142,144,204,205

Most of the studies with regard to MPC in buildings used simulations for emulating the controlled systems and to quantify the impact of the optimal control. Only a few publications presented real implementations and demonstrations of MPC in existing buildings: reductions of energy costs were reported by shifting the building energy demand to off-peak hours with reduced electricity prices.
• Ma et al. 207 demonstrated an MPC for a cooling system in a university building in Merced, California. The COP of the cooling systems could be increased by 19%.
• Siroky et al. 171 tested an MPC for a university building in Prague, Czech Republic for the control optimization of a heating system with radiant ceilings. The MPC achieved energy savings between 15% and 28%.
• Killian et al. 208 implemented an MPC for a university building with a total floor area of 13,000 m² in Salzburg, Austria. They focused on the operation optimization of TABS. In doing so, reduced variances of room temperatures and energy savings between 31% and 36% could be achieved.
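The load shifting from high-price to low-price periods described in this section can be illustrated with a deliberately simplified Python sketch. It is not an MPC and not the method of any cited study: it assumes a fully shiftable energy demand, an ideal storage without losses, a fixed hourly capacity limit, and hypothetical tariff values. Under these assumptions, filling the cheapest hours first minimizes the energy cost.

```python
def schedule_shiftable_load(prices, total_energy, max_per_hour):
    """Distribute a shiftable energy demand over the cheapest hours.

    prices:        assumed price per kWh for each hour of the horizon
    total_energy:  energy in kWh that must be delivered within the horizon
    max_per_hour:  assumed capacity limit in kWh per hour
    """
    if total_energy > max_per_hour * len(prices):
        raise ValueError("demand exceeds the capacity over the horizon")

    schedule = [0.0] * len(prices)
    remaining = total_energy
    # Visit the hours in order of increasing price and fill each to capacity.
    for hour in sorted(range(len(prices)), key=lambda h: prices[h]):
        if remaining <= 0:
            break
        energy = min(max_per_hour, remaining)
        schedule[hour] = energy
        remaining -= energy
    return schedule


def energy_cost(prices, schedule):
    """Total cost of a schedule under the given hourly prices."""
    return sum(p * e for p, e in zip(prices, schedule))


if __name__ == "__main__":
    tou_prices = [0.30, 0.10, 0.20, 0.40]  # hypothetical TOU tariff
    plan = schedule_shiftable_load(tou_prices, total_energy=5.0, max_per_hour=3.0)
    print(plan, energy_cost(tou_prices, plan))
```

A real MPC additionally accounts for storage losses, comfort bounds, and the building dynamics, which couple the hours of the horizon and prevent such a simple greedy rule from being optimal.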

C. Conclusion
This section gave an overview of model predictive control for the optimization of building operation. Crucial methods, such as modeling and optimization techniques as well as the handling of state estimation and uncertainties, were presented taking recent research into account. Advantages and drawbacks for the realization of MPC in buildings were discussed. In the past 10 years, MPC in buildings has gained a lot of attention in the scientific community. A vast number of publications show significant improvements with regard to costs, energy consumption, indoor environment, and carbon dioxide emissions in buildings, which were achieved through MPC.
Nevertheless, MPC is not yet established as a standard method for operation and control in buildings. The trade-off between the required effort, expertise, and hardware and the achieved benefits is not yet satisfactory. For a successful deployment in the market, tools and frameworks that facilitate the setup of MPC are required.
First, the modeling effort has to be reduced. Methods for the creation of dynamic system models should make use of the upcoming trend of digitization in the building sector and utilize information from BIM or similar data sources, which are generated during the planning phase of buildings. The usage of this information can support the automated model development and thus lead to productivity gains. Furthermore, other applications, such as FDD, can advantageously use this common information basis.
Second, methods that facilitate the coupling of the models with appropriate optimization solvers for the specific optimal control problems need to be developed. Therefore, the expertise of control engineers with regard to efficient algorithms for the solution of optimal control problems needs to be combined with that of building engineers who understand the physical processes in building energy systems and formulate concepts for the operation optimization.
Finally, MPC provides the capability to manage the whole energy operation of buildings efficiently. Furthermore, it is a suitable method for tackling the challenges of the future energy system, such as the integration of fluctuating renewable energies and the sector coupling issues, e.g., between electric mobility and building energy demand.

V. CONCLUSION AND OUTLOOK
In this article, we provided an overview of the different steps required for building performance optimization and reviewed selected parts of the whole process that are scientifically most relevant (see Fig. 1). This study reveals that the large amount of research activity in the field has led to significant advances in the digitalization of buildings and in the supervision and control of their operation. Current observations indicate that solutions including automated information capture and model generation for existing buildings, predictive control, and web-based IoT platforms capable of implementing these analytics are emerging. These developments can lead to productivity increases in the AEC and FM sectors. Furthermore, these advancements have the potential to minimize the environmental impact of buildings and thus to contribute to the reduction of global greenhouse gas emissions. Nevertheless, issues and open research questions remain.
First, top-level data analysis and metadata handling for large systems are hampered by the complexity and the amount of heterogeneous data generated during the whole life cycle of buildings. The integration of additional communication, control and supervision capabilities in decentralized components, and thus direct analysis within these components and data exchange between sensors, actuators and controllers, might alleviate this issue. 209,210][213][214] Furthermore, highly automated routines which possibly rely on confidential data naturally raise questions of data security and responsibilities. Some promising developments are hindered or lack real-world applications due to reservations or missing legal coverage of this topic in the AEC and FM sectors. A quite novel approach to tackle related issues is so-called smart contracts, which are based on blockchain technology. While some people speak very enthusiastically about these new developments and expect cooperation and trust to be implemented via blockchain technology, 215 others claim that these technologies cannot resolve the typical lack of organization and collaboration in the building and construction sector. 216

TABLE I. Classification scheme for fault detection methods.

Fault detection based on | A: Expert knowledge | B: (Labeled) measurement data
i: Prediction | Physical models | E.g., regression models, neural networks (NN), and qualitative models
ii: Classification | IF-THEN rules | E.g., decision trees, support vector machines (SVM), Bayes classifiers, and logistic regression
iii: Outlier detection | IF-THEN rules | E.g., density-based clustering methods and principal component analysis (PCA)

• De Coninck et al. 187 demonstrated an MPC for an office building in Brussels, Belgium with two floors and 480 m² floor area. The MPC achieved energy consumption reductions of more than 30% by shifting operation time from boilers to heat pumps.
• West et al. 189 realized an MPC for two office buildings in Melbourne, Australia, and Newcastle, England for a period of 51 days. The focus was on the operation of air handling units. Energy savings of 19% and 32% were documented.
• Vana et al. 172 implemented and tested an MPC in a large office building with 5 floors and 1500 m² floor area in Hasselt, Belgium. The heating is provided by thermoactive building systems (TABS). Energy savings of about 17% could be achieved.
• Ma et al. 206 tested an MPC for an air handling unit in an office building in Milwaukee, Wisconsin. Significant

041501-13 Benndorf, Wystrcil, and Réhault, Appl. Phys. Rev. 5, 041501 (2018)