Research Paper | An In-Depth Investigation of Distance Measures in Building Informatics

2022-04-12 20:57  



Distance measures in building informatics: An in-depth assessment through typical tasks in building energy management

 

Ao Li (李奥), Cheng Fan (范成), Fu Xiao (肖赋), Zhijie Chen (陈志杰)
Energy and Buildings
Volume 258, 2022



Abstract:

Distance measurement (also known as similarity measurement) is used to evaluate pairwise similarities between data samples. It has been widely used in diverse building informatics research and applications to classify or cluster massive building data with the aim of improving prediction accuracy, identifying operation patterns, benchmarking and diagnosing building performance, etc. Various distance measures have been adopted to measure the distance/similarity of building data. However, the intrinsic complexity and diversity of building operational data bring considerable difficulties to the selection of a suitable distance measure for a specific task. There is a strong and urgent need for a comprehensive review and systematic comparison of existing distance measures in building informatics. This study provides a comprehensive review of various distance measures and their applications in building operational data analysis. A systematic comparison is undertaken based on two typical tasks relying on building informatics, i.e., building energy usage pattern recognition, and clustering-based weather data segmentation for the customized development of building energy prediction models. Nine widely adopted distance measures have been reviewed and compared, including Euclidean distance, Chebyshev distance, Manhattan distance, Mahalanobis distance, Hausdorff distance, Pearson correlation distance, Dynamic Time Warping, Edit distance on Real Sequence, and Cosine distance. Novel internal and external clustering validation approaches based on the cross-test and prediction accuracy are proposed and adopted to compare the clustering performance. The results in case studies showed that weather data clustering using the Cosine distance and Pearson correlation distance helps to obtain better energy prediction results in terms of MAPE (13.22% and 12.91%, respectively) than the commonly-used Euclidean distance (13.99%). 
The results also revealed that better clustering performance does not necessarily lead to higher prediction accuracy. The research results and insights obtained are valuable to guide distance-based research in building informatics.
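To make the difference between some of the compared measures concrete, here is a minimal Python sketch of six of the nine distances named in the abstract (Euclidean, Manhattan, Chebyshev, Cosine, Pearson correlation, and Dynamic Time Warping). It is not the authors' implementation: the function names and the three load profiles are hypothetical, chosen only to show how magnitude-sensitive and shape-sensitive measures can disagree on the same data.

```python
import math


def euclidean(a, b):
    """Straight-line distance: sensitive to differences in magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute point-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest single point-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: ignores overall scale, compares direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pearson_distance(a, b):
    """1 - Pearson correlation: cosine distance of the mean-centered series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_distance([x - mean_a for x in a], [y - mean_b for y in b])


def dtw(a, b):
    """Basic Dynamic Time Warping with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# Hypothetical load profiles (kW), for illustration only.
profile_a = [20.0, 22.0, 35.0, 50.0, 48.0, 30.0]
profile_b = [2.0 * x for x in profile_a]             # same shape, double magnitude
profile_c = [20.0, 20.0, 22.0, 35.0, 50.0, 48.0]     # profile_a delayed by one step

if __name__ == "__main__":
    for name, fn in [
        ("Euclidean", euclidean),
        ("Manhattan", manhattan),
        ("Chebyshev", chebyshev),
        ("Cosine", cosine_distance),
        ("Pearson", pearson_distance),
        ("DTW", dtw),
    ]:
        print(f"{name:10s} a-b: {fn(profile_a, profile_b):8.3f}   "
              f"a-c: {fn(profile_a, profile_c):8.3f}")
```

In this toy setup, the Cosine and Pearson correlation distances treat profile_a and profile_b as essentially identical because they differ only in scale, while the Euclidean, Manhattan, and Chebyshev distances report them as far apart. DTW, in turn, tolerates the one-step delay in profile_c better than a point-wise measure such as Manhattan distance does.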


Keywords:

Distance measures; clustering; pattern recognition; time series analysis






Editor | Cloris



The Hong Kong Polytechnic University | Building Energy and Automation Research Laboratory