暖通空调 (Heating Ventilating & Air Conditioning), 2019, Issue 2

Data processing methods for residential water and gas use data in the northern cold zone based on data mining technology

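Zhou Hao[1], Lin Borong[1], Zhang Zhongchen[1], Qi Jianqiang[2], Zheng Lihong[2], Chang Chenchen[3]
[1] Tsinghua University [2] Tianjin Eco-City Green Building Research Institute Co., Ltd. (天津生态城绿色建筑研究院有限公司) [3] China Architecture Design & Research Group (中国建筑设计研究院有限公司)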

Abstract:

A city-scale residential water and gas data system holds more data than can practically be processed and analyzed by hand, so data mining techniques are usually required. Based on 2 years of water and gas use data from households in three residential communities in an urban district of Tianjin, this paper applies data mining methods such as data standardization, proximity-based outlier detection, and boxplots to detect abnormal energy use and abnormal changes in energy use between adjacent months, and compares the overall energy use levels of the three communities. Combined with questionnaire survey data, it proposes a method that uses information gain theory and the C4.5 decision tree algorithm to relate occupants' energy use levels to their household characteristics and energy-related behaviors. The work demonstrates a process for extracting useful information from building energy use data and is expected to offer new ideas for building and applying building energy consumption data management platforms.
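The standardization and boxplot steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so everything below is an assumption: the function names `zscore` and `boxplot_outliers`, the conventional 1.5 × IQR fence, and the monthly values, which are hypothetical and not taken from the study.

```python
import statistics


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def boxplot_outliers(values, k=1.5):
    """Return indices of values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical monthly water use (m^3) for one household; not data from the paper.
monthly_use = [8.0, 9.0, 7.5, 8.5, 9.5, 8.0, 30.0, 8.5, 9.0, 7.0, 8.0, 9.0]

# Months whose use level is itself abnormal.
flagged_months = boxplot_outliers(monthly_use)

# Month-to-month changes; change i is the step from month i to month i + 1.
changes = [b - a for a, b in zip(monthly_use, monthly_use[1:])]
flagged_changes = boxplot_outliers(changes)

# Standardized values make households with different baselines comparable.
standardized = zscore(monthly_use)
```

Applying the same fence to the month-to-month differences is one plausible way to flag abnormal adjacent-month changes; the paper may define that check differently.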

Keywords: data mining, outlier detection, boxplot, information gain ratio (IGR), C4.5 decision tree
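For readers unfamiliar with the information gain ratio, the quantity C4.5 uses to rank candidate split attributes, a short sketch follows. It shows only the standard definition (information gain divided by split information), not the authors' implementation. The attribute names and survey values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical questionnaire answers and use levels; not data from the paper.
use_level = ["low", "low", "high", "high"]
household_size = ["small", "small", "large", "large"]  # separates the levels fully
floor_parity = ["odd", "even", "odd", "even"]          # unrelated to the levels

ratio_size = gain_ratio(household_size, use_level)
ratio_parity = gain_ratio(floor_parity, use_level)
```

In this toy case the first attribute separates the two use levels perfectly and gets the maximum ratio, while the second carries no information and gets zero, so a C4.5-style tree would split on the first.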
