Abnormal Data Region Discrimination and Cross-Monitoring Points Historical Correlation Repair of Water Intake Data

Big Data. 2019 Jun;7(2):99-113. doi: 10.1089/big.2018.0148. Epub 2019 May 10.

Abstract

To address abnormal values in water intake monitoring data and in centrally uploaded reports, the abnormal data region discrimination (ADRD) algorithm and the cross-monitoring points historical correlation repair (CMHCR) method are proposed to discriminate and repair the abnormal data, respectively. The distribution characteristics of the abnormal data are analyzed, and the ADRD algorithm is proposed on that basis. ADRD uses two features to identify an abnormal data region: the relationship between 0 values and an abnormally large value, and the ratio of the abnormally large value to the expectation. The correlation between the current data of a monitoring point and the historical data of different monitoring points is then analyzed. The results show that the current data of a monitoring point are not always most strongly correlated with the historical data of that same point. The CMHCR method is therefore proposed to repair abnormal data. Experiments using ADRD are performed on actual half-year water intake data from 2016 and 2017. The experimental results show that the proposed algorithm and method correctly identify the abnormal data regions and repair the abnormal data properly.
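
The abstract does not give the details of either procedure, so the following Python sketch is only an illustration of the two ideas under stated assumptions, not the published ADRD or CMHCR. It assumes that an abnormal region is a run of 0 values immediately followed by one value whose ratio to the expectation reaches a threshold, that the expectation can be estimated by the median of the non-zero readings, and that repair copies a rescaled segment from whichever historical series correlates best with the current normal readings. The function names, the `ratio_threshold` parameter, and the mean-ratio rescaling are all hypothetical.

```python
from statistics import mean, median


def find_abnormal_regions(series, ratio_threshold=3.0):
    """Return (start, end) index pairs, end exclusive.

    Assumed rule: a run of zeros followed directly by a value whose ratio
    to the expectation is at least ratio_threshold forms one region.
    """
    nonzero = [v for v in series if v > 0]
    if not nonzero:
        return []
    expected = median(nonzero)  # assumed estimate of the expectation
    regions, i, n = [], 0, len(series)
    while i < n:
        if series[i] == 0:
            start = i
            while i < n and series[i] == 0:
                i += 1
            # i now points at the first non-zero value after the run, if any
            if i < n and series[i] / expected >= ratio_threshold:
                regions.append((start, i + 1))
                i += 1
        else:
            i += 1
    return regions


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def repair(series, regions, history):
    """Fill abnormal regions from the best-correlated historical series.

    history maps a monitoring point name to a historical series aligned
    index by index with `series` (an assumption of this sketch).
    Returns (name of the chosen point, repaired copy of the series).
    """
    abnormal = {i for start, end in regions for i in range(start, end)}
    normal_idx = [i for i in range(len(series)) if i not in abnormal]
    current = [series[i] for i in normal_idx]
    # Choose the historical series most correlated with the normal readings.
    best = max(
        history,
        key=lambda name: pearson(current, [history[name][i] for i in normal_idx]),
    )
    reference = history[best]
    # Hypothetical rescaling: match mean levels over the normal readings.
    scale = mean(current) / mean([reference[i] for i in normal_idx])
    repaired = list(series)
    for i in abnormal:
        repaired[i] = reference[i] * scale
    return best, repaired
```

In this sketch the historical series of the same monitoring point is just one candidate among the others in `history`, which reflects the finding that it is not necessarily the best-correlated one.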

Keywords: abnormal data region; data distribution; data repair; historical correlation; statistical characteristics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Big Data*
  • China
  • Water Resources*