Estimating potential evapotranspiration based on self-optimizing nearest neighbor algorithms: a case study in arid-semiarid environments, Northwest of China

Environ Sci Pollut Res Int. 2020 Oct;27(30):37176-37187. doi: 10.1007/s11356-019-06597-7. Epub 2019 Oct 25.

Abstract

Changes in potential evapotranspiration (ET0) affect the ecology and environment of the land surface, so estimating it accurately and quickly helps in analyzing environmental change. This study combines canonical correlation analysis (CCA) with the k-nearest neighbor algorithm (k-NN) to propose CCA-k-NN, a new method for calculating ET0 based on a self-optimizing nearest neighbor algorithm that requires fewer meteorological data. The method was constructed by analyzing the basic principles of CCA and k-NN against the requirements of ET0 estimation, and its basic principles and key steps are described. CCA is first used to find the meteorological data most relevant to ET0, which reduces the dimensionality of the data used in the subsequent estimation; k-NN is then used to estimate ET0. Northwest China was chosen as the study area to evaluate the applicability of the method. The 148 data stations in the region were divided into training, testing, and validation datasets. ET0 was estimated on all three datasets using the proposed method, and the estimation accuracy of CCA-k-NN was evaluated against FAO-56 Penman-Monteith as the reference. The results show that CCA-k-NN maintains a high correlation with FAO-56 Penman-Monteith (correlation coefficient greater than 0.9) and has good estimation accuracy: RMSE and MAE are both less than 1 mm day⁻¹, and NSCE is greater than 0.5 overall, all of which reach the level of "applicable" or above. The CCA-k-NN method also has a low time complexity of O(n). Comparison with other empirical models showed that the CCA-k-NN method is more accurate and can be employed successfully to estimate ET0.
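The two-stage idea in the abstract (reduce the meteorological inputs, then estimate ET0 from nearest neighbors and score the result) can be illustrated with a minimal sketch. It is not the authors' implementation: the CCA step is replaced here by a simple ranking of predictors by absolute Pearson correlation with the reference ET0, the self-optimizing choice of k is not shown, and NSCE is assumed to denote the Nash-Sutcliffe efficiency coefficient. All function names and the toy numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(X, y, n_keep):
    """Keep the n_keep columns most correlated with y.

    Assumption: this correlation ranking is a simplified stand-in for CCA.
    """
    n_cols = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: scores[j], reverse=True)
    return ranked[:n_keep]


def knn_predict(X_train, y_train, x, k):
    """Estimate the target at x as the mean target of its k nearest training rows."""
    nearest = sorted((math.dist(row, x), t) for row, t in zip(X_train, y_train))
    return sum(t for _, t in nearest[:k]) / k


def rmse(obs, est):
    """Root mean square error."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def mae(obs, est):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)


def nse(obs, est):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - e) ** 2 for o, e in zip(obs, est))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Hypothetical toy data: two made-up predictors and a made-up reference ET0 (mm/day).
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 2.0]]
y = [1.0, 2.0, 3.0, 4.0]

cols = select_features(X, y, n_keep=1)
X_red = [[row[j] for j in cols] for row in X]

# Leave-one-out estimates: each row is predicted from all the other rows.
est = [
    knn_predict(X_red[:i] + X_red[i + 1:], y[:i] + y[i + 1:], X_red[i], k=1)
    for i in range(len(X_red))
]
print("kept columns:", cols)
print("RMSE:", rmse(y, est), "MAE:", mae(y, est), "NSE:", nse(y, est))
```

In a real evaluation the estimates would be compared with FAO-56 Penman-Monteith values at each station rather than with toy numbers, and the thresholds quoted in the abstract (correlation above 0.9, RMSE and MAE below 1 mm day⁻¹, NSCE above 0.5) would be checked on the testing and validation sets.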

Keywords: Arid; Canonical correlation analysis; Limited meteorological data; Northwest China; Potential evapotranspiration; k-Nearest neighbor algorithm; semiarid environments.

MeSH terms

  • Algorithms
  • China
  • Crops, Agricultural
  • Meteorology*
  • Plant Transpiration*