2024, 22(2):388-398.
Abstract:
With China's extensive water transfer projects underway, attention has shifted to optimizing their operation, which makes pumping station efficiency an important research topic. The efficiency characteristic curve, derived from measured head-flow data, plays a crucial role in optimizing station operation. Over long-term operation, however, actual efficiency departs from the theoretical curve because of design inaccuracies, mechanical losses, fluid friction, operational errors, and improper maintenance. Unit 4 at the Pizhou station is one example: its actual efficiency differs substantially from the theoretical value, so precise efficiency simulation models are needed for optimization schemes to match the actual optimal state. Recent work has applied machine learning algorithms such as polynomial regression, Gaussian process regression, and neural networks to hydraulic forecasting and simulation, which makes them promising for pumping station efficiency simulation. This study therefore uses artificial intelligence techniques to investigate pumping station efficiency simulation, focusing on a representative station of the Eastern Route of the South-to-North Water Transfers Project.
Simulating the efficiency of pumping units requires careful preprocessing of operational data and an appropriate choice of modeling technique. Preprocessing consists of aligning time-stamped measurements, grouping the data into time windows, and handling anomalies to ensure data quality. Influencing factors such as flow rate, water levels, and blade angle are then examined as candidate model inputs. Traditional methods (polynomial regression and multivariate linear regression) are compared with decision regression trees, support vector regression, Gaussian process regression, and neural networks. Each method has its own advantages, such as the interpretability of decision trees and the flexibility of neural networks. The models are trained with careful parameter selection and validated using established metrics such as root mean squared error and the coefficient of determination. The implementations use Python and MATLAB, both of which provide libraries and functions for regression tasks.
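The validation metrics mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `regression_metrics`, the dictionary keys, and the sample efficiency values are hypothetical. Root mean squared error and the coefficient of determination are named in the text; mean and maximum absolute error are included because the conclusion quotes average and maximum absolute errors.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return common regression metrics for two equal-length sequences.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)                # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "max_abs": max(abs(e) for e in errors),
        # R^2 is undefined when the measured values are all identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Made-up measured and simulated efficiencies (%), for demonstration only.
measured = [70.0, 72.0, 74.0, 76.0]
simulated = [71.0, 72.0, 73.0, 76.0]
print(regression_metrics(measured, simulated))
```

Reporting the maximum absolute error alongside the averages helps expose a model that fits well on average but misses badly at a few operating points.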
Averaged over the eight pumping units (four at the Pizhou station and four at the Suining station), the GPR (Gaussian process regression) models with three different kernel functions (RQ, SE, E) show the best overall performance in simulating efficiency. The indicator shows the three GPR models at around 0.34 to 0.36, while ANN, DNN, and MLR are slightly above 0.5 and the other models perform worse. On the R² indicator, the DRT and SVM models score approximately between 0.7 and 0.8, and all other models score above 0.9. On the EMI indicator, the traditional polynomials (2nd PR and 3rd PR) perform worst, the other models are within approximately 5, and the three GPR models range from 3.2 to 3.5, showing better performance. Taking the five metrics ERMS, EMA, R², EMS, and EMI together, the GPR models demonstrate the best overall performance among the traditional and machine learning methods in the comprehensive testing. In the comparison of input features, incorporating the station's upstream and downstream water levels alongside the traditional head yielded better results and improved model performance on several metrics. The effect is particularly evident in the GPR models, and the approach addresses non-linear relationships and potential error sources in head calculations. Using the station water levels directly improves model accuracy and allows a more intuitive analysis of the influencing factors.
In conclusion, of the ten regression models analyzed for pump efficiency simulation, the GPR model was the most effective and outperformed the traditional polynomial methods. On the training datasets of eight pump units at two stations, it showed markedly lower errors than the other models across the evaluation indicators. Substituting station water levels for the traditional head as input features further improved model accuracy, most noticeably for the GPR models. For instance, GPR efficiency simulation at one pump unit gave average and maximum absolute errors within 0.50% and 3.20%, respectively, and using water levels instead of head reduced these errors to 0.41% and 2.30%. This is a substantial improvement over current methods and provides the precise efficiency simulation needed to optimize pumping station operation.
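The feature substitution described above can be sketched with a small Gaussian process regression written in NumPy. Everything here is an assumption made for illustration: the data are synthetic, the variable names (`z_up`, `z_down`, `blade_angle`) and value ranges are invented, head is approximated as the difference between the two station levels, only the SE kernel is shown, and the kernel hyperparameters are fixed rather than optimised. It is not the study's implementation and the printed numbers say nothing about the real stations.

```python
import numpy as np


def se_kernel(A, B, length_scale=1.0):
    """Squared-exponential (SE) kernel between the rows of A and B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gpr_fit_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    """Posterior mean of a GP with fixed (not optimised) hyperparameters."""
    # Standardise features so levels, flow and angle share one length scale.
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)
    Xs, Xt = (X_train - mu) / sigma, (X_test - mu) / sigma

    y_mean = y_train.mean()
    K = se_kernel(Xs, Xs, length_scale) + noise * np.eye(len(Xs))
    alpha = np.linalg.solve(K, y_train - y_mean)
    return se_kernel(Xt, Xs, length_scale) @ alpha + y_mean


def mean_abs_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic operating records (all ranges and the efficiency law are made up).
rng = np.random.default_rng(0)
n = 60
flow = rng.uniform(25.0, 35.0, n)          # flow rate
z_up = rng.uniform(24.0, 27.0, n)          # station upstream water level
z_down = rng.uniform(20.0, 22.0, n)        # station downstream water level
blade_angle = rng.uniform(-4.0, 4.0, n)    # blade angle
head = z_up - z_down                       # assumed head definition
efficiency = (
    75.0 - 0.5 * (head - 4.5) ** 2 - 0.05 * (flow - 30.0) ** 2
    + 0.3 * blade_angle + rng.normal(0.0, 0.3, n)
)

# Two candidate feature sets: traditional head vs. the two water levels.
X_head = np.column_stack([flow, head, blade_angle])
X_levels = np.column_stack([flow, z_up, z_down, blade_angle])

train, test = slice(0, 45), slice(45, n)
pred_head = gpr_fit_predict(X_head[train], efficiency[train], X_head[test])
pred_levels = gpr_fit_predict(X_levels[train], efficiency[train], X_levels[test])

print(f"MAE with head as feature:     {mean_abs_error(efficiency[test], pred_head):.3f}")
print(f"MAE with water levels instead: {mean_abs_error(efficiency[test], pred_levels):.3f}")
```

In practice the kernel hyperparameters would be tuned, for example by maximising the marginal likelihood as library implementations such as scikit-learn's `GaussianProcessRegressor` or MATLAB's `fitrgp` do, and other kernels such as RQ could be swapped in for `se_kernel`.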