An Improved U-Net-Based Framework for Estimating River Surface Flow Velocity
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Editorial Department of Journal of Sichuan University (Engineering Science Edition), 2025-01-01 |
| Series: | 工程科学与技术 |
| Subjects: | |
| Online Access: | http://jsuese.scu.edu.cn/thesisDetails#10.12454/j.jsuese.202400869 |
| Summary: | Objective: Accurate estimation of river surface flow velocity is critical for optimizing hydropower generation efficiency and enhancing flood warning systems. Existing deep learning models generalize poorly because training samples are limited and data are heterogeneous across complex river environments. This study aimed to address these limitations by developing a robust framework that integrates improved data labeling strategies, optimized training methods, and a novel neural network architecture to achieve high-precision velocity estimation. Methods: A dual-output branch structure was designed within an improved U-Net architecture to supervise velocity distribution and pixel displacement values simultaneously, enhancing feature representation and model robustness. The framework incorporated a spatiotemporal training strategy in which continuous video sequences were used to capture temporal dynamics. Data preprocessing included multi-resolution image acquisition (2560×1440, 1920×1080, 1280×720) from simulated river channels and natural environments. Two label generation methods, bell-shaped and stepped distributions, were proposed to encode velocity information into grayscale maps, balancing precision and computational efficiency. Multi-scale random cropping (160 to 640 pixels) and a customized masking strategy were applied to augment training data, focusing on critical river channel regions. Model performance was evaluated using RMSE, MAE, MAVD (matching accuracy of velocity distribution), and FRA (flow rate accuracy) metrics across the Hehai, Belgium, and real-world CDJJB datasets. Results and Discussion: The proposed framework demonstrated superior performance across diverse datasets, achieving an average RMSE of 0.137 on the Hehai dataset (subranges: H(0-1)=0.069, H(1-2)=0.103, H(2-3)=0.238) and MAE=0.109, outperforming Fast optical flow (RMSE=0.272), CBEGAN (RMSE=0.217), and the Two-stage method (RMSE=0.183) by 50% to 80%. On the Belgium dataset, RMSE and MAE were reduced to 0.059 and 0.047, respectively, while cross-dataset validation on real-world CDJJB data confirmed robustness (RMSE=0.086, MAE=0.073) under natural turbulence and lighting variations. These results validated the framework’s ability to address data heterogeneity and limited training samples through integrated spatiotemporal training and adaptive augmentation. Multi-scale random cropping (160 to 640 pixels) significantly enhanced velocity recognition accuracy. Larger crops (640 pixels) preserved spatial context, achieving MAVD=0.862 and FRA=0.927, whereas smaller crops (160 pixels) limited velocity detection to 1.5 m/s, which is inadequate for high-flow scenarios (>3 m/s). Intermediate scales (320 pixels) balanced computational efficiency and accuracy (MAVD=0.816), with training loss curves revealing accelerated convergence for 640-pixel inputs, reducing loss saturation by 40% compared to 160-pixel crops. Customized masking strategies further improved precision by focusing on critical river regions. Mask B, aligned with high-velocity zones, achieved MAVD=0.892 and FRA=0.942, surpassing random masking (MAVD=0.881) by 4% and reducing prediction uncertainty by 23% in flows >2 m/s. In contrast, random masking (k=3, s=0.3) degraded accuracy (MAVD=0.722) due to excessive occlusion of hydrodynamic features, underscoring the necessity of domain-specific augmentation. Temporal sequence optimization revealed N=32 as the optimal video length, balancing spatiotemporal feature extraction (MAVD=0.816, FRA=0.894). Longer sequences (N=48 or 64) introduced redundancy (R=0.35 to 0.42), degrading MAVD by 8%–12% and increasing inference time by 72% (N=64: 38 ms vs. N=32: 22 ms). Information entropy analysis confirmed that redundant features (R>0.3) in extended sequences increased computational complexity without improving accuracy. Comparative analysis with state-of-the-art methods highlighted the framework’s advantages. For high-velocity flows (>2 m/s), the dual-output architecture reduced RMSE to 0.238, a 70% improvement over traditional optical flow (OTV: 0.794). Real-time inference at 22 ms/frame was faster than CBEGAN (67 ms) and the Two-stage method (41 ms). On the Belgium dataset, MAE=0.047 in low-flow conditions (<0.8 m/s) outperformed OTV (0.074) and CBEGAN (0.095). Cross-dataset validation under natural turbulence (CDJJB) maintained MAE=0.073, demonstrating adaptability to environmental heterogeneity. The integration of adaptive label generation (bell-shaped and stepped distributions) and spatiotemporal training addressed data scarcity by leveraging sequential video dynamics. Mask B’s longitudinal alignment with high-velocity zones ensured focused learning on hydrodynamic features, while multi-scale cropping enhanced generalization through spatial context retention. These results validate the framework’s potential for optimizing energy output and flood management, while emphasizing the necessity of expanded datasets to address environmental variability. Conclusions: The improved U-Net framework effectively addresses challenges in river surface velocity estimation by integrating spatiotemporal training, adaptive labeling, and data augmentation. The dual-output structure ensures accurate velocity mapping and displacement prediction, while multi-scale cropping and targeted masking enhance generalization. Experimental results validate the method’s efficiency and accuracy across diverse datasets, with significant implications for hydropower optimization and flood management. Limitations in handling low-light conditions highlight the need for nighttime dataset expansion in future work. |
| ISSN: | 2096-3246 |
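
The summary mentions bell-shaped and stepped label distributions that encode velocity into grayscale maps, but it does not give the encoding itself. The following is a minimal sketch of one way such labels could look, assuming a one-dimensional label axis; `V_MAX`, `N_BINS`, `sigma`, `half_width`, and the `decode` helper are all hypothetical and not taken from the article.

```python
import math

# Hypothetical encoding parameters; none of these values come from the article.
V_MAX = 3.0   # assumed upper velocity bound in m/s
N_BINS = 64   # assumed number of grayscale positions along the label axis


def bell_label(v, sigma=2.0):
    """Gaussian-shaped profile (0..255) peaked at the bin that matches v."""
    center = v * (N_BINS - 1) / V_MAX
    return [round(255 * math.exp(-0.5 * ((i - center) / sigma) ** 2))
            for i in range(N_BINS)]


def stepped_label(v, half_width=2):
    """Flat plateau of 255 around the bin that matches v, 0 elsewhere."""
    center = round(v * (N_BINS - 1) / V_MAX)
    return [255 if abs(i - center) <= half_width else 0 for i in range(N_BINS)]


def decode(label):
    """Recover a velocity as the intensity-weighted mean bin position."""
    total = sum(label)
    mean_bin = sum(i * g for i, g in enumerate(label)) / total
    return mean_bin * V_MAX / (N_BINS - 1)
```

In this sketch the bell-shaped label keeps sub-bin information in its tails, while the stepped label is cheaper to generate but snaps the velocity to the nearest bin, which is one plausible reading of the precision versus efficiency trade-off the summary describes.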
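
Multi-scale random cropping between 160 and 640 pixels, as described in the summary, can be illustrated by sampling a square crop box per training sample. This is only a sketch under assumptions: the square shape, the uniform sampling of size and position, and the function name `random_crop_box` are illustrative choices, not the authors' procedure.

```python
import random


def random_crop_box(width, height, min_size=160, max_size=640, rng=random):
    """Return (left, top, size) of a random square crop inside the frame.

    The 160..640 pixel range follows the summary; uniform sampling of the
    size and of the top-left corner is an assumption made for this sketch.
    """
    largest = min(max_size, width, height)
    if largest < min_size:
        raise ValueError("frame is smaller than the minimum crop size")
    size = rng.randint(min_size, largest)      # inclusive on both ends
    left = rng.randint(0, width - size)
    top = rng.randint(0, height - size)
    return left, top, size


# Example: draw one crop box for a 1280x720 frame with a fixed seed.
left, top, size = random_crop_box(1280, 720, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible; the returned box can then be applied to every frame of a video sequence so that the crop stays spatially consistent over time.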
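
The RMSE and MAE figures quoted in the summary follow the standard definitions, and relative improvements can be checked with a one-line ratio. The sketch below uses only those standard formulas; the helper name `relative_reduction` is introduced here for illustration.

```python
import math


def rmse(pred, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


def mae(pred, truth):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)


def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` lowers an error relative to `baseline`."""
    return (baseline - proposed) / baseline
```

Applied to the high-velocity RMSE values reported in the summary (0.794 for OTV, 0.238 for the proposed model), `relative_reduction(0.794, 0.238)` gives about 0.70, consistent with the stated 70% improvement.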