An Improved U-Net-Based Framework for Estimating River Surface Flow Velocity

Bibliographic Details
Main Authors: 周继威, 安国成, 王根一
Format: Article
Language: English
Published: Editorial Department of Journal of Sichuan University (Engineering Science Edition), 2025-01-01
Series: 工程科学与技术 (Advanced Engineering Sciences)
Subjects: hydropower generation; flow velocity estimation; improved U-Net; spatiotemporal training; data augmentation
Online Access: http://jsuese.scu.edu.cn/thesisDetails#10.12454/j.jsuese.202400869
collection DOAJ
description Objective: Accurate estimation of river surface flow velocity is critical for optimizing hydropower generation efficiency and enhancing flood warning systems. Existing deep learning models generalize poorly because of limited training samples and data heterogeneity across complex river environments. This study addresses these limitations by developing a robust framework that integrates improved data labeling strategies, optimized training methods, and a novel neural network architecture to achieve high-precision velocity estimation.
Methods: A dual-output branch structure was designed within an improved U-Net architecture to simultaneously supervise velocity distribution and pixel displacement values, enhancing feature representation and model robustness. The framework incorporated a spatiotemporal training strategy in which continuous video sequences capture temporal dynamics. Data preprocessing included multi-resolution image acquisition (2560×1440, 1920×1080, 1280×720) from simulated river channels and natural environments. Two label generation methods, bell-shaped and stepped distributions, were proposed to encode velocity information into grayscale maps, balancing precision against computational efficiency. Multi-scale random cropping (160–640 pixels) and a customized masking strategy were applied to augment the training data, focusing on critical river channel regions.
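The abstract does not publish the exact encoding or augmentation parameters, so the following is only a rough sketch of what bell-shaped/stepped velocity-to-grayscale labeling and multi-scale random cropping could look like; `v_max`, the bin count, `sigma`, and the square-crop geometry are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def bell_label(v, v_max=3.0, n_bins=256, sigma=8.0):
    """Bell-shaped label (assumed form): a soft Gaussian bump over grayscale
    bins, centred at the pixel's normalised velocity v / v_max."""
    centre = np.clip(v / v_max, 0.0, 1.0) * (n_bins - 1)
    bins = np.arange(n_bins)
    w = np.exp(-0.5 * ((bins - centre) / sigma) ** 2)
    return (255 * w / w.max()).astype(np.uint8)

def stepped_label(v, v_max=3.0, n_steps=16):
    """Stepped label (assumed form): hard quantisation of velocity into
    n_steps grey levels."""
    level = min(int(np.clip(v / v_max, 0.0, 1.0) * n_steps), n_steps - 1)
    return int(round(level * 255 / (n_steps - 1)))

def multiscale_crop(frame, lo=160, hi=640):
    """Multi-scale random cropping: a square patch whose side length is
    drawn uniformly from [lo, hi] pixels, as in the 160-640 range above."""
    h, w = frame.shape[:2]
    size = int(rng.integers(lo, min(hi, h, w) + 1))
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    return frame[y:y + size, x:x + size]
```

The bell-shaped variant trades sharper supervision near the true velocity for smoother gradients; the stepped variant is cheaper but coarser, matching the precision/efficiency trade-off the abstract mentions.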
Model performance was evaluated using RMSE, MAVD (matching accuracy of velocity distribution), and FRA (flow rate accuracy) across the Hehai, Belgium, and real-world CDJJB datasets.
Results and Discussion: The proposed framework demonstrated superior performance across diverse datasets, achieving an average RMSE of 0.137 on the Hehai dataset (subranges: H(0–1)=0.069, H(1–2)=0.103, H(2–3)=0.238) and MAE=0.109, outperforming Fast optical flow (RMSE=0.272), CBEGAN (RMSE=0.217), and the Two-stage method (RMSE=0.183) by 50%–80%. On the Belgium dataset, RMSE and MAE were reduced to 0.059 and 0.047, respectively, while cross-dataset validation on real-world CDJJB data confirmed robustness (RMSE=0.086, MAE=0.073) under natural turbulence and lighting variations. These results validated the framework's ability to address data heterogeneity and limited training samples through integrated spatiotemporal training and adaptive augmentation.
Multi-scale random cropping (160–640 pixels) significantly enhanced velocity recognition accuracy. Larger crops (640 pixels) preserved spatial context, achieving MAVD=0.862 and FRA=0.927, whereas smaller crops (160 pixels) limited velocity detection to 1.5 m/s, inadequate for high-flow scenarios (>3 m/s). Intermediate scales (320 pixels) balanced computational efficiency and accuracy (MAVD=0.816), and training loss curves showed accelerated convergence for 640-pixel inputs, reducing loss saturation by 40% compared with 160-pixel crops. Customized masking strategies further improved precision by focusing on critical river regions. Mask B, aligned with high-velocity zones, achieved MAVD=0.892 and FRA=0.942, surpassing random masking (MAVD=0.881) by 4% and reducing prediction uncertainty by 23% in flows above 2 m/s.
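A toy version of the random patch-masking augmentation is sketched below. The `k` (patch count) and `s` (coverage fraction) parameters mirror the paper's notation, but the square-patch geometry and zero-fill are assumptions, not the paper's scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(frame, k=3, s=0.3):
    """Zero out k random square patches covering roughly a total fraction s
    of the frame area (patches may overlap, so actual coverage is <= s)."""
    out = frame.copy()
    h, w = out.shape[:2]
    # each patch covers approximately s/k of the image area
    side = max(1, int(np.sqrt(s * h * w / k)))
    for _ in range(k):
        y = int(rng.integers(0, max(1, h - side + 1)))
        x = int(rng.integers(0, max(1, w - side + 1)))
        out[y:y + side, x:x + side] = 0
    return out
```

A channel-aligned mask like the "Mask B" described above would instead place the occluded (or retained) patches along the high-velocity zone of the river rather than uniformly at random.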
In contrast, random masking (k=3, s=0.3) degraded accuracy (MAVD=0.722) through excessive occlusion of hydrodynamic features, underscoring the need for domain-specific augmentation.
Temporal sequence optimization identified N=32 frames as the optimal video length, balancing spatiotemporal feature extraction (MAVD=0.816, FRA=0.894). Longer sequences (N=48 or 64) introduced redundancy (R=0.35–0.42), degrading MAVD by 8%–12% and increasing inference time by 72% (N=64: 38 ms vs. N=32: 22 ms). Information entropy analysis confirmed that redundant features (R>0.3) in extended sequences increased computational complexity without improving accuracy.
Comparative analysis with state-of-the-art methods highlighted the framework's advantages. For high-velocity flows (>2 m/s), the dual-output architecture reduced RMSE to 0.238, a 70% improvement over traditional optical flow (OTV: 0.794). Real-time inference at 22 ms/frame surpassed CBEGAN (67 ms) and the Two-stage method (41 ms). On the Belgium dataset, MAE=0.047 in low-flow conditions (<0.8 m/s) outperformed OTV (0.074) and CBEGAN (0.095). Cross-dataset validation under natural turbulence (CDJJB) maintained MAE=0.073, demonstrating adaptability to environmental heterogeneity.
The integration of adaptive label generation (bell-shaped and stepped distributions) and spatiotemporal training addressed data scarcity by leveraging sequential video dynamics. Mask B's longitudinal alignment with high-velocity zones focused learning on hydrodynamic features, while multi-scale cropping enhanced generalization through spatial context retention. These results validate the framework's potential for optimizing energy output and flood management, while underscoring the need for expanded datasets to address environmental variability.
Conclusions: The improved U-Net framework effectively addresses challenges in river surface velocity estimation by integrating spatiotemporal training, adaptive labeling, and data augmentation.
The dual-output structure ensures accurate velocity mapping and displacement prediction, while multi-scale cropping and targeted masking enhance generalization. Experimental results validate the method’s efficiency and accuracy across diverse datasets, with significant implications for hydropower optimization and flood management. Limitations in handling low-light conditions highlight the need for nighttime dataset expansion in future work.
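The RMSE and MAE figures quoted throughout are the standard error metrics over predicted versus reference velocity fields; a minimal reference implementation (the MAVD and FRA metrics are paper-specific and not reproduced here):

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error between predicted and reference velocities (m/s)."""
    pred, truth = np.asarray(pred, dtype=float), np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def mae(pred, truth):
    """Mean absolute error between predicted and reference velocities (m/s)."""
    pred, truth = np.asarray(pred, dtype=float), np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(pred - truth)))
```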
id doaj-art-4b3fed0b321a4352893c05bb94c61b1c
institution OA Journals
issn 2096-3246
topic hydropower generation
flow velocity estimation
improved U-Net
spatiotemporal training
data augmentation