An Improved U-Net-Based Framework for Estimating River Surface Flow Velocity

Bibliographic Details
Main Authors: 周继威, 安国成, 王根一
Format: Article
Language: English
Published: Editorial Department of Journal of Sichuan University (Engineering Science Edition), 2025-01-01
Series: 工程科学与技术 (Advanced Engineering Sciences)
Subjects: hydropower generation; flow velocity estimation; improved U-Net; spatiotemporal training; data augmentation
Online Access: http://jsuese.scu.edu.cn/thesisDetails#10.12454/j.jsuese.202400869
collection DOAJ
description Objective: Accurate estimation of river surface flow velocity is critical for optimizing hydropower generation efficiency and enhancing flood warning systems. Existing deep learning models generalize poorly because of limited training samples and data heterogeneity across complex river environments. This study addresses these limitations by developing a robust framework that integrates improved data labeling strategies, optimized training methods, and a novel neural network architecture to achieve high-precision velocity estimation.
Methods: A dual-output branch structure was designed within an improved U-Net architecture to simultaneously supervise velocity distribution and pixel displacement values, enhancing feature representation and model robustness. The framework incorporated a spatiotemporal training strategy in which continuous video sequences capture temporal dynamics. Data preprocessing included multi-resolution image acquisition (2560×1440, 1920×1080, 1280×720) from simulated river channels and natural environments. Two label generation methods, bell-shaped and stepped distributions, were proposed to encode velocity information into grayscale maps, balancing precision against computational efficiency. Multi-scale random cropping (160–640 pixels) and a customized masking strategy were applied to augment the training data, focusing on critical river channel regions.
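The abstract does not publish the exact encoding or augmentation parameters, so the following is only a rough sketch of what bell-shaped/stepped velocity-to-grayscale labeling and multi-scale random cropping could look like; `v_max`, the bin count, `sigma`, and the square-crop geometry are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def bell_label(v, v_max=3.0, n_bins=256, sigma=8.0):
    """Bell-shaped label (assumed form): a soft Gaussian bump over grayscale
    bins, centred at the pixel's normalised velocity v / v_max."""
    centre = np.clip(v / v_max, 0.0, 1.0) * (n_bins - 1)
    bins = np.arange(n_bins)
    w = np.exp(-0.5 * ((bins - centre) / sigma) ** 2)
    return (255 * w / w.max()).astype(np.uint8)

def stepped_label(v, v_max=3.0, n_steps=16):
    """Stepped label (assumed form): hard quantisation of velocity into
    n_steps grey levels."""
    level = min(int(np.clip(v / v_max, 0.0, 1.0) * n_steps), n_steps - 1)
    return int(round(level * 255 / (n_steps - 1)))

def multiscale_crop(frame, lo=160, hi=640):
    """Multi-scale random cropping: a square patch whose side length is
    drawn uniformly from [lo, hi] pixels, as in the 160-640 range above."""
    h, w = frame.shape[:2]
    size = int(rng.integers(lo, min(hi, h, w) + 1))
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    return frame[y:y + size, x:x + size]
```

The bell-shaped variant trades sharper supervision near the true velocity for smoother gradients; the stepped variant is cheaper but coarser, matching the precision/efficiency trade-off the abstract mentions.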
Model performance was evaluated using RMSE, MAVD (matching accuracy of velocity distribution), and FRA (flow rate accuracy) across the Hehai, Belgium, and real-world CDJJB datasets.
Results and Discussion: The proposed framework demonstrated superior performance across diverse datasets, achieving an average RMSE of 0.137 on the Hehai dataset (subranges: H(0–1)=0.069, H(1–2)=0.103, H(2–3)=0.238) and MAE=0.109, outperforming Fast optical flow (RMSE=0.272), CBEGAN (RMSE=0.217), and the Two-stage method (RMSE=0.183) by 50%–80%. On the Belgium dataset, RMSE and MAE were reduced to 0.059 and 0.047, respectively, while cross-dataset validation on real-world CDJJB data confirmed robustness (RMSE=0.086, MAE=0.073) under natural turbulence and lighting variations. These results validated the framework's ability to address data heterogeneity and limited training samples through integrated spatiotemporal training and adaptive augmentation.
Multi-scale random cropping (160–640 pixels) significantly enhanced velocity recognition accuracy. Larger crops (640 pixels) preserved spatial context, achieving MAVD=0.862 and FRA=0.927, whereas smaller crops (160 pixels) limited velocity detection to 1.5 m/s, inadequate for high-flow scenarios (>3 m/s). Intermediate scales (320 pixels) balanced computational efficiency and accuracy (MAVD=0.816), and training loss curves showed accelerated convergence for 640-pixel inputs, reducing loss saturation by 40% compared with 160-pixel crops. Customized masking strategies further improved precision by focusing on critical river regions. Mask B, aligned with high-velocity zones, achieved MAVD=0.892 and FRA=0.942, surpassing random masking (MAVD=0.881) by 4% and reducing prediction uncertainty by 23% in flows above 2 m/s.
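A toy version of the random patch-masking augmentation is sketched below. The `k` (patch count) and `s` (coverage fraction) parameters mirror the paper's notation, but the square-patch geometry and zero-fill are assumptions, not the paper's scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(frame, k=3, s=0.3):
    """Zero out k random square patches covering roughly a total fraction s
    of the frame area (patches may overlap, so actual coverage is <= s)."""
    out = frame.copy()
    h, w = out.shape[:2]
    # each patch covers approximately s/k of the image area
    side = max(1, int(np.sqrt(s * h * w / k)))
    for _ in range(k):
        y = int(rng.integers(0, max(1, h - side + 1)))
        x = int(rng.integers(0, max(1, w - side + 1)))
        out[y:y + side, x:x + side] = 0
    return out
```

A channel-aligned mask like the "Mask B" described above would instead place the occluded (or retained) patches along the high-velocity zone of the river rather than uniformly at random.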
In contrast, random masking (k=3, s=0.3) degraded accuracy (MAVD=0.722) through excessive occlusion of hydrodynamic features, underscoring the need for domain-specific augmentation.
Temporal sequence optimization identified N=32 frames as the optimal video length, balancing spatiotemporal feature extraction (MAVD=0.816, FRA=0.894). Longer sequences (N=48 or 64) introduced redundancy (R=0.35–0.42), degrading MAVD by 8%–12% and increasing inference time by 72% (N=64: 38 ms vs. N=32: 22 ms). Information entropy analysis confirmed that redundant features (R>0.3) in extended sequences increased computational complexity without improving accuracy.
Comparative analysis with state-of-the-art methods highlighted the framework's advantages. For high-velocity flows (>2 m/s), the dual-output architecture reduced RMSE to 0.238, a 70% improvement over traditional optical flow (OTV: 0.794). Real-time inference at 22 ms/frame surpassed CBEGAN (67 ms) and the Two-stage method (41 ms). On the Belgium dataset, MAE=0.047 in low-flow conditions (<0.8 m/s) outperformed OTV (0.074) and CBEGAN (0.095). Cross-dataset validation under natural turbulence (CDJJB) maintained MAE=0.073, demonstrating adaptability to environmental heterogeneity.
The integration of adaptive label generation (bell-shaped and stepped distributions) and spatiotemporal training addressed data scarcity by leveraging sequential video dynamics. Mask B's longitudinal alignment with high-velocity zones focused learning on hydrodynamic features, while multi-scale cropping enhanced generalization through spatial context retention. These results validate the framework's potential for optimizing energy output and flood management, while underscoring the need for expanded datasets to address environmental variability.
Conclusions: The improved U-Net framework effectively addresses challenges in river surface velocity estimation by integrating spatiotemporal training, adaptive labeling, and data augmentation.
The dual-output structure ensures accurate velocity mapping and displacement prediction, while multi-scale cropping and targeted masking enhance generalization. Experimental results validate the method’s efficiency and accuracy across diverse datasets, with significant implications for hydropower optimization and flood management. Limitations in handling low-light conditions highlight the need for nighttime dataset expansion in future work.
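The RMSE and MAE figures quoted throughout are the standard error metrics over predicted versus reference velocity fields; a minimal reference implementation (the MAVD and FRA metrics are paper-specific and not reproduced here):

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error between predicted and reference velocities (m/s)."""
    pred, truth = np.asarray(pred, dtype=float), np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def mae(pred, truth):
    """Mean absolute error between predicted and reference velocities (m/s)."""
    pred, truth = np.asarray(pred, dtype=float), np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(pred - truth)))
```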
id doaj-art-4b3fed0b321a4352893c05bb94c61b1c
institution OA Journals
issn 2096-3246
topic hydropower generation
flow velocity estimation
improved U-Net
spatiotemporal training
data augmentation