Urban Road Anomaly Monitoring Using Vision–Language Models for Enhanced Safety Management

Bibliographic Details
Main Authors: Hanyu Ding, Yawei Du, Zhengyu Xia
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/5/2517
collection DOAJ
description Abnormal phenomena on urban roads, including uneven surfaces, garbage, traffic congestion, floods, fallen trees, fires, and traffic accidents, present significant risks to public safety and infrastructure, necessitating real-time monitoring and early warning systems. This study develops Urban Road Anomaly Visual Large Language Models (URA-VLMs), a generative AI-based framework designed for the monitoring of diverse urban road anomalies. InternVL was selected as the foundation model because of its adaptability to this monitoring task. The URA-VLMs framework features dedicated modules for anomaly detection, flood depth estimation, and safety level assessment, utilizing multi-step prompting and retrieval-augmented generation (RAG) for precise and adaptive analysis. A comprehensive dataset of 3034 annotated images depicting various urban road scenarios was developed to evaluate the models. Experimental results demonstrate the system’s effectiveness, achieving an overall anomaly detection accuracy of 93.20%, outperforming state-of-the-art models such as InternVL2.5 and ResNet34. By facilitating early detection and real-time decision-making, this generative AI approach offers a scalable and robust solution that contributes to a smarter, safer road environment.
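The multi-step prompting and retrieval-augmented generation mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the knowledge base, the `retrieve` and `monitor` functions, the `Report` type, the prompt wording, and the `fake_vlm` stand-in are all hypothetical, and a real system would call a vision-language model such as InternVL on actual images instead of the stub.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical reference notes standing in for a RAG knowledge base.
KNOWLEDGE_BASE = {
    "flood": "Standing water can hide potholes; compare water level with wheels or kerbs.",
    "fire": "Keep traffic away and alert emergency services.",
    "fallen tree": "A blocked lane needs diversion until the tree is cleared.",
}


@dataclass
class Report:
    anomaly: str
    guidance: List[str]
    safety_level: str


def retrieve(query: str, top_k: int = 1) -> List[str]:
    """Toy retriever: rank notes by word overlap between the query and each note key."""
    words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE.items(),
        key=lambda kv: len(words & set(kv[0].split())),
        reverse=True,
    )
    # Keep only notes whose key shares at least one word with the query.
    return [text for key, text in scored[:top_k] if words & set(key.split())]


def monitor(image_id: str, ask: Callable[[str], str]) -> Report:
    """Two prompting steps with a retrieval step in between."""
    # Step 1: ask the model to name the anomaly.
    anomaly = ask(f"Image {image_id}: which road anomaly, if any, is visible?").strip().lower()
    # Step 2: retrieve notes for that anomaly and place them in the next prompt.
    guidance = retrieve(anomaly)
    context = " ".join(guidance) or "no reference notes found"
    level = ask(f"Anomaly: {anomaly}. Notes: {context}. Rate the risk as low, medium, or high.")
    return Report(anomaly, guidance, level.strip().lower())


def fake_vlm(prompt: str) -> str:
    """Stand-in for a real vision-language model so the sketch runs offline."""
    return "high" if prompt.startswith("Anomaly:") else "flood"


report = monitor("img_001", fake_vlm)
print(report)
```

Passing the model as a plain callable keeps the prompting logic separate from any particular model API, so the stub can be swapped for a real client without changing `monitor`.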
issn 2076-3417
DOI: 10.3390/app15052517
Volume: 15, Issue: 5, Article: 2517
Author affiliations:
Hanyu Ding: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
Yawei Du: Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong 999077, China
Zhengyu Xia: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
topic urban road safety
vision large language model
road anomaly
real-time monitoring
safety management