Early crowd forecasting away from stations by geographically complemented regression using transit search and mobility logs

Bibliographic Details
Main Authors: Soto Anno, Kota Tsubouchi, Masamichi Shimosaka
Format: Article
Language: English
Published: SpringerOpen 2025-07-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-025-01214-6
Description
Summary: Forecasting crowd gatherings in advance, such as 1 week before they happen, plays a vital role in ensuring smooth mobility and public safety. Early crowd forecasting has become possible by leveraging visitors’ mobility schedules extracted from transit search logs, but the forecasting area is limited to regions near railroad stations, because the logs reflect only implicitly, not explicitly, the locations away from stations where people go after arriving. To address this issue, this paper presents an early crowd forecasting method that predicts crowding a week in advance both in station vicinities and in areas away from stations, using a crowd forecasting model called geographically complemented multi-task Poisson regression (GCPR). The method infers the flows of people after they arrive at railroad stations from GPS-based mobility logs and transit search logs by leveraging the heterogeneous characteristics of nearby stations. Specifically, the model forecasts the number of visitors to an event 1 week in advance by using transit search logs recorded more than 1 week prior to the event, along with contextual features (such as day of the week) and time information. Furthermore, the model performs multi-task learning over station arrival schedules and mobility patterns, which addresses the challenge of accurately predicting people flow to congestion points based on the geographical and mobility proximity between stations and crowded areas. The authors conduct an empirical evaluation on a real-world dataset covering 12 large-scale events held in Japan from 2019 to 2020, such as the Jingu Gaien Fireworks Festival, Comic Market 96, and the Rugby World Cup 2019. The results show that GCPR can forecast crowd gatherings 1 week before their occurrence in areas that were previously challenging to predict, achieving up to a 42% performance improvement over CityOutlook+, a state-of-the-art approach for early crowd forecasting.
ISSN: 2196-1115
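
Illustrative sketch: the summary describes forecasting visitor counts with a Poisson regression driven by transit search logs and contextual features such as day of the week. The Python sketch below shows only that basic building block, a single-task Poisson regression with a log link fitted by gradient ascent. It is not the authors' GCPR model: the multi-task and geographic-complement parts are omitted, the toy numbers are invented for illustration, and all names (`make_features`, `fit_poisson`, `predict_rate`) are hypothetical.

```python
import math

def make_features(search_count, is_weekend):
    """Hypothetical features: bias, log of early search count, weekend flag."""
    return [1.0, math.log1p(search_count), 1.0 if is_weekend else 0.0]

def predict_rate(weights, features):
    """Poisson mean under a log link: exp(w . x)."""
    return math.exp(sum(w * x for w, x in zip(weights, features)))

def log_likelihood(weights, rows, counts):
    """Poisson log-likelihood, dropping the constant log(y!) term."""
    total = 0.0
    for x, y in zip(rows, counts):
        eta = sum(w * xi for w, xi in zip(weights, x))
        total += y * eta - math.exp(eta)
    return total

def fit_poisson(rows, counts, lr=0.002, steps=10000):
    """Fit weights by batch gradient ascent on the mean log-likelihood."""
    n, d = len(rows), len(rows[0])
    weights = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(rows, counts):
            err = y - predict_rate(weights, x)  # observed minus predicted mean
            for j in range(d):
                grad[j] += err * x[j]
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights

# Invented toy data: (searches logged at least 1 week ahead, weekend?, visitors).
toy = [(5, False, 3), (10, False, 4), (20, False, 7), (40, True, 15),
       (80, True, 24), (15, True, 8), (60, False, 14), (30, False, 9)]
rows = [make_features(s, w) for s, w, _ in toy]
counts = [y for _, _, y in toy]

weights = fit_poisson(rows, counts)
forecast_low = predict_rate(weights, make_features(10, True))
forecast_high = predict_rate(weights, make_features(100, True))
```

Under these assumptions, a larger number of early transit searches yields a larger forecast mean. A fuller treatment along the lines of the summary would share parameters across station-arrival and mobility tasks, which this sketch does not attempt.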