An integrated dataset of spatiotemporal and event data in elite soccer

Bibliographic Details
Main Authors: Manuel Bassek, Robert Rein, Hendrik Weber, Daniel Memmert
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04505-y
Abstract: Data-driven match analysis in soccer is a growing discipline in both research and practice. However, public data is scarce, which raises the barrier to entering this field and decreases the reproducibility of methods and results. To bridge this gap, this paper presents a dataset of official match information, event data, and position data from seven matches of the German Bundesliga’s first and second divisions. The match information contains metadata about the matches and their participants. The event data contain timestamps along with descriptions of discrete events, like passes, shots, or fouls. The position data contain the x/y-coordinates of every player and the ball. By integrating multiple data modalities (event logs with timestamps, and x/y-coordinates of player and ball positions), the dataset offers a multidimensional view of match dynamics. This dataset supports the validation of existing analytical techniques and facilitates the development of new methodologies in sports analytics. With availability under CC-BY 4.0, it promotes transparency, reproducibility, and the idea of open science in match analysis research.
ISSN: 2052-4463
Author Affiliations:
Manuel Bassek: Institute of Exercise Training and Sport Informatics, German Sport University Cologne
Robert Rein: Institute of Exercise Training and Sport Informatics, German Sport University Cologne
Hendrik Weber: DFL, German Football League
Daniel Memmert: Institute of Exercise Training and Sport Informatics, German Sport University Cologne