A Data Engineering Framework for Ethereum Beacon Chain Rewards: From Data Collection to Decentralization Metrics

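Abstract: Ethereum, one of the leading smart contract blockchain platforms, currently operates on a Proof-of-Stake (PoS) consensus mechanism designed to secure the network while incentivizing desired validator behaviors. Despite blockchain technology’s promise of decentralization, limitations and gaps in decentralization persist, posing challenges for analysis and optimization. This study introduces a comprehensive dataset of validator rewards from the Ethereum Beacon chain, categorized into attestation, proposer, and sync committee rewards. By providing granular, transparent, and auditable records of validator activities, the dataset addresses the fragmentation of raw blockchain data and enables robust evaluations of PoS incentive structures. Researchers can leverage this dataset to assess enforceable rules, verify protocol compliance, and analyze long-term validator behavior. In addition, we apply decentralization metrics such as the Shannon entropy, Gini Index, Nakamoto Coefficient, and Herfindahl-Hirschman Index (HHI) to showcase the dataset’s utility in studying decentralization trends. Publicly available on Harvard Dataverse and accompanied by open-source analytical tools on GitHub, this dataset facilitates future research aimed at enhancing blockchain systems’ decentralization, security, and efficiency.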
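
As a rough illustration of the four decentralization metrics named in this record's abstract (Shannon entropy, Gini Index, Nakamoto Coefficient, and HHI), the sketch below computes each one from a list of per-entity reward totals. It is not the authors' implementation: the function names, the sample figures, the base-2 logarithm, and the 0.5 threshold for the Nakamoto Coefficient are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def shares(rewards: Sequence[float]) -> List[float]:
    """Normalize non-negative reward totals so they sum to 1."""
    total = sum(rewards)
    if total <= 0 or any(r < 0 for r in rewards):
        raise ValueError("rewards must be non-negative with a positive sum")
    return [r / total for r in rewards]


def shannon_entropy(rewards: Sequence[float]) -> float:
    """Entropy in bits (base 2 is an assumption); higher means more even."""
    return -sum(p * math.log2(p) for p in shares(rewards) if p > 0)


def gini_index(rewards: Sequence[float]) -> float:
    """Gini index via the sorted-rank formula; 0 means perfectly equal."""
    xs = sorted(rewards)
    n, total = len(xs), sum(xs)
    if total <= 0:
        raise ValueError("rewards must have a positive sum")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def hhi(rewards: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared shares, in (0, 1]."""
    return sum(p * p for p in shares(rewards))


def nakamoto_coefficient(rewards: Sequence[float], threshold: float = 0.5) -> int:
    """Smallest number of entities whose combined share exceeds `threshold`.

    The 0.5 default is an assumption; other thresholds can be passed in.
    """
    cumulative = 0.0
    for count, p in enumerate(sorted(shares(rewards), reverse=True), start=1):
        cumulative += p
        if cumulative > threshold:
            return count
    return len(rewards)


if __name__ == "__main__":
    # Hypothetical reward totals for four entities (not taken from the dataset).
    sample = [40.0, 30.0, 20.0, 10.0]
    print("entropy:", shannon_entropy(sample))
    print("gini:", gini_index(sample))
    print("hhi:", hhi(sample))
    print("nakamoto:", nakamoto_coefficient(sample))
```

Applied to real data, the input would be reward totals aggregated per entity over a chosen time window. How validators are grouped into entities is a separate modelling choice that this sketch does not address.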

Bibliographic Details
Main Authors: Tao Yan, Shengnan Li, Benjamin Kraner, Luyao Zhang, Claudio J. Tessone
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04623-7
collection DOAJ
description Abstract Ethereum, one of the leading smart contract blockchain platforms, currently operates on a Proof-of-Stake (PoS) consensus mechanism designed to secure the network while incentivizing desired validator behaviors. Despite blockchain technology’s promise of decentralization, limitations and gaps in decentralization persist, posing challenges for analysis and optimization. This study introduces a comprehensive dataset of validator rewards from the Ethereum Beacon chain, categorized into attestation, proposer, and sync committee rewards. By providing granular, transparent, and auditable records of validator activities, the dataset addresses the fragmentation of raw blockchain data and enables robust evaluations of PoS incentive structures. Researchers can leverage this dataset to assess enforceable rules, verify protocol compliance, and analyze long-term validator behavior. In addition, we apply decentralization metrics such as the Shannon entropy, Gini Index, Nakamoto Coefficient, and Herfindahl-Hirschman Index (HHI) to showcase the dataset’s utility in studying decentralization trends. Publicly available on Harvard Dataverse and accompanied by open-source analytical tools on GitHub, this dataset facilitates future research aimed at enhancing blockchain systems’ decentralization, security, and efficiency.
format Article
id doaj-art-da30bc002f4540a58807701427eb6080
institution OA Journals
issn 2052-4463
language English
publishDate 2025-03-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-da30bc002f4540a58807701427eb6080, 2025-08-20T02:10:20Z, eng, Nature Portfolio, Scientific Data, ISSN 2052-4463, 2025-03-01, volume 12, issue 1, pages 1-11, https://doi.org/10.1038/s41597-025-04623-7
Tao Yan, Shengnan Li, Benjamin Kraner, Claudio J. Tessone: Blockchain & Distributed Ledger Technologies Group at Department of Informatics and UZH Blockchain Center, University of Zurich
Luyao Zhang: Data Science Research Center and Social Science Division, Duke Kunshan University
title A Data Engineering Framework for Ethereum Beacon Chain Rewards: From Data Collection to Decentralization Metrics
url https://doi.org/10.1038/s41597-025-04623-7