Application of machine learning in forensic geochemistry using presalt oil samples from the Santos basin

Bibliographic Details
Main Authors: Gil Marcio Avelino Silva, Fernando Pellon de Miranda, Jarbas Vicente Poley Guzzo, Wagner Leonel Bastos, Ygor Rocha, Igor Viegas Alves Fernandes de Souza, Italo Oliveira Matias, Sarah Barron Torres, Francisco Fabio de Araujo Ponte
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-00084-5
Description
Summary: Identifying oil spills in offshore production areas presents a critical challenge, requiring reliable and efficient methodologies to minimize environmental and economic impacts. Traditional approaches are often time-consuming, subjective, and limited in their ability to provide accurate predictions. This study introduces a novel methodology that integrates geochemical data analysis with machine learning techniques to enhance the identification of oil spill origins. A dataset comprising 2200 presalt oil samples and 75 attributes from the Santos Basin underwent preprocessing and exploratory analysis, resulting in 2137 samples and 62 predictive attributes. Seven machine learning algorithms were evaluated, with the random forest model achieving the highest classification accuracy of 91%. The methodology was validated using three independent oil samples (spill events and one natural seep), demonstrating its robustness in accurately predicting field origins with high confidence. The integration of machine learning techniques and geochemical analysis reduced the subjectivity of human interpretation, significantly accelerated diagnostic workflows, and provided reliable results in minutes. This approach represents a scalable and innovative solution for both exploratory and forensic geochemistry, particularly in complex production areas along the Brazilian coast. The proposed methodology has the potential to enhance decision-making processes in environmental monitoring and oil exploration.
ISSN: 2045-2322