Application of machine learning in forensic geochemistry using presalt oil samples from the Santos basin

Abstract Identifying oil spills in offshore production areas presents a critical challenge, requiring reliable and efficient methodologies to minimize environmental and economic impacts. Traditional approaches are often time-consuming, subjective, and limited in their ability to provide accurate pre...

Bibliographic Details
Main Authors: Gil Marcio Avelino Silva, Fernando Pellon de Miranda, Jarbas Vicente Poley Guzzo, Wagner Leonel Bastos, Ygor Rocha, Igor Viegas Alves Fernandes de Souza, Italo Oliveira Matias, Sarah Barron Torres, Francisco Fabio de Araujo Ponte
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-00084-5
author Gil Marcio Avelino Silva
Fernando Pellon de Miranda
Jarbas Vicente Poley Guzzo
Wagner Leonel Bastos
Ygor Rocha
Igor Viegas Alves Fernandes de Souza
Italo Oliveira Matias
Sarah Barron Torres
Francisco Fabio de Araujo Ponte
collection DOAJ
description Abstract Identifying oil spills in offshore production areas presents a critical challenge, requiring reliable and efficient methodologies to minimize environmental and economic impacts. Traditional approaches are often time-consuming, subjective, and limited in their ability to provide accurate predictions. This study introduces a novel methodology that integrates geochemical data analysis with machine learning techniques to enhance the identification of oil spill origins. A dataset comprising 2200 presalt oil samples and 75 attributes from the Santos Basin underwent preprocessing and exploratory analysis, resulting in 2137 samples and 62 predictive attributes. Seven machine learning algorithms were evaluated, with the random forest model achieving the highest classification accuracy of 91%. The methodology was validated using three independent oil samples (spill events and one natural seep), demonstrating its robustness in accurately predicting field origins with high confidence. The integration of machine learning techniques and geochemical analysis reduced the subjectivity of human interpretation, significantly accelerated diagnostic workflows, and provided reliable results in minutes. This approach represents a scalable and innovative solution for both exploratory and forensic geochemistry, particularly in complex production areas along the Brazilian coast. The proposed methodology has the potential to enhance decision-making processes in environmental monitoring and oil exploration.
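The workflow summarized in the abstract (train a random forest on labeled oil samples, measure hold-out accuracy, then assign an unknown sample to a field with a confidence score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the field names `FIELD_A` to `FIELD_C`, the four synthetic attributes, the 75/25 split, the tree depth, and the number of trees are hypothetical and are not taken from the study, whose data and implementation are not part of this record. The code is a small pure-Python random forest (bootstrap samples plus a random attribute subset at each split), not the authors' model.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, n_feats, rng, depth=0, max_depth=4):
    """Grow one decision tree; leaves are labels, inner nodes are tuples."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    best = None
    # Random-forest step: each split considers only a random attribute subset.
    for f in rng.sample(range(len(rows[0])), n_feats):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split among the sampled attributes
        return majority(labels)
    _, f, t, left, right = best
    grow = lambda idx: build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                  n_feats, rng, depth + 1, max_depth)
    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_feats = max(1, int(len(rows[0]) ** 0.5))  # common sqrt(n_attributes) heuristic
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_feats, rng))
    return forest

def predict_with_confidence(forest, row):
    """Return the majority-vote label and the fraction of trees that agree."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)

def make_samples(seed=1, per_field=40, n_attrs=4):
    """Synthetic stand-in data: one Gaussian cluster per hypothetical field."""
    rng = random.Random(seed)
    centers = {"FIELD_A": 0.0, "FIELD_B": 5.0, "FIELD_C": 10.0}
    rows, labels = [], []
    for name, center in centers.items():
        for _ in range(per_field):
            rows.append([rng.gauss(center, 1.0) for _ in range(n_attrs)])
            labels.append(name)
    return rows, labels

rows, labels = make_samples()
order = list(range(len(rows)))
random.Random(2).shuffle(order)
cut = int(0.75 * len(order))  # 75% train, 25% hold-out (assumed split)
train, test = order[:cut], order[cut:]
forest = fit_forest([rows[i] for i in train], [labels[i] for i in train])

hits = sum(predict_with_confidence(forest, rows[i])[0] == labels[i] for i in test)
accuracy = hits / len(test)

# An "unknown" sample, standing in for a spill or seep of unknown origin.
label, confidence = predict_with_confidence(forest, [5.1, 4.8, 5.3, 4.9])
print(f"hold-out accuracy: {accuracy:.2f}; predicted {label} "
      f"with {confidence:.0%} of trees agreeing")
```

The vote fraction is only one simple way to express confidence; the record does not say how the study quantified it.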
format Article
id doaj-art-fe689754258141d79f4c86fe9a3bf819
institution OA Journals
issn 2045-2322
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-fe689754258141d79f4c86fe9a3bf819; 2025-08-20T01:47:33Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-05-01; vol. 15, no. 1, pp. 1-16; 10.1038/s41598-025-00084-5; Application of machine learning in forensic geochemistry using presalt oil samples from the Santos basin; Gil Marcio Avelino Silva (Petroleo Brasileiro S.A.); Fernando Pellon de Miranda (Petroleo Brasileiro S.A.); Jarbas Vicente Poley Guzzo (Petroleo Brasileiro S.A.); Wagner Leonel Bastos (Petroleo Brasileiro S.A.); Ygor Rocha (Petroleo Brasileiro S.A.); Igor Viegas Alves Fernandes de Souza (Petroleo Brasileiro S.A.); Italo Oliveira Matias (PUC-Rio); Sarah Barron Torres (PUC-Rio); Francisco Fabio de Araujo Ponte (PUC-Rio); https://doi.org/10.1038/s41598-025-00084-5
title Application of machine learning in forensic geochemistry using presalt oil samples from the Santos basin
url https://doi.org/10.1038/s41598-025-00084-5