Exploring the Habitat Distribution of <i>Decapterus macarellus</i> in the South China Sea Under Varying Spatial Resolutions: A Combined Approach Using Multiple Machine Learning and the MaxEnt Model

Bibliographic Details
Main Authors: Qikun Shen, Peng Zhang, Xue Feng, Zuozhi Chen, Jiangtao Fan
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Biology
Online Access: https://www.mdpi.com/2079-7737/14/7/753
Collection: DOAJ
Description: The selection of environmental variables with different spatial resolutions is a critical factor affecting the accuracy of machine learning-based fishery forecasting. In this study, spring-season survey data of <i>Decapterus macarellus</i> in the South China Sea from 2016 to 2024 were used to construct six machine learning models: decision tree (DT), extra trees (ETs), K-Nearest Neighbors (KNN), light gradient boosting machine (LGBM), random forest (RF), and extreme gradient boosting (XGB). Each model was built on seven environmental variables (e.g., sea surface temperature (SST), chlorophyll-a concentration (CHL)), filtered using Pearson correlation analysis, at four spatial resolutions (0.083°, 0.25°, 0.5°, and 1°). The optimal model at each resolution was selected through performance comparison. SHapley Additive exPlanations (SHAP) values were employed to interpret the contribution of environmental predictors, and the maximum entropy (MaxEnt) model was used to perform habitat suitability mapping. Results showed that the XGB model at 0.083° resolution achieved the best performance, with the area under the receiver operating characteristic curve (ROC_AUC) = 0.836, accuracy = 0.793, and negative predictive value = 0.862, outperforming models at coarser resolutions. CHL was identified as the most influential variable, showing high importance in both the SHAP distribution and the cumulative area under the curve contribution. Predicted suitable habitats were mainly located in the northern and central-southern South China Sea, with the latter covering a broader area. This study is the first to systematically evaluate the impact of spatial resolution on environmental variable selection in machine learning models. It integrates SHAP-based interpretability with MaxEnt modeling to achieve reliable habitat suitability prediction, offering valuable insights for fishery forecasting in the South China Sea.
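The description mentions that the environmental variables were filtered using Pearson correlation analysis. As a rough illustration of what such a filter can look like, the sketch below greedily drops any variable that is strongly correlated with one already kept. The function names, the 0.8 cutoff, and the toy values are all assumptions made for this example; the record does not state the threshold or procedure the authors actually used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def filter_correlated(variables, threshold=0.8):
    """Greedily keep variables whose |r| with every kept variable is below threshold.

    `variables` maps a name to a list of values; insertion order sets priority.
    The 0.8 default is an assumed cutoff, not a value taken from the record.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy values for illustration only, not survey data.
toy = {
    "SST": [26.1, 27.3, 28.0, 28.9, 29.4],
    "SST_copy": [26.0, 27.5, 28.1, 28.8, 29.5],  # nearly collinear with SST
    "CHL": [0.31, 0.12, 0.45, 0.08, 0.27],
}
selected = filter_correlated(toy)
print(selected)
```

Because the filter is greedy, the order of the input decides which member of a correlated pair survives; a real workflow would more likely rank variables by ecological relevance before filtering.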
Institution: Kabale University
ISSN: 2079-7737
DOI: 10.3390/biology14070753
Volume 14, Issue 7, Article 753
Author affiliation (all authors): South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou 510300, China
Subjects: <i>Decapterus macarellus</i>; spatial resolution; machine learning; SHAP; MaxEnt