Satellite Image Price Prediction Based on Machine Learning

This study develops a comprehensive, data-driven framework for predicting satellite imagery prices using four state-of-the-art ensemble learning algorithms: XGBoost, LightGBM, AdaBoost, and CatBoost. Two distinct datasets—optical and Synthetic Aperture Radar (SAR) imagery—were assembled, each charac...

Bibliographic Details
Main Authors: Linhan Yang, Zugang Chen, Guoqing Li
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Remote Sensing
Subjects: satellite imagery pricing; ensemble learning; Bayesian optimization; SHAP analysis
Online Access: https://www.mdpi.com/2072-4292/17/12/1960
collection DOAJ
description This study develops a comprehensive, data-driven framework for predicting satellite imagery prices using four state-of-the-art ensemble learning algorithms: XGBoost, LightGBM, AdaBoost, and CatBoost. Two distinct datasets, one of optical and one of Synthetic Aperture Radar (SAR) imagery, were assembled, each characterized by nine technical and economic features (e.g., imaging mode, spatial resolution, satellite manufacturing cost, and acquisition timeliness). Bayesian optimization is employed to systematically tune hyperparameters, thereby minimizing overfitting and maximizing generalization. Models are evaluated on held-out test sets (20% of data) using Pearson’s correlation coefficient (R), mean bias error (MBE), root mean square error (RMSE), unbiased RMSE (ubRMSE), Nash–Sutcliffe Efficiency (NSE), and Kling–Gupta Efficiency (KGE). For optical imagery, the Bayesian-optimized XGBoost model achieves the best performance (R = 0.9870, RMSE = $3.44/km², NSE = 0.9651, KGE = 0.8950), followed closely by CatBoost (R = 0.9826, RMSE = $3.83/km²). For SAR imagery, CatBoost outperforms all others after optimization (R = 0.9278, RMSE = $9.94/km², NSE = 0.8575, KGE = 0.8443), reflecting its robustness to heavy-tailed price distributions. AdaBoost also demonstrates competitive accuracy, while LightGBM and XGBoost exhibit larger errors in high-value regimes. SHapley Additive exPlanations (SHAP) analysis reveals that imaging mode and spatial resolution are the primary drivers of price variance across both domains, followed by satellite manufacturing cost and acquisition recency. These insights demonstrate how ensemble models capture nonlinear, high-dimensional interactions that traditional rule-based pricing schemes overlook. Compared to static, experience-driven price brackets, our machine learning approach provides a scalable, transparent, and economically rational pricing engine that is adaptable to rapidly changing market conditions and capable of supporting fine-grained, application-specific pricing strategies.
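The description names six evaluation metrics without defining them. The following is a minimal Python sketch of their commonly used definitions, not the authors' code. It assumes the 2009 form of KGE (correlation, variability ratio, bias ratio) and population (1/n) variances; the record does not confirm which variants the paper uses. The function name `evaluation_metrics` and the example prices are hypothetical.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return R, MBE, RMSE, ubRMSE, NSE and KGE for paired price sequences.

    Assumed definitions (not confirmed by the catalog record):
      MBE    = mean(predicted - observed)
      ubRMSE = sqrt(RMSE^2 - MBE^2)
      NSE    = 1 - sum((obs - pred)^2) / sum((obs - mean_obs)^2)
      KGE    = 1 - sqrt((R - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
               with alpha = std_pred / std_obs and beta = mean_pred / mean_obs
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Population (1/n) moments; the 1/n factors cancel in R and alpha.
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / n
    if var_obs == 0 or var_pred == 0 or mean_obs == 0:
        raise ValueError("metrics undefined for constant series or zero mean")

    mse = sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n
    r = cov / math.sqrt(var_obs * var_pred)
    mbe = mean_pred - mean_obs
    alpha = math.sqrt(var_pred / var_obs)  # variability ratio
    beta = mean_pred / mean_obs            # bias ratio

    return {
        "R": r,
        "MBE": mbe,
        "RMSE": math.sqrt(mse),
        # max() guards against tiny negative values from rounding.
        "ubRMSE": math.sqrt(max(mse - mbe ** 2, 0.0)),
        "NSE": 1.0 - mse / var_obs,
        "KGE": 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
    }


if __name__ == "__main__":
    # Made-up prices in $/km^2, for illustration only.
    observed_prices = [10.0, 20.0, 30.0, 40.0]
    predicted_prices = [12.0, 18.0, 33.0, 41.0]
    for name, value in evaluation_metrics(observed_prices, predicted_prices).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions RMSE, MBE, and ubRMSE carry the unit of the target (here $/km²), while R, NSE, and KGE are dimensionless, which is why the description reports RMSE with a currency unit and the other scores as plain numbers.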
id doaj-art-591ae5cc710b4c41a8f10e2143725614
institution Kabale University
issn 2072-4292
spelling doaj-art-591ae5cc710b4c41a8f10e2143725614; 2025-08-20T03:29:48Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-06-01; volume 17, issue 12, article 1960; DOI 10.3390/rs17121960; Satellite Image Price Prediction Based on Machine Learning; Linhan Yang, Zugang Chen, Guoqing Li (all three: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China); https://www.mdpi.com/2072-4292/17/12/1960; satellite imagery pricing; ensemble learning; Bayesian optimization; SHAP analysis
topic satellite imagery pricing
ensemble learning
Bayesian optimization
SHAP analysis