A Review on Optimal Subsampling Method

Optimal subsampling is an efficient approach to massive data because it reduces both the volume of data and the computation time. Subsampling methods have long been essential to statistical analysis. In this article, we discuss several prominent subsampling methods,...

Bibliographic Details
Main Author: Xu Yuan
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: BIO Web of Conferences
Online Access: https://www.bio-conferences.org/articles/bioconf/pdf/2025/25/bioconf_icbb2025_03001.pdf
collection DOAJ
description Optimal subsampling is an efficient approach to massive data because it reduces both the volume of data and the computation time. Subsampling methods have long been essential to statistical analysis. In this article, we discuss several prominent subsampling methods, including subsampling based on leverage, the A/L-optimality criterion, the D-optimality criterion, and the Poisson subsampling strategy. For linear models, we find that leverage-based subsampling is simple to apply. Subsampling based on A/L-optimality is a general approach applicable to many models, including linear and generalized linear models. Only for linear models does subsampling based on D-optimality prove more effective than the other methods. Compared with subsampling with replacement, Poisson sampling is a more efficient technique that requires less memory and less processing time.
id doaj-art-745e8329155e4ca2b39c04df7b34b059
institution OA Journals
issn 2117-4458
spelling doaj-art-745e8329155e4ca2b39c04df7b34b059; 2025-08-20T02:26:01Z; eng; EDP Sciences; BIO Web of Conferences; 2117-4458; 2025-01-01; 174; 03001; 10.1051/bioconf/202517403001; bioconf_icbb2025_03001; A Review on Optimal Subsampling Method; Xu Yuan; The High School Affiliated to Renmin University of China; https://www.bio-conferences.org/articles/bioconf/pdf/2025/25/bioconf_icbb2025_03001.pdf