A Review on Optimal Subsampling Method

Bibliographic Details
Main Author: Xu Yuan
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: BIO Web of Conferences
Online Access: https://www.bio-conferences.org/articles/bioconf/pdf/2025/25/bioconf_icbb2025_03001.pdf
Description
Summary: Optimal subsampling is an efficient approach to massive data because it both reduces the amount of data to be processed and saves computational time. Subsampling methods have long been essential to statistical analysis. In this article, we discuss several prominent subsampling methods, including subsampling based on leverage, the A/L-optimality criterion, the D-optimality criterion, and the Poisson subsampling strategy. For linear models, we find that leverage-based subsampling is simple to apply. Subsampling based on A/L-optimality is a general approach applicable to many models, including linear models and generalized linear models. Subsampling based on D-optimality proves more effective than the other methods only in the case of linear models. Compared with subsampling with replacement, Poisson sampling is a more efficient subsampling technique, requiring less memory and less processing time.
ISSN: 2117-4458
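
Illustration: The summary contrasts leverage-based subsampling for linear models with Poisson subsampling. The sketch below is not taken from the article; it is a minimal illustration of the standard forms of these two ideas, assuming NumPy is available. All function names (leverage_scores, subsample_with_replacement, poisson_subsample, weighted_least_squares) are hypothetical, and the inverse-probability weighting shown is one common choice rather than the article's stated estimator.

```python
import numpy as np

def leverage_scores(X):
    # Thin QR factorization X = QR; the leverage of row i is the
    # squared Euclidean norm of row i of Q (diagonal of the hat matrix).
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

def subsample_with_replacement(probs, r, rng):
    # Draw exactly r row indices with replacement using probabilities probs.
    # Each draw gets inverse-probability weight 1 / (r * p_i).
    idx = rng.choice(len(probs), size=r, replace=True, p=probs)
    return idx, 1.0 / (r * probs[idx])

def poisson_subsample(probs, r, rng):
    # Decide on each row independently, keeping it with probability
    # min(1, r * p_i). The sample size is random (about r in expectation),
    # and no row can be selected twice.
    incl = np.minimum(1.0, r * probs)
    keep = rng.random(len(probs)) < incl
    idx = np.flatnonzero(keep)
    return idx, 1.0 / incl[idx]

def weighted_least_squares(X, y, w):
    # Solve the weighted least-squares problem by rescaling rows with sqrt(w).
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta
```

In this sketch the sampling probabilities would be the normalized leverage scores, probs = h / h.sum(). Because Poisson subsampling makes an independent keep-or-drop decision per row, it can be applied in a single pass over the data without holding all probabilities for a joint draw, which is consistent with the memory and time advantage the summary describes.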