Sample Size Impact (SaSii): An R script for estimating optimal sample sizes in population genetics and population genomics studies.

Obtaining large sample sizes for genetic studies can be challenging, time-consuming, and expensive, and small sample sizes may generate biased or imprecise results. Many studies have suggested the minimum sample size necessary to obtain robust and reliable results, but it is not possible to define o...

Bibliographic Details
Main Authors: Matheus Scaketti, Patricia Sanae Sujii, Alessandro Alves-Pereira, Kaiser Dias Schwarcz, Ana Flávia Francisconi, Matheus Sartori Moro, Kauanne Karolline Moreno Martins, Thiago Araujo de Jesus, Guilherme Brener Ferreira de Souza, Maria Imaculada Zucchi
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0316634
collection DOAJ
description Obtaining large sample sizes for genetic studies can be challenging, time-consuming, and expensive, and small sample sizes may generate biased or imprecise results. Many studies have suggested the minimum sample size necessary to obtain robust and reliable results, but it is not possible to define one ideal minimum sample size that fits all studies. Here, we present SaSii (Sample Size Impact), an R script to help researchers define the minimum sample size. Based on empirical and simulated data analysis using SaSii, we present patterns and suggest minimum sample sizes for experiment design. The patterns were obtained by analyzing previously published genotype datasets with SaSii and can be used as a starting point for the sample design of population genetics and genomic studies. Our results showed that it is possible to estimate an adequate sample size that accurately represents the real population without requiring the scientist to write any program code, extract and sequence samples, or use population genetics programs, thus simplifying the process. We also confirmed that the minimum sample sizes for SNP (single-nucleotide polymorphism) analysis are usually smaller than for SSR (simple sequence repeat) analysis and discussed other patterns observed from empirical plant and animal datasets.
id doaj-art-d12a99f33b1b4cb49f5a5e8f1e7c6c5e
institution DOAJ
issn 1932-6203
spelling doaj-art-d12a99f33b1b4cb49f5a5e8f1e7c6c5e; 2025-08-20T02:56:03Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 2; e0316634; 10.1371/journal.pone.0316634