Sample Size Impact (SaSii): An R script for estimating optimal sample sizes in population genetics and population genomics studies.

Bibliographic Details
Main Authors: Matheus Scaketti, Patricia Sanae Sujii, Alessandro Alves-Pereira, Kaiser Dias Schwarcz, Ana Flávia Francisconi, Matheus Sartori Moro, Kauanne Karolline Moreno Martins, Thiago Araujo de Jesus, Guilherme Brener Ferreira de Souza, Maria Imaculada Zucchi
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0316634
Description
Summary: Obtaining large sample sizes for genetic studies can be challenging, time-consuming, and expensive, and small sample sizes may generate biased or imprecise results. Many studies have suggested the minimum sample size necessary to obtain robust and reliable results, but it is not possible to define one ideal minimum sample size that fits all studies. Here, we present SaSii (Sample Size Impact), an R script to help researchers define the minimum sample size. Based on empirical and simulated data analysis using SaSii, we present patterns and suggest minimum sample sizes for experiment design. The patterns were obtained by analyzing previously published genotype datasets with SaSii and can be used as a starting point for the sample design of population genetics and genomic studies. Our results showed that it is possible to estimate an adequate sample size that accurately represents the real population without requiring the scientist to write any program code, extract and sequence samples, or use population genetics programs, thus simplifying the process. We also confirmed that the minimum sample sizes for SNP (single-nucleotide polymorphism) analysis are usually smaller than for SSR (simple sequence repeat) analysis and discussed other patterns observed from empirical plant and animal datasets.
ISSN: 1932-6203