What can we infer about mutation calling by using time‐series mutation accumulation data and a Bayesian Mutation Finder?

Abstract Accurate estimates of mutation rates derived from genome‐wide mutation accumulation (MA) data are fundamental to understanding basic evolutionary processes. The rapidly improving high‐throughput sequencing technologies provide unprecedented opportunities to identify single nucleotide mutati...

Bibliographic Details
Main Authors: Takahiro Maruki, April Ozere, Jack Freeman, Melania E. Cristescu
Format: Article
Language: English
Published: Wiley 2024-11-01
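Series: Ecology and Evolution
Subjects: Bayesian Mutation Finder; Daphnia pulex; mutation rate; single nucleotide mutations; time‐series mutation accumulation data
Online Access: https://doi.org/10.1002/ece3.70339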
author Takahiro Maruki
April Ozere
Jack Freeman
Melania E. Cristescu
collection DOAJ
description Abstract Accurate estimates of mutation rates derived from genome‐wide mutation accumulation (MA) data are fundamental to understanding basic evolutionary processes. The rapidly improving high‐throughput sequencing technologies provide unprecedented opportunities to identify single nucleotide mutations across genomes. However, such MA derived data are often difficult to analyze and the performance of the available methods of analysis is not well understood. In this study, we used the existing Bayesian Genotype Caller adapted for MA data that we refer to as Bayesian Mutation Finder (BMF) for identifying single nucleotide mutations while considering the characteristics of the data. We compared the performance of BMF with the widely used Genome Analysis Toolkit (GATK) by applying these two methods to time‐series MA data as well as simulated data. The time‐series data were obtained by propagating Daphnia pulex over an average of 188 generations and performing whole‐genome sequencing of 14 MA lines across three time points. The results indicate that BMF enables more accurate identification of single nucleotide mutations than GATK especially when applied to the empirical data. Furthermore, BMF involves the use of fewer parameters and is more computationally efficient than GATK. Both BMF and GATK found surprisingly many candidate mutations that were not confirmed at later time points. We systematically infer causes of the unconfirmed candidate mutations, introduce a framework for estimating mutation rates based on genome‐wide candidate mutations confirmed by subsequent sequencing, and provide an improved mutation rate estimate for D. pulex.
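The abstract mentions estimating mutation rates from candidate mutations confirmed by subsequent sequencing, but it does not give the estimator itself. As a minimal sketch only, the following shows a common pooled per-site, per-generation estimate for MA lines (confirmed mutations divided by total site-generations). The class name, field names, and example numbers are hypothetical and are not taken from the article; ploidy corrections and the authors' actual framework are not represented here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MALine:
    """One mutation accumulation line (hypothetical fields, for illustration)."""
    confirmed_mutations: int  # candidates observed again at a later time point
    callable_sites: int       # sites where a mutation call was possible
    generations: float        # generations the line was propagated


def pooled_mutation_rate(lines: Iterable[MALine]) -> float:
    """Return confirmed mutations per site per generation, pooled over lines."""
    lines = list(lines)
    total_mutations = sum(line.confirmed_mutations for line in lines)
    # Each line contributes (sites it could be called at) x (generations elapsed).
    site_generations = sum(line.callable_sites * line.generations for line in lines)
    if site_generations <= 0:
        raise ValueError("no callable site-generations to estimate a rate from")
    return total_mutations / site_generations


# Made-up example: two lines, 1,000,000 callable sites each, 100 generations each.
example_lines = [MALine(2, 1_000_000, 100), MALine(0, 1_000_000, 100)]
example_rate = pooled_mutation_rate(example_lines)
```

Counting only mutations that reappear at a later time point keeps unconfirmed candidates out of the numerator, which is the motivation the abstract gives for using time‐series data.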
format Article
id doaj-art-669a71d9bf124910874f72e2ec526b5a
institution DOAJ
issn 2045-7758
language English
publishDate 2024-11-01
publisher Wiley
record_format Article
series Ecology and Evolution
spelling Ecology and Evolution, vol. 14, no. 11 (2024-11-01), ISSN 2045-7758, https://doi.org/10.1002/ece3.70339. Authors: Takahiro Maruki, April Ozere, Jack Freeman, Melania E. Cristescu (all Department of Biology, McGill University, Montreal, Quebec, Canada). Record timestamp 2025-08-20T02:52:38Z.
title What can we infer about mutation calling by using time‐series mutation accumulation data and a Bayesian Mutation Finder?
topic Bayesian Mutation Finder
Daphnia pulex
mutation rate
single nucleotide mutations
time‐series mutation accumulation data
url https://doi.org/10.1002/ece3.70339