ReproPhylo: An Environment for Reproducible Phylogenomics.

The reproducibility of experiments is key to the scientific process, and particularly necessary for accurate reporting of analyses in data-rich fields such as phylogenomics. We present ReproPhylo, a phylogenomic analysis environment developed to ensure experimental reproducibility, to facilitate the...

Bibliographic Details
Main Authors: Amir Szitenberg, Max John, Mark L Blaxter, David H Lunt
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-09-01
Series: PLoS Computational Biology
Online Access: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004447&type=printable
collection DOAJ
description The reproducibility of experiments is key to the scientific process, and particularly necessary for accurate reporting of analyses in data-rich fields such as phylogenomics. We present ReproPhylo, a phylogenomic analysis environment developed to ensure experimental reproducibility, to facilitate the handling of large-scale data, and to assist methodological experimentation. Reproducibility and instantaneous repeatability are built into the ReproPhylo system and require no user intervention or configuration, because the system stores the experimental workflow as a single, serialized Python object containing explicit provenance and environment information. This 'single file' approach ensures the persistence of provenance across iterations of the analysis, with changes automatically managed by the version control program Git. This file and a Git repository are the primary reproducibility outputs of the program. In addition, ReproPhylo produces an extensive human-readable report and generates a comprehensive experimental archive file, both of which are suitable for submission with publications. The system facilitates thorough experimental exploration of both parameters and data. ReproPhylo is a platform-independent CC0 Python module and is easily installed as a Docker image or a WinPython self-sufficient package, with a Jupyter Notebook GUI, or as a slimmer version in a Galaxy distribution.
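The 'single file' idea in the description can be illustrated with a minimal sketch. It is not ReproPhylo's own API: the function names (new_workflow, record_step, save_workflow, load_workflow), the dictionary layout, and the example step parameters are all hypothetical, and only the Python standard library is used. The sketch assumes that "provenance" means a list of recorded steps and that "environment information" means the interpreter version and platform string. The Git-managed versioning mentioned in the description is not shown.

```python
import os
import pickle
import platform
import sys
import tempfile


def new_workflow(name):
    # Hypothetical layout: one dict holds the steps and the environment.
    return {"name": name, "steps": [], "environment": {}}


def record_step(workflow, step, **params):
    # Keep each analysis step and its parameters as a provenance entry.
    workflow["steps"].append({"step": step, "params": params})


def save_workflow(workflow, path):
    # Capture environment details at save time, then serialize the whole
    # workflow into a single file with pickle.
    workflow["environment"] = {
        "python": sys.version,
        "platform": platform.platform(),
    }
    with open(path, "wb") as handle:
        pickle.dump(workflow, handle)


def load_workflow(path):
    # Restore the workflow, including provenance and environment, from disk.
    # Only unpickle files you created yourself; pickle can run arbitrary code.
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    workflow = new_workflow("demo")
    # Step names and parameters are illustrative placeholders.
    record_step(workflow, "align", program="example_aligner", gap_open=1.5)
    record_step(workflow, "build_tree", model="example_model")
    path = os.path.join(tempfile.mkdtemp(), "workflow.pkl")
    save_workflow(workflow, path)
    restored = load_workflow(path)
    print(restored["steps"])
    print(restored["environment"]["platform"])
```

Because every step and the environment snapshot travel in one file, reloading that file is enough to see what was run and under which interpreter; committing the file to a version control repository after each save would add the change history that the description attributes to Git.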
id doaj-art-9e95582fd5a34eebb45a9cd752b63315
institution OA Journals
issn 1553-734X
1553-7358
volume 11
issue 9
article e1004447
doi 10.1371/journal.pcbi.1004447