MacSyFinder v2: Improved modelling and search engine to identify molecular systems in genomes

Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. Macromolecular System Finder (MacSyFinder) is a program that uses these properties to model and then annotate cellular functions in microbial genomes. This is done by integra...

Bibliographic Details
Main Authors: Néron, Bertrand; Denise, Rémi; Coluzzi, Charles; Touchon, Marie; Rocha, Eduardo P.C.; Abby, Sophie S.
Format: Article
Language: English
Published: Peer Community In 2023-03-01
Series: Peer Community Journal
Online Access: https://peercommunityjournal.org/articles/10.24072/pcjournal.250/
_version_ 1825206382976040960
author Néron, Bertrand
Denise, Rémi
Coluzzi, Charles
Touchon, Marie
Rocha, Eduardo P.C.
Abby, Sophie S.
author_sort Néron, Bertrand
collection DOAJ
description Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. Macromolecular System Finder (MacSyFinder) is a program that uses these properties to model and then annotate cellular functions in microbial genomes. This is done by integrating the identification of each individual gene at the level of the molecular system. We hereby present a major release of MacSyFinder (version 2) coded in Python 3. The code was improved and rationalized to facilitate future maintainability. Several new features were added to allow more flexible modelling of the systems. We introduce a more intuitive and comprehensive search engine to identify all the best candidate systems and sub-optimal ones that respect the models’ constraints. We also introduce the novel macsydata companion tool that enables the easy installation and broad distribution of the models developed for MacSyFinder (macsy-models) from GitHub repositories. Finally, we have updated and improved MacSyFinder popular models: TXSScan to identify protein secretion systems, TFFscan to identify type IV filaments, CONJscan to identify conjugative systems, and CasFinder to identify CRISPR associated proteins. MacSyFinder and the updated models are available at: https://github.com/gem-pasteur/macsyfinder and https://github.com/macsy-models.
format Article
id doaj-art-2ebc353f03c54ff8997e747060c25dbd
institution Kabale University
issn 2804-3871
language English
publishDate 2023-03-01
publisher Peer Community In
record_format Article
series Peer Community Journal
spelling doaj-art-2ebc353f03c54ff8997e747060c25dbd
2025-02-07T10:16:49Z
eng
Peer Community In
Peer Community Journal
2804-3871
2023-03-01
3
10.24072/pcjournal.250
MacSyFinder v2: Improved modelling and search engine to identify molecular systems in genomes
Néron, Bertrand (0) https://orcid.org/0000-0002-0220-0482
Denise, Rémi (1) https://orcid.org/0000-0003-2277-689X
Coluzzi, Charles (2) https://orcid.org/0000-0003-2238-0836
Touchon, Marie (3) https://orcid.org/0000-0001-7389-447X
Rocha, Eduardo P.C. (4) https://orcid.org/0000-0001-7704-822X
Abby, Sophie S. (5) https://orcid.org/0000-0002-5231-3346
Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics HUB, Paris, France
Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France; APC Microbiome Ireland & School of Microbiology, University College Cork, Cork, Ireland
Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France
Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France
Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France
Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
https://peercommunityjournal.org/articles/10.24072/pcjournal.250/
title MacSyFinder v2: Improved modelling and search engine to identify molecular systems in genomes
url https://peercommunityjournal.org/articles/10.24072/pcjournal.250/