Computer analysis of co-localization of transcription factor binding sites in genome by ChIP-seq data

Statistical features of the distribution of transcription factor binding sites in the mouse genome that are obtained by ChIP-seq experiments in embryonic stem cells have been considered. Clusters of sites that contain four or more different transcription factor binding sites in the mouse genome have...

Bibliographic Details
Main Authors: A. I. Dergilev, A. M. Spitsina, I. V. Chadaeva, A. V. Svichkarev, F. M. Naumenko, E. V. Kulakova, E. R. Galieva, E. E. Vityaev, M. Chen, Y. L. Orlov
Format: Article
Language: English
Published: Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders 2017-02-01
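Series: Вавиловский журнал генетики и селекции
Subjects: transcription factor binding sites; embryonic stem cells; data mining; regularity discovery; ChIP-seq; enhancers
Online Access: https://vavilov.elpub.ru/jour/article/view/850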
author A. I. Dergilev
A. M. Spitsina
I. V. Chadaeva
A. V. Svichkarev
F. M. Naumenko
E. V. Kulakova
E. R. Galieva
E. E. Vityaev
M. Chen
Y. L. Orlov
collection DOAJ
description Statistical features of the distribution of transcription factor binding sites in the mouse genome, obtained from ChIP-seq experiments in embryonic stem cells, were analyzed. Clusters containing binding sites for four or more different transcription factors were defined, and their location relative to the regulatory regions of genes was described. Two types of site co-localization were found: clusters containing binding sites for Oct4, Nanog, and Sox2, located in distal regions, and clusters containing binding sites for n-Myc and c-Myc, located mainly in the promoter regions of mouse genes. Analysis of new ChIP-seq data on the binding of the transcription factors Nr5a2 and Tbx3 in the same cell type confirmed the division of binding-site clusters into two types: those that contain binding sites of pluripotency regulators (Oct4, Nanog, and others) and those that do not. A computer program was developed for statistical processing of gene locations and chromatin domains; it analyzes experimental site-localization data obtained by ChIP-seq in the mouse and human genomes. Using this program, positional preferences of transcription factor binding sites of different types were revealed, and the distances between the nearest groups of Oct4, Nanog, and Sox2 binding sites and n-Myc and c-Myc binding sites were calculated. The presence of nucleotide motifs of transcription factor binding sites in the selected ChIP-seq regions was assessed, and the motifs were refined. A correlation between motif presence and ChIP-seq binding intensity was shown. Computational methods for estimating the clustering of binding sites of different transcription factors in new ChIP-seq data were developed. The programs are available from the authors upon request.
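The clustering and nearest-distance steps named in the description can be illustrated with a minimal Python sketch. It is not the authors' program, which is available only on request. The function names, the 500 bp gap threshold, the single-linkage grouping rule, and the toy coordinates are all assumptions made for illustration; only the "four or more different transcription factors" criterion and the factor names come from the record.

```python
from bisect import bisect_left
from collections import defaultdict


def _emit(chrom, group, min_factors, clusters):
    """Keep a group of sites only if it has enough distinct factors."""
    factors = frozenset(factor for _, factor in group)
    if group and len(factors) >= min_factors:
        clusters.append((chrom, group[0][0], group[-1][0], factors))


def find_clusters(sites, max_gap=500, min_factors=4):
    """Group sites on each chromosome by chaining neighbours.

    sites: iterable of (chrom, position, factor) tuples.
    max_gap: hypothetical threshold (bp); neighbouring sites farther
             apart than this start a new group.
    Returns a list of (chrom, start, end, factors) tuples for groups
    containing at least `min_factors` different factors.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, factor in sites:
        by_chrom[chrom].append((pos, factor))

    clusters = []
    for chrom in sorted(by_chrom):
        group = []
        for pos, factor in sorted(by_chrom[chrom]):
            if group and pos - group[-1][0] > max_gap:
                _emit(chrom, group, min_factors, clusters)
                group = []
            group.append((pos, factor))
        _emit(chrom, group, min_factors, clusters)
    return clusters


def nearest_distance(pos, sorted_positions):
    """Distance from pos to the closest entry of a sorted position list.

    Returns None when the list is empty.
    """
    if not sorted_positions:
        return None
    i = bisect_left(sorted_positions, pos)
    candidates = sorted_positions[max(i - 1, 0):i + 1]
    return min(abs(pos - p) for p in candidates)


# Toy coordinates, invented for illustration only.
toy_sites = [
    ("chr1", 100, "Oct4"), ("chr1", 150, "Sox2"),
    ("chr1", 220, "Nanog"), ("chr1", 300, "Tbx3"),
    ("chr1", 5000, "c-Myc"), ("chr1", 5100, "n-Myc"),
]
toy_clusters = find_clusters(toy_sites)
```

On the toy input, the four neighbouring sites form one cluster, while the two Myc sites do not reach the four-factor threshold. In practice `nearest_distance` would be called once per site of one group against the sorted positions of the other group on the same chromosome.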
format Article
id doaj-art-c2385c4a41674aa3ad820eb415c43dcf
institution Kabale University
issn 2500-3259
language English
publishDate 2017-02-01
publisher Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders
record_format Article
series Вавиловский журнал генетики и селекции
spelling doaj-art-c2385c4a41674aa3ad820eb415c43dcf
2025-02-01T09:58:03Z
eng
Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders
Вавиловский журнал генетики и селекции
2500-3259
2017-02-01
20
6
770
778
10.18699/VJ16.194540
Computer analysis of co-localization of transcription factor binding sites in genome by ChIP-seq data
A. I. Dergilev (0): Novosibirsk State University; Institute of Cytology and Genetics SB RAS
A. M. Spitsina (1): Novosibirsk State University
I. V. Chadaeva (2): Novosibirsk State University; Institute of Cytology and Genetics SB RAS
A. V. Svichkarev (3): Novosibirsk State University; Peter the Great St. Petersburg Polytechnic University
F. M. Naumenko (4): Novosibirsk State University
E. V. Kulakova (5): Novosibirsk State University
E. R. Galieva (6): Novosibirsk State University; Institute of Cytology and Genetics SB RAS
E. E. Vityaev (7): Institute of Cytology and Genetics SB RAS; Institute of Mathematics SB RAS
M. Chen (8): Zhejiang University
Y. L. Orlov (9): Novosibirsk State University; Institute of Cytology and Genetics SB RAS
https://vavilov.elpub.ru/jour/article/view/850
transcription factor binding sites
embryonic stem cells
data mining
regularity discovery
chip-seq
enhancers.
title Computer analysis of co-localization of transcription factor binding sites in genome by ChIP-seq data
topic transcription factor binding sites
embryonic stem cells
data mining
regularity discovery
ChIP-seq
enhancers
url https://vavilov.elpub.ru/jour/article/view/850