BTFBS: Binding Prediction of Bacterial Transcription Factors and Binding Sites Based on Deep Learning

Bibliographic Details
Main Authors: Bingbing Jin, Song Liang, Xiaoqian Liu, Rui Zhang, Yun Zhu, Yuanyuan Chen, Guangjin Liu, Tao Yang
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Mathematics
Subjects: transcription factors; binding sites; deep learning; bacteria
Online Access: https://www.mdpi.com/2227-7390/13/4/589
_version_ 1850231526049972224
author Bingbing Jin
Song Liang
Xiaoqian Liu
Rui Zhang
Yun Zhu
Yuanyuan Chen
Guangjin Liu
Tao Yang
collection DOAJ
description The binding of transcription factors (TFs) to TF binding sites plays a vital role in the regulation of gene expression and evolution. With the development of machine learning and deep learning, some success has been achieved in predicting transcription factors and binding sites. In this paper, we develop a model, BTFBS, which predicts whether a bacterial transcription factor and a binding site bind to each other. The model takes both the amino acid sequences of bacterial transcription factors and the nucleotide sequences of binding sites as inputs, and extracts features through a convolutional neural network and MultiheadAttention. For the model inputs, we use two methods of sampling negative samples: RS and EE. On the RS test dataset, the accuracy, sensitivity, specificity, F1-score, and MCC of BTFBS are 0.91446, 0.89746, 0.93134, 0.91264, and 0.82946, respectively. On the EE test dataset, the accuracy, sensitivity, specificity, F1-score, and MCC of BTFBS are 0.87868, 0.89354, 0.86394, 0.87996, and 0.75796, respectively. Our findings also indicate that, in bacterial research, the best way to obtain negative samples is to use the whole genome sequences of the corresponding bacteria rather than the shuffling method. These test results show that the proposed BTFBS model performs well and can serve as a guide for experiments.
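The five scores reported in the description are standard binary-classification metrics. As a minimal sketch, assuming the usual confusion-matrix definitions (the record itself does not state the formulas the authors used), they can be computed as below. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes the standard textbook definitions; all denominators must be non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # recall on the positive (binding) class
    specificity = tn / (tn + fp)          # recall on the negative (non-binding) class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient: ranges from -1 to 1, 0 means chance level
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the paper)
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is presumably why it is reported alongside sensitivity and specificity.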
format Article
id doaj-art-10df92e40a2d426e994c2ceed71b3e28
institution OA Journals
issn 2227-7390
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-10df92e40a2d426e994c2ceed71b3e28 2025-08-20T02:03:31Z eng MDPI AG Mathematics 2227-7390 2025-02-01 13 4 589 10.3390/math13040589
BTFBS: Binding Prediction of Bacterial Transcription Factors and Binding Sites Based on Deep Learning
Bingbing Jin (0) College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Song Liang (1) College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
Xiaoqian Liu (2) College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Rui Zhang (3) College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Yun Zhu (4) College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Yuanyuan Chen (5) College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Guangjin Liu (6) College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
Tao Yang (7) College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
https://www.mdpi.com/2227-7390/13/4/589
transcription factors; binding sites; deep learning; bacteria
title BTFBS: Binding Prediction of Bacterial Transcription Factors and Binding Sites Based on Deep Learning
topic transcription factors
binding sites
deep learning
bacteria
url https://www.mdpi.com/2227-7390/13/4/589