Functional Prediction of Hypothetical Proteins from Shigella flexneri and Validation of the Predicted Models by Using ROC Curve Analysis

Bibliographic Details
Main Authors: Md. Amran Gazi, Sultan Mahmud, Shah Mohammad Fahim, Mohammad Golam Kibria, Parag Palit, Md. Rezaul Islam, Humaira Rashid, Subhasish Das, Mustafa Mahfuz, Tahmeed Ahmeed
Format: Article
Language: English
Published: BioMed Central 2018-12-01
Series: Genomics & Informatics
Subjects: hypothetical protein; NCBI; ROC curve
Online Access: http://genominfo.org/upload/pdf/gi-2018-16-4-e26.pdf
_version_ 1832573066345971712
author Md. Amran Gazi
Sultan Mahmud
Shah Mohammad Fahim
Mohammad Golam Kibria
Parag Palit
Md. Rezaul Islam
Humaira Rashid
Subhasish Das
Mustafa Mahfuz
Tahmeed Ahmeed
author_sort Md. Amran Gazi
collection DOAJ
description Shigella spp. are among the key pathogens responsible for the global burden of diarrhoeal disease. With over 164 million reported cases per annum, shigellosis accounts for 1.1 million deaths each year. The majority of these cases occur among children in developing nations, and the emergence of multidrug-resistant Shigella strains in clinical isolates demands the development of new and better drugs against this pathogen. Extensive analysis of the Shigella flexneri genome identified 4,362 proteins, of which 674, termed hypothetical proteins (HPs), had no previously elucidated function. The amino acid sequences of all 674 HPs were studied, and functions were assigned to 39 HPs with a high level of confidence. Here we used a combination of the latest versions of databases to assign precise functions to HPs for which no experimental information is available. These HPs were found to belong to various classes of proteins, such as enzymes, binding proteins, signal transducers, lipoproteins, transporters, virulence proteins, and other proteins. The performance of the various computational tools was evaluated using receiver operating characteristic (ROC) curve analysis, and a high average accuracy of 93.6% was obtained. Our comprehensive analysis should aid the development of novel therapeutic interventions against Shigella infection.
format Article
id doaj-art-31f6a90f0215428b87eee9bf157cef7b
institution Kabale University
issn 2234-0742
language English
publishDate 2018-12-01
publisher BioMed Central
record_format Article
series Genomics & Informatics
spelling doaj-art-31f6a90f0215428b87eee9bf157cef7b
2025-02-02T05:29:04Z
eng
BioMed Central
Genomics & Informatics
2234-0742
2018-12-01
16
4
10.5808/GI.2018.16.4.e26
526
Functional Prediction of Hypothetical Proteins from Shigella flexneri and Validation of the Predicted Models by Using ROC Curve Analysis
Md. Amran Gazi (0)
Sultan Mahmud (1)
Shah Mohammad Fahim (2)
Mohammad Golam Kibria (3)
Parag Palit (4)
Md. Rezaul Islam (5)
Humaira Rashid (6)
Subhasish Das (7)
Mustafa Mahfuz (8)
Tahmeed Ahmeed (9)
(0) Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(1) Infectious Diseases Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(2) Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(3) Infectious Diseases Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(4) Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(5) International Max Planck Research School, Grisebachstraße 5, 37077 Göttingen, Germany
(6) Infectious Diseases Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(7) Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(8) Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
(9) Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
http://genominfo.org/upload/pdf/gi-2018-16-4-e26.pdf
hypothetical protein
NCBI
ROC curve
title Functional Prediction of Hypothetical Proteins from Shigella flexneri and Validation of the Predicted Models by Using ROC Curve Analysis
topic hypothetical protein

NCBI
ROC curve

url http://genominfo.org/upload/pdf/gi-2018-16-4-e26.pdf