Improved annotation of 3' untranslated regions and complex loci by combination of strand-specific direct RNA sequencing, RNA-Seq and ESTs.

The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation, in addition to the underlying genomic sequence, is particularly important when interpreting the results of RNA-Seq experiments, where short sequence rea...

Bibliographic Details
Main Authors: Nicholas J Schurch, Christian Cole, Alexander Sherstnev, Junfang Song, Céline Duc, Kate G Storey, W H Irwin McLean, Sara J Brown, Gordon G Simpson, Geoffrey J Barton
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0094270
_version_ 1850212770814885888
author Nicholas J Schurch
Christian Cole
Alexander Sherstnev
Junfang Song
Céline Duc
Kate G Storey
W H Irwin McLean
Sara J Brown
Gordon G Simpson
Geoffrey J Barton
author_sort Nicholas J Schurch
collection DOAJ
description The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation, in addition to the underlying genomic sequence, is particularly important when interpreting the results of RNA-Seq experiments, where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3' untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for human, chicken and A. thaliana samples by the novel single-molecule, strand-specific Direct RNA Sequencing technology from Helicos Biosciences, which locates 3' polyadenylation sites to within ±2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3' UTR re-annotation (including extension of one 3' UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression; and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and to those seeking to interpret existing publicly available annotations in the context of their own experimental data.
format Article
id doaj-art-34addfaf11e64ceba92df6606549372e
institution OA Journals
issn 1932-6203
language English
publishDate 2014-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-34addfaf11e64ceba92df6606549372e 2025-08-20T02:09:15Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2014-01-01 9 4 e94270 10.1371/journal.pone.0094270 Improved annotation of 3' untranslated regions and complex loci by combination of strand-specific direct RNA sequencing, RNA-Seq and ESTs. Nicholas J Schurch, Christian Cole, Alexander Sherstnev, Junfang Song, Céline Duc, Kate G Storey, W H Irwin McLean, Sara J Brown, Gordon G Simpson, Geoffrey J Barton https://doi.org/10.1371/journal.pone.0094270
title Improved annotation of 3' untranslated regions and complex loci by combination of strand-specific direct RNA sequencing, RNA-Seq and ESTs.
title_sort improved annotation of 3 untranslated regions and complex loci by combination of strand specific direct rna sequencing rna seq and ests
url https://doi.org/10.1371/journal.pone.0094270