Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and a...
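Main Authors: | Rodrigo Aniceto, Rene Xavier, Valeria Guimarães, Fernanda Hondo, Maristela Holanda, Maria Emilia Walter, Sérgio Lifschitz |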
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2015-01-01 |
Series: | International Journal of Genomics |
Online Access: | http://dx.doi.org/10.1155/2015/502795 |
author | Rodrigo Aniceto, Rene Xavier, Valeria Guimarães, Fernanda Hondo, Maristela Holanda, Maria Emilia Walter, Sérgio Lifschitz |
---|---|
collection | DOAJ |
description | Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. |
id | doaj-art-dcefeb6fb3a941f98f80b7e852ab7458 |
institution | Kabale University |
issn | 2314-436X 2314-4378 |
affiliation | Rodrigo Aniceto, Rene Xavier, Valeria Guimarães, Fernanda Hondo, Maristela Holanda, Maria Emilia Walter: Computer Science Department, University of Brasilia (UNB), 70910-900 Brasilia, DF, Brazil; Sérgio Lifschitz: Informatics Department, Pontifical Catholic University of Rio de Janeiro (PUC-Rio), 22451-900 Rio de Janeiro, RJ, Brazil |
title | Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency |