An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function

To decode a long genome sequence, shotgun sequencing is the state-of-the-art technique. It needs to properly sequence a very large number, sometimes as large as millions, of short partially readable strings (fragments). Arranging those fragments in correct sequence is known as fragment assembling, w...

Full description

Saved in:
Bibliographic Details
Main Authors: Satoko Kikuchi, Goutam Chakraborty
Format: Article
Language:English
Published: Wiley 2012-01-01
Series:Applied Computational Intelligence and Soft Computing
Online Access:http://dx.doi.org/10.1155/2012/945401
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1832554690566422528
author Satoko Kikuchi
Goutam Chakraborty
author_facet Satoko Kikuchi
Goutam Chakraborty
author_sort Satoko Kikuchi
collection DOAJ
description To decode a long genome sequence, shotgun sequencing is the state-of-the-art technique. It needs to properly sequence a very large number, sometimes as large as millions, of short partially readable strings (fragments). Arranging those fragments in correct sequence is known as fragment assembling, which is an NP-problem. Presently used methods require enormous computational cost. In this work, we have shown how our modified genetic algorithm (GA) could solve this problem efficiently. In the proposed GA, the length of the chromosome, which represents the volume of the search space, is reduced with advancing generations, and thereby improves search efficiency. We also introduced a greedy mutation, by swapping nearby fragments using some heuristics, to improve the fitness of chromosomes. We compared results with Parsons’ algorithm which is based on GA too. We used fragments with partial reads on both sides, mimicking fragments in real genome assembling process. In Parsons’ work base-pair array of the whole fragment is known. Even then, we could obtain much better results, and we succeeded in restructuring contigs covering 100% of the genome sequences.
format Article
id doaj-art-20973baaf895492894812e609b3dece3
institution Kabale University
issn 1687-9724
1687-9732
language English
publishDate 2012-01-01
publisher Wiley
record_format Article
series Applied Computational Intelligence and Soft Computing
spelling doaj-art-20973baaf895492894812e609b3dece32025-02-03T05:50:51ZengWileyApplied Computational Intelligence and Soft Computing1687-97241687-97322012-01-01201210.1155/2012/945401945401An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness FunctionSatoko Kikuchi0Goutam Chakraborty1Faculty of Software and Information Science, Iwate Prefectural University, Takizawa-mura, Iwate 020-0193, JapanFaculty of Software and Information Science, Iwate Prefectural University, Takizawa-mura, Iwate 020-0193, JapanTo decode a long genome sequence, shotgun sequencing is the state-of-the-art technique. It needs to properly sequence a very large number, sometimes as large as millions, of short partially readable strings (fragments). Arranging those fragments in correct sequence is known as fragment assembling, which is an NP-problem. Presently used methods require enormous computational cost. In this work, we have shown how our modified genetic algorithm (GA) could solve this problem efficiently. In the proposed GA, the length of the chromosome, which represents the volume of the search space, is reduced with advancing generations, and thereby improves search efficiency. We also introduced a greedy mutation, by swapping nearby fragments using some heuristics, to improve the fitness of chromosomes. We compared results with Parsons’ algorithm which is based on GA too. We used fragments with partial reads on both sides, mimicking fragments in real genome assembling process. In Parsons’ work base-pair array of the whole fragment is known. Even then, we could obtain much better results, and we succeeded in restructuring contigs covering 100% of the genome sequences.http://dx.doi.org/10.1155/2012/945401
spellingShingle Satoko Kikuchi
Goutam Chakraborty
An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function
Applied Computational Intelligence and Soft Computing
title An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function
title_full An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function
title_fullStr An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function
title_full_unstemmed An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function
title_short An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function
title_sort efficient genome fragment assembling using ga with neighborhood aware fitness function
url http://dx.doi.org/10.1155/2012/945401
work_keys_str_mv AT satokokikuchi anefficientgenomefragmentassemblingusinggawithneighborhoodawarefitnessfunction
AT goutamchakraborty anefficientgenomefragmentassemblingusinggawithneighborhoodawarefitnessfunction
AT satokokikuchi efficientgenomefragmentassemblingusinggawithneighborhoodawarefitnessfunction
AT goutamchakraborty efficientgenomefragmentassemblingusinggawithneighborhoodawarefitnessfunction