Menconi, Giulia and Marangoni, Roberto (2005) A compression-based approach for coding sequences identification in prokaryotic genomes. Technical Report del Dipartimento di Informatica. Università di Pisa, Pisa, IT.
PDF (GZip) - Published Version. Available under License Creative Commons Attribution No Derivatives.
Abstract
Identifying coding regions in genomic sequences is the first step toward further analysis of the biological function carried out by the different functional elements in a genome. This paper presents a novel method for classifying coding and non-coding regions in prokaryotic genomes, based on a suitably defined compression index of a DNA sequence. The proposed approach has been applied to several complete prokaryotic genomes, obtaining optimal scores for correctly recognized coding and non-coding regions. Several false-positive and false-negative cases have been investigated in detail, revealing that the approach can fail in the presence of highly structured coding regions (e.g., genes coding for modular proteins) or quasi-random non-coding regions (e.g., regions hosting non-functional fragments of copies of functional genes, or regions hosting promoters or other protein-binding sequences).
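The abstract does not define the compression index itself, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a stand-in index (zlib-compressed size divided by raw size) computed over sliding windows; the function names, window size, and step are hypothetical choices, and the sketch does not say which index values or thresholds would separate coding from non-coding regions.

```python
import random
import zlib


def compression_index(seq: str) -> float:
    """Hypothetical stand-in index: compressed size / raw size.

    Lower values mean the sequence is more compressible (more structured);
    values near the top of the range mean it looks closer to random.
    """
    raw = seq.upper().encode("ascii")
    if not raw:
        raise ValueError("sequence must be non-empty")
    return len(zlib.compress(raw, 9)) / len(raw)


def window_indices(genome: str, window: int = 300, step: int = 150):
    """Yield (start, index) for each sliding window over the genome.

    The window and step sizes are illustrative assumptions only.
    """
    for start in range(0, len(genome) - window + 1, step):
        yield start, compression_index(genome[start:start + window])


# Synthetic data only: one quasi-random sequence and one highly repetitive one.
rng = random.Random(0)
random_like = "".join(rng.choice("ACGT") for _ in range(3000))
repetitive = "ATGGCC" * 500

if __name__ == "__main__":
    print("random-like:", round(compression_index(random_like), 3))
    print("repetitive: ", round(compression_index(repetitive), 3))
    for start, index in window_indices(random_like):
        print(start, round(index, 3))
```

On this synthetic input the repetitive sequence compresses far better than the quasi-random one, which is the kind of contrast a compression-based classifier relies on. A general-purpose compressor such as zlib is used here only for convenience; the report may define its index quite differently.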
Item Type: | Book |
---|---|
Uncontrolled Keywords: | Compression, Coding regions, Genes, Prokaryote. |
Subjects: | Area01 - Scienze matematiche e informatiche > INF/01 - Informatica |
Divisions: | Dipartimenti (until 2012) > DIPARTIMENTO DI INFORMATICA |
Depositing User: | dott.ssa Sandra Faita |
Date Deposited: | 09 Dec 2014 11:19 |
Last Modified: | 09 Dec 2014 11:19 |
URI: | http://eprints.adm.unipi.it/id/eprint/2136 |