UnipiEprints
Università di Pisa
Sistema bibliotecario di ateneo

Masking Patterns in Sequences: A New Class of Motif Discovery with Don't Cares

Battaglia, Giovanni and Grossi, Roberto and Cangelosi, Davide and Pisanti, Nadia (2009) Masking Patterns in Sequences: A New Class of Motif Discovery with Don't Cares. Theoretical Computer Science, 410 (43). pp. 4327-4340. ISSN 0304-3975

    Abstract

    We introduce a new notion of motifs, called masks, that succinctly represents the repeated patterns for an input sequence T of n symbols drawn from an alphabet Σ. We show how to build the set of all frequent maximal masks of length L in O(2^L n) time and space in the worst case, using the Karp-Miller-Rosenberg approach. We analytically show that our algorithm performs better than the method based on enumerating and checking, in constant time each, all the (|Σ| + 1)^L potential candidate patterns in T, after a polynomial-time preprocessing of T. Our algorithm is also cache-friendly, attaining O(2^L sort(n)) block transfers, where sort(n) is the cache complexity of sorting n items.
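To make the size of the baseline search space concrete, the following is a minimal sketch of the naive enumerate-and-check method that the abstract compares against. It is not the authors' algorithm. The don't-care symbol `.`, the function names, and the frequency threshold `min_occurrences` are assumptions introduced here for illustration, and each candidate is checked by a direct scan of T rather than in constant time after preprocessing.

```python
from itertools import product

DONT_CARE = "."  # assumed wildcard symbol; must not occur in the text itself


def matches_at(text, pattern, i):
    """Return True if pattern matches text at position i, treating DONT_CARE as a wildcard."""
    return all(p == DONT_CARE or p == text[i + j] for j, p in enumerate(pattern))


def naive_frequent_patterns(text, length, min_occurrences):
    """Enumerate all (|alphabet| + 1)^length candidates and keep the frequent ones.

    Returns a dict mapping each pattern that occurs at least
    min_occurrences times to its number of occurrences.
    """
    alphabet = sorted(set(text))
    symbols = alphabet + [DONT_CARE]
    frequent = {}
    for candidate in product(symbols, repeat=length):
        pattern = "".join(candidate)
        # Direct scan over every window of the text (not constant time per candidate).
        count = sum(
            matches_at(text, pattern, i) for i in range(len(text) - length + 1)
        )
        if count >= min_occurrences:
            frequent[pattern] = count
    return frequent


if __name__ == "__main__":
    print(naive_frequent_patterns("abab", 2, 2))
```

With a two-symbol alphabet and L = 2 the loop visits (2 + 1)^2 = 9 candidates. The count grows with the alphabet size, whereas the bound stated in the abstract, O(2^L n), does not depend on |Σ|.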

    Item Type: Article
    Uncontrolled Keywords: Motif inference, Pattern with don't care, Partial order set, Doubling algorithm
    Subjects: Area01 - Scienze matematiche e informatiche > INF/01 - Informatica
    Divisions: Dipartimenti (until 2012) > DIPARTIMENTO DI INFORMATICA
    Depositing User: Dr. Nadia Pisanti
    Date Deposited: 02 Feb 2010
    Last Modified: 20 Dec 2010 11:49
    URI: http://eprints.adm.unipi.it/id/eprint/644
