New Algorithms for Finding Monad Patterns in

Recommend Documents

Faster Algorithms for Finding Missing Patterns

Faster Algorithms for Finding Missing Patterns. Shuai Cheng Li. School of Computer Science. University of Waterloo. Waterloo ON N2L 3G1 Canada.

Algorithms for Finding Lines

First, there are image-space algorithms that render something (such as a ... extract lines by doing some sort of image processing on the framebuffer. (for simple ...

Efficient Algorithms for Finding Submasses in

a substring with mass M. The right pointer is moved if the mass of the ..... mic complexity of protein identification: Combinatorics of weighted strings. DAM (2004) ...

Comparison of two optical cluster finding algorithms for the new ...

Feb 2, 2008 - Donahue et al. (2001, 2002) ..... Scott) algorithm described by Peebles (1980). The number .... sponds to R = 3 and nâ = 80 to R = 4. Such rich ...

Algorithms for discovering repeated patterns in ... - CiteSeerX

May 23, 2001 - However, there are certain types of interesting musical repetitions that ... be used to build useful software tools for music analysts and composers. ..... w of a string x is called a seed of x if x is a factor of a string z which is c

Efficient Algorithms for Finding Critical Subgraphs

La publication de ces rapports de recherche bÃ©nÃ©ficie d'une subvention du ..... to the initially empty collection U. The algorithm then calls an exact ..... coloring c1 has a set U(c1) containing the three edges forming a triangle in the center of.

Lower Bounds for Distributed Maximum-Finding Algorithms

building block in more complex algorithms, is that of finding the maximum of a ...... HIRSCHBERG, D S., AND SINCLAIR, J.B. Decentrahzed extrema-findlng in ...

Lower Bounds for Distributed Maximum-Finding Algorithms

needed to find the maximum label in a circular configuration of n labeled ... building block in more complex algorithms, is that of finding the maximum of a.

Performance Comparison of Algorithms for Finding ... - Washington

Performance Comparison of Algorithms for Finding Transcription Factor Binding Sites. Saurabh Sinha*. Martin Tompa. Department of Computer Science and ...

Algorithms and Orders for Finding Noncommutative ... - CiteSeerX

Dr. Ed Green who introduced me to my research problem, and Dr. Lenny Heath ... Siva Challa, Susan Keenan, Mani Mukherji, Greg Lavender and Ernie Page.

Differential Evolution Algorithms for Finding Predictive ... - CiteSeerX

computational methods to support this expansion are needed. To understand a biological processes that a living cell undergoes, one has to measure the gene ...

Simple, efficient maxima-finding algorithms for multidimensional

Oct 8, 2009 - n vs log n effect (reflecting dependence or independence) on random Cartesian trees. Since the maximal points can be very abundant with ...

Design Patterns in Beeping Algorithms

Jul 11, 2016 - A. Casteigts, Y. MÃ©tivier, J.M. Robson and A. Zemmari. UniversitÃ© de Bordeaux - Bordeaux INP. LaBRI UMR CNRS 5800. 351 cours de la ...

Fast Algorithms for Finding O(Congestion + Dilation)

Fast Algorithms for Finding O(Congestion+Dilation). Packet Routing Schedules. F. T. Leighton. Bruce M. Maggs. Andr ea W. Richa. July 1996. CMU-CS-96-152.

Approximation algorithms for finding and ... - Semantic Scholar

approximation results for independent sets in UDGs to co-k-plexes, and settle a recent ... wireless nodes with associated disk centers cv for each v â V located in the Euclidean plane. Under ..... W.H. Freeman and Company, New York (1979).

ALGORITHMS FOR FINDING GLOBAL MINIMIZERS OF IMAGE ...

gradient descent, and are therefore prone to getting stuck in such local minima. This ... makes the initial guess for gradient descentâbased algorithms sometimes ...

Space-Economical Algorithms for Finding Maximal

the parentheses representation of the tree topology of the suffix tree of a .... order of the edges from the node, and write a close parenthesis. Figure 1 shows the ...

Improved algorithms for finding length-bounded ... - ScienceDirect.com

paths P0, P1,..., Pkâ1 such that for each i, 0 â©½ i â©½ k â 1, Pi is a path from si to ti and the length of Pi is a

Robust algorithms for direction-finding in the presence of ... - CiteSeerX

ferred to as multidimenrional MUSIC) [13], and weighted subspace fitting (WSF) [14, 151. ..... IEEE ICASSP, Glasgow, Scotland, May. 1989. A. Swindlehurst, B.

Two Algorithms for Root Finding in Exact Real ... - Semantic Scholar

proposed a representation of computable real numbers by redundant continued ... base interval 0;1] in the one-point compacti cation R of the real line R. A.

Clique algorithms for finding substructures in generalized ... - CiteSeerX

A (finite) generalized quadrangle (GQ) is an incidence structure S = (P, B, I), in which .... partial clique in the automorphism group of the quadrangle, it suffices to.

Fast Algorithms for Finding Matchings in Lopsided ... - Nikhil Devanur

Jun 7, 2010 - fixed number of impressions and is responsible for making sure that these campaign ..... where φt depends on Ht such that φ ˙m is equal to the fail- ure probability of ..... network flow. SIAM J. Comput., 23(5):906–933, 1994.

Efficient Algorithms for Finding Submasses in ... - Semantic Scholar

Query Problem and the Submass Witness Problem. The storage space required is proportional to Ï(s), the number of different submasses of string s. For the ...

Finding Patterns In Given Intervals - Semantic Scholar

Maxime Crochemore1,2,4, Costas S. Iliopoulos1,3,â, and M. Sohel. Rahman1 ... Supported by EPSRC and Royal Society grants. ... Commonwealth Scholarship and Fellowship Plan (CSFP). ... substring of T if T = uwv for u, v â Î£â; in this case, the s

New Algorithms for Finding Monad Patterns in

Download PDF

0 downloads 0 Views 237KB Size Report

Comment

can be applied to transform a multi-ad problem into a monad problem that is ... signal in a sample of t = 20 sequences, each 600 nucleotides long, each ... Observation 1: For each l-gram, it is sufficient to search the consistent pat- .... Now, let us assume patterns Li and Lj mismatch with ... We do not want to evaluate all the.

New Algorithms for Finding Monad Patterns in DNA Sequences? Ravi Vijaya Satya and Amar Mukherjee [rvijaya,amar]@cs.ucf.edu School of Computer Science, University of Central Florida Orlando, FL USA 32816-2362

Abstract. In this paper, we present two new algorithms for discovering monad patterns in DNA sequences. Monad patterns are of the form (l,d )k, where l is the length of the pattern, d is the maximum number of mismatches allowed, and k is the minimum number of times the pattern is repeated in the given sample. The time-complexity of some of the best known algorithms to date is O(nt2 ld σ d ), where t is the number P of input sequences, n is the length of each input sequence, and σ = | | is the size of the alphabet. The first algorithm that we present in this paper takes d d d O(n2 t2 l 2 ) time and O(ntl 2 σ 2 ) space, and the second algorithm takes d d d 3 3 d O(n t l 2 σ 2 ) time using O(l 2 σ 2 ) space. In practice, our algorithms have much better performance provided the d/l ratio is small. The second algorithm performs very well even for large values l and d as long as the d/l ratio is small.

1

Introduction

Discovering regulatory patterns in DNA sequences is a well known problem in computational biology. Due to mutations and other errors, the actual occurrences of these regulatory patterns allow for a certain degree of error. There fore, the actual regulatory pattern (or the consensus pattern) may never appear in a gene upstream region, but d-mismatch occurrences of this pattern might appear. The general approach to this problem is to take a set of t DNA sequences each of length n, at least k of which are guaranteed to contain the desired binding site, and look for patterns of a certain length l that occur in at least k out of the t sequences with at most d mismatches at each occurrence. The values of l, d and k can be determined either from prior knowledge about the binding site, or by trial and error, trying different values of l and d. These single contiguous blocks of patterns are called monad patterns. In general, many regulatory signals are made up of a group of monad patterns occurring within a certain distance form each other [EskKGP03, EskP02, GuhS01, vanRC00]. In such a case, the patterns are called dyad, triad, multi-ad, or in general as composite patterns. Finding the composite patterns by finding the component monad patterns individually is significantly more difficult, since ?

This research was partially supported by NSF grant number: ITR-0312724.

D( L

j ,P

)