ECS 224: String Algorithms & Applications in Computational Biology

ECS 224
String Algorithms & Applications in Computational Biology
Effective Term
2016 Spring Quarter
Learning Activities
Lecture - 3.0 hours
Discussion - 1.0 hours
Algorithms that operate on strings. Pattern matching, sets of patterns, regular expression pattern matching, suffix trees and applications, inexact similarity, parametric sequence alignment, applications to DNA sequencing and protein database searching.
ECS 122A
Enrollment Restrictions
Pass One and Pass Two open to Graduate Students in Computer Science only.

Summary of Course Content
I. Exact matching using the Z and the Boyer-Moore algorithms

II. Suffix trees and linear time algorithms for computing them

III. Applications of suffix trees in pattern matching and computational biology

IV. Inexact matching, edit distance and alignment

V. Accelerations for inexact matching

VI. Additional alignment models for biological applications of inexact matching. Gap questions.

VII. Hybrid dynamic programming -- integrating suffix trees with inexact matching

VIII. Database searching

IX. Applications of string algorithms in DNA sequencing problems

X. String algorithms in phylogenetic reconstruction


The project involves some computer implementation of an important algorithm or the theoretical development of some new method.

Illustrative Reading
D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge Press, 1997

Potential Course Overlap
There is no significant overlap with any other course.

Course Category