Title

Algorithms for Data Science

Effective Term

2022 Fall Quarter

Learning Activities

Lecture: 3 hours

Discussion: 1 hour

Description

Algorithms for searching, pattern matching, combinatorial problems, clustering, and time series analysis with practical emphasis. Not open for credit for students who have completed ECS 122A. GE Prior to Fall 2011: SciEng. GE: SE.

Prerequisites

ECS 017 or ECS 020; ECS 032B or ECS 036C

Credit Limitation

Not open for credit for students who have completed ECS 122A

Enrollment Restrictions

Not intended for Computer Science or Computer Science & Engineering majors.

**Summary of Course Content:**

- Searching, pattern matching, hashing.
- Algorithm design techniques and applications: divide-and-conquer, greedy, dynamic programming.
- Algorithm efficiency, complexity, survey of intractability and NP-completeness.
- Selected Data Science topics: clustering, text processing, time series analysis.

*Goals*: Students will: (1) learn methods for designing efficient algorithms and evaluating their performance, (2) learn standard algorithms for fundamental problems, and (3) gain experience applying algorithms to common scenarios in data science.

**Computer Usage/Programming homeworks:**

Each homework includes problems on the basic mathematical and algorithmic concepts and techniques discussed in class. Students will use R or Python, and associated data processing libraries, as a problem-solving environment for case studies drawn from common data science scenarios. Through these case studies, students will develop an appreciation for the connection between algorithm theory and data science problems and solutions. The programming homeworks will also strengthen students’ R/Python skills, which will be useful in a data scientist’s career.

**Engineering Design Statement:**

Written assignments often (25-50% of the time) contain problems involving the design and analysis of algorithms for a particular type of computer system. Students are given a general description of the problem to be solved and the general parameters of the computer system to be used to solve it. They are then expected to design a detailed solution consisting of both standard routines and new routines of their own design. They are also expected to justify the correctness of their solution and to analyze its expected performance. Examination questions also test the design and analysis techniques learned through the homework assignments.

**Illustrative Reading:**

T. Cormen, C. Leiserson, R. Rivest, and C. Stein. *Introduction to Algorithms*, 3rd edition.

**Potential Course Overlap:**

Clustering is also covered in ECS 111 and ECS 170/171, but more theoretically, while in this course the emphasis will be more practical. Algorithms are also covered in ECS 122A, but from a theoretical perspective, while in this course the emphasis will be on practical approaches and exercises.

**Final Exam:**

Yes

## Course Category