Show/hide main menu

Modules

6CCS3TSP Text Searching and Processing

Credit value: 15

Lecturer: Professor Costas Iliopoulos and Dr Solon Pissis (office hours)
Semester: 2
Teaching pattern: weekly 3-hour lecture
Assessment: 100% written examination, 2 hours (Marking Model 2 - Double Marking)

Learning aims & outcomes

This unit is devoted to algorithms processing strings and texts efficiently. These types of algorithms are used for software design in the domains of operating systems utilities, search engines on the Internet, data retrieval systems, analysis of genetic sequences, and natural language processing, for example.

On completing the module you will have the basic tools to design exact and efficient algorithms for matching patterns in textual data and building indexes on files.

Syllabus
Introduction:
  • Basic concepts
  • Periods in strings
  • Finite automata and regular expressions
Exact pattern matching:
  • Brute-force algorithms for pattern matching
  • The Knuth-Morris-Pratt algorithm
  • The Boyer-Moore algorithm
  • The Karp-Rabin algorithm
  • Horspool's algorithm
Multiple pattern matching:
  • The Aho-Corasick automaton
Structures for indexes:
  • Suffix arrays
  • Suffix trees 
Suggested reading/resources

 


07 September 2017
Sitemap Site help Terms and conditions  Privacy policy  Accessibility  Modern slavery statement  Contact us

© 2018 King's College London | Strand | London WC2R 2LS | England | United Kingdom | Tel +44 (0)20 7836 5454