Description

CS 460/594 Information Storage & Retrieval
(3 hours credit)

This course focuses on traditional and evolving technologies for storing and retrieving both full-text and multimedia content from large databases. Particular emphasis is given to the data structures and algorithms needed to build efficient search engines and interfaces for the World Wide Web (WWW). Topics covered include data compression, query and file structures, vector space models, performance evaluation, and security. Students will participate in a class project involving both the creation and management (including indexing and updating) of a large document collection on the WWW.

Prerequisites:

CS 302 and CS 360
or permission of the instructor (contact M. Berry).

Graduate Credit:

Graduate students may register for this course as CS 594 with the understanding that additional work (on homework, exams, or class project) may be assigned to 594 students.

Film Schedule:

Three films from the PBS series Nerds 2.0.1: A Brief History of the Internet will be shown during the semester. The schedule and location of these films (shown during regular class periods, 8:10-9:25 a.m.) are listed below. Attendance is mandatory.

Title                          Date     Location
Vol. I: Networking The Nerds   Feb. 11  Hodges Lib. Aud.
Vol. II: Serving The Suits     Mar. 25  Hodges Lib. Aud.
Vol. III: Wiring The World     Apr. 22  Hodges Lib. Aud.

Textbook(s):

Information Storage and Retrieval (First Edition) by Robert R. Korfhage, Wiley Computer Publishing, © 1997, Winner of Best Information Science Book of 1998 Award from the American Society for Information Science.

Understanding Search Engines: Mathematical Modeling and Text Retrieval by Michael W. Berry and Murray Browne, SIAM Book Series: Software, Environments, and Tools, Philadelphia, PA, © 1999 (in press).

List of Topics:

Overview/Introduction (K,B&B)
Document and Query Forms (K)
Document File Preparation (B&B)
Query Structures (K)
Query Matching (K)
Text Analysis (K)
Vector Space Models (B&B)
Matrix Decompositions (B&B)
Cluster Analysis (K)
Ranking and Relevance (B&B)
Effectiveness Measures (K)
User Interfaces (B&B)
 
(K=Korfhage, B&B=Berry & Browne)