Table of Contents
- Preface
- 1 Introduction
- 1.1 Document File Preparation
- 1.1.1 Manual Indexing
- 1.1.2 File Cleanup
- 1.2 Information Extraction
- 1.3 Vector Space Modeling
- 1.4 Matrix Decompositions
- 1.5 Query Representations
- 1.6 Ranking and Relevance Feedback
- 1.7 User Interface
- 1.8 Course Project
- 1.9 Final Comments
- 2 Document File Preparation
- 2.1 Document Purification and Analysis
- 2.1.1 Text Formatting
- 2.1.2 Validation
- 2.2 Manual Indexing
- 2.3 Automatic Indexing
- 2.4 Item Normalization
- 2.5 Inverted File Structures
- 2.5.1 Document File
- 2.5.2 Dictionary List
- 2.5.3 Inversion List
- 2.5.4 Other File Structures
- 3 Vector Space Models
- 3.1 Construction
- 3.1.1 Term-by-Document Matrices
- 3.1.2 Simple Query Matching
- 3.2 Design Issues
- 3.2.1 Term Weighting
- 3.2.2 Sparse Matrix Storage
- Compressed Row Storage (CRS)
- Compressed Column Storage (CCS)
- 3.2.3 Low-Rank Approximations
- 4 Matrix Decompositions
- 4.1 QR Factorization
- 4.2 Singular Value Decomposition
- 4.2.1 Low-Rank Approximations
- 4.2.2 Query Matching
- 4.2.3 Software
- 4.3 Semi-Discrete Decomposition
- 4.4 Updating Techniques
- 5 Query Management
- 5.1 Query Binding
- 5.2 Types of Queries
- 5.2.1 Boolean Queries
- 5.2.2 Natural Language Queries
- 5.2.3 Thesaurus Queries
- 5.2.4 Fuzzy Queries
- 5.2.5 Term Searches
- 5.2.6 Probabilistic Queries
- 6 Ranking and Relevance Feedback
- 6.1 Performance Evaluation
- 6.1.1 Precision
- 6.1.2 Recall
- 6.1.3 Average Precision
- 6.1.4 Genetic Algorithms
- 6.2 Relevance Feedback
- 7 User Interface Considerations
- 7.1 Guidelines
- 7.1.1 Interface Tools
- 7.1.2 Progress Indication
- 7.1.3 No Penalties for Error
- 7.1.4 Form Filling
- 7.1.5 Display Considerations
- 7.1.6 No Anthropomorphic References
- 7.1.7 Test and Re-Test
- 7.2 Interfaces for Search Engines
- 7.2.1 User's Control
- 7.2.2 Informing the User
- 7.3 In Practice
- 8 A Course Project
- 8.1 Project Approach
- 8.2 Time Frame
- 8.3 File Preparation
- 8.4 Different Term-by-Document Matrices
- 8.5 Query Processing
- 8.5.1 Ranking
- 8.5.2 Relevance Feedback
- 8.6 The User Interface
- 8.7 Project Evaluation
- 9 Further Reading
- 9.1 General Textbooks on IR
- 9.2 Computational Methods and Software
- 9.3 Search Engines
- 9.4 User Interfaces
- References