LSI [BDO95] exploits the factorization of the
matrix A into the product of 3 matrices
using the singular value decomposition (SVD).
Given an
matrix A, where
and
rank(A) = r, the singular value decomposition of A is defined as
where
and
. The first r columns of the orthogonal matrices
U and V define the orthonormal eigenvectors associated with the r nonzero
eigenvalues of
and
, respectively. The columns of
U and V are referred to as the left and right singular vectors,
respectively, and
the singular values of A are the diagonal elements of
or
the nonnegative square roots of the n eigenvalues of
[GL89].
As defined by Equation (3), the SVD is used to represent the original relationships among terms and documents as sets of linearly-independent vectors or factor values. Using k factors or the k-largest singular values and corresponding singular vectors one can encode (see [BDO95]) the original term-by-document matrix as a smaller (and more reliable) collection of vectors in k-space for conceptual query processing.