Figure 10 reports the average number of seconds required to search for both terms and documents in each of the four document collections using lsiQuery and lsiFinder. Because of memory constraints, lsiFinder was unable to search for terms in the USENET document collection.
Although the logarithmic scale of the graph conceals much
of the true difference in search time between lsiQuery and
lsiFinder, lsiQuery outperformed
lsiFinder on every document collection tested,
especially the larger ones. Significant time
differences (as much as an average of
seconds when searching
for related documents in the USENET collection) were
observed for the CCE and USENET collections. Greater
time differences are expected for even larger document collections.
Empirical data suggests that as much as
of the total
time required by lsiFinder to process a query was
spent in one program called by mlsisearch, syn.
Syn, which computes the cosine between the query vector and each term
and document vector, does not free or reuse memory that it has already
allocated; instead, it allocates new memory for each
vector as it is loaded. Not only does this continual
allocation slow syn's execution, it also
prevents large document collections (such as the USENET
document collection) from being searched.
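
The difference between allocating per vector and reusing memory can be illustrated with a minimal sketch. It is not the code of syn or lsiQuery. The function names `cosine` and `score_stream`, and the assumed input format (vectors stored back to back as native double-precision values), are hypothetical and chosen only for illustration. Each stored vector is read into one preallocated buffer, scored against the query, and then overwritten by the next vector, so memory use stays constant regardless of collection size.

```python
import io
import math
from array import array


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_stream(query, stream):
    """Score every vector in a binary stream against the query.

    Assumption (hypothetical format): the stream holds vectors of
    len(query) native doubles, stored one after another.
    """
    dim = len(query)
    buf = array("d", [0.0] * dim)      # allocated once, reused for every vector
    expected = buf.itemsize * dim
    scores = []
    while True:
        n = stream.readinto(buf)       # overwrite the same buffer in place
        if not n:                      # end of stream
            break
        if n != expected:
            raise ValueError("truncated vector in stream")
        scores.append(cosine(query, buf))
    return scores


if __name__ == "__main__":
    query = array("d", [1.0, 0.0])
    stored = array("d", [1.0, 0.0, 0.0, 2.0, 1.0, 1.0]).tobytes()
    for i, score in enumerate(score_stream(query, io.BytesIO(stored))):
        print(i, round(score, 4))
```

Only the scores accumulate here, one number per vector. The vectors themselves never coexist in memory, which is the property that the per-vector allocation described above lacks.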