
5.3.2 LsiQuery versus LsiFinder

Figure 10 reports the average number of seconds required to search for both terms and documents in each of the four document collections using lsiQuery and lsiFinder. Because of memory constraints, lsiFinder was unable to search for terms in the USENET document collection.

 Figure 10

Although the logarithmic scale of the graph conceals much of the true time difference between lsiQuery and lsiFinder, lsiQuery outperformed lsiFinder on every document collection tested, especially on the larger collections. Significant time differences (as much as an average of seconds when searching for related documents in the USENET collection) were observed for the CCE and USENET collections. Greater time differences are expected for even larger document collections.

Empirical data suggest that as much as of the total time required by lsiFinder to process a query was spent in syn, a program called by mlsisearch. Syn computes the cosine between the query vector and each term and document vector. It does not free or reuse memory that has already been allocated; instead, it allocates new memory for each vector as it is loaded. This continual allocation not only slows syn's execution but also prevents large document collections (such as the USENET document collection) from being searched.
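The alternative to per-vector allocation can be illustrated with a short sketch. It is not the code of syn or lsiQuery; it only shows the general idea of loading every vector into one preallocated buffer and computing its cosine with the query before the buffer is overwritten. The function names (cosine, pack_vectors, rank_by_cosine) and the storage layout (consecutive vectors of k double-precision values) are assumptions made for the example.

```python
import io
import math
from array import array


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def pack_vectors(vectors):
    """Hypothetical storage layout: vectors stored back to back as doubles."""
    return io.BytesIO(b"".join(array("d", v).tobytes() for v in vectors))


def rank_by_cosine(query, stream, k):
    """Score every k-dimensional vector in stream against query.

    A single buffer is allocated once and refilled for each vector,
    so memory use does not grow with the number of vectors read.
    """
    buf = array("d", [0.0]) * k          # the only vector buffer allocated
    expected = k * buf.itemsize
    scores = []
    index = 0
    while True:
        nread = stream.readinto(buf)     # overwrite the buffer in place
        if nread == 0:
            break                        # end of stream
        if nread != expected:
            raise ValueError("truncated vector in stream")
        scores.append((cosine(query, buf), index))
        index += 1
    scores.sort(reverse=True)            # highest cosine first
    return scores


if __name__ == "__main__":
    stream = pack_vectors([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    for score, idx in rank_by_cosine([1.0, 0.0], stream, 2):
        print(idx, score)
```

Only the scores and indices are retained, so the working set is one vector regardless of collection size. Whether this matches the approach taken in lsiQuery is not stated in this section.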



Michael W. Berry (berry@cs.utk.edu)
Tue Jul 23 08:47:48 EDT 1996