Inverse Document Frequency and Web Search Engines

Prey, Kevin; French, James; Powell, Allison; Viles, Charles
Format
Report
Abstract
Full text searching over a database of moderate size often uses the inverse document frequency, idf = log(N/df), as a component in term weighting functions used for document indexing and retrieval. However, in very large databases (e.g. internet search engines), there is the potential that the collection size (N) could dominate the idf value, decreasing the usefulness of idf as a term weighting component. In this short paper we examine the properties of idf in the context of internet search engines. The observed idf values may also shed light upon the indexed content of the WWW. For example, if the internet search engines we survey index random samples of the WWW, we would expect similar idf values for the same term across the different search engines.
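To make the concern in the abstract concrete, the sketch below evaluates idf = log(N/df) for two terms at several collection sizes. It is an illustration only and is not taken from the report: the function names `idf` and `idf_ratio`, the base-10 logarithm, and all collection sizes and document frequencies are assumptions chosen for readability.

```python
import math


def idf(collection_size: int, doc_freq: int) -> float:
    """Return idf = log(N / df).

    Base 10 is an assumption; the abstract does not state a log base.
    """
    if not 0 < doc_freq <= collection_size:
        raise ValueError("need 0 < df <= N")
    return math.log10(collection_size / doc_freq)


def idf_ratio(collection_size: int, df_rare: int, df_common: int) -> float:
    """Ratio of the rarer term's idf to the more common term's idf at a given N.

    Undefined (division by zero) when df_common equals N, since that idf is 0.
    """
    return idf(collection_size, df_rare) / idf(collection_size, df_common)


# Hypothetical collection sizes and document frequencies, for illustration only.
for n in (10**4, 10**6, 10**9):
    rare = idf(n, 10)       # term appearing in 10 documents
    common = idf(n, 1000)   # term appearing in 1000 documents
    print(n, round(rare, 3), round(common, 3), round(idf_ratio(n, 10, 1000), 3))
```

Because idf = log N - log df, the difference between the two idf values stays fixed at log(1000/10) regardless of N, while their ratio moves toward 1 as N grows. That is one way to read the abstract's remark that the collection size could dominate the idf value.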
Language
English
Date Received
2012-10-29
Published
University of Virginia, Department of Computer Science, 2001
Published Date
2001
Collection
Libra Open Repository
In Copyright