TF-IDF

Definitions

noun (information retrieval) A mathematical approximation to the importance of a particular word in a given piece of text.

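Abbreviation of "term frequency–inverse document frequency": the score is the product of a term's frequency in a document and its inverse document frequency across the collection.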
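
As a rough illustration of the definition, the sketch below scores each term as (term count / document length) * log(N / document frequency). This unsmoothed weighting is only one of several common variants and is an assumption here, not something the entry specifies. The function name `tf_idf` and the toy search data are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document in docs."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Term frequency times inverse document frequency (unsmoothed variant).
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy data: one "document" of search terms per year.
searches = [
    "weather news maps weather".split(),
    "weather news maps bebo bebo".split(),
]
scores = tf_idf(searches)
top_term = max(scores[1], key=scores[1].get)
```

In this toy data, terms that occur in both years get a score of zero because log(2/2) = 0, so the term that appears only in the second year comes out on top.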

If something like “bebo” was never searched for before this year and then reasonably high in the top searches this year, TF-IDF would rank it very highly.

Google Top Searches: Based on Everything and Nothing Michael Arrington 2005
If you have a collection of documents (or popular search terms for different years), TF-IDF gives you the terms that help to best differentiate one document from the rest.

Google Top Searches: Based on Everything and Nothing Michael Arrington 2005
This looks a lot like TF-IDF which is commonly used in data mining to uncover terms that help to best differentiate one “document” from another, which seems to me to be what something like Zeitgeist should be going for.

Google Top Searches: Based on Everything and Nothing Michael Arrington 2005
Search Basics • Goal: Identify documents that are similar to input query • Lucene uses a modified Vector Space Model (VSM) - Boolean + VSM - TF-IDF - The words in the document and the query each define a Vector in an n-dimensional space - Sim(q1, d1) = cos Θ - In Lucene, boolean approach restricts what documents to score - w = weight assigned to term

Recently Uploaded Slideshows 2009
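
The vector space idea in the slide above, Sim(q1, d1) = cos Θ, can be sketched as the cosine of the angle between two term-weight vectors. This is a minimal illustration of plain cosine similarity, not Lucene's actual scoring formula. The function name, the sparse dict representation, and the sample weights are all assumptions made for the example.

```python
import math


def cosine_similarity(q, d):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * d.get(term, 0.0) for term, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_q * norm_d)


# Hypothetical term weights (in practice these could be TF-IDF scores).
q1 = {"lucene": 1.0, "search": 1.0}
d1 = {"lucene": 2.0, "search": 2.0}
d2 = {"restaurant": 3.0}

sim_same_direction = cosine_similarity(q1, d1)
sim_no_overlap = cosine_similarity(q1, d2)
```

Vectors that point in the same direction score close to 1 regardless of length, while vectors with no terms in common score 0.
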
Traditional web page search does IR / TF-IDF / page rank stuff pretty well on the Web at large, but if you want to do a specific type of search, for restaurants, images, etc., web search isn't necessarily the best option.

Recently Uploaded Slideshows 2009