Definitions
from Wiktionary, Creative Commons Attribution/Share-Alike License.
- noun (information retrieval) A mathematical approximation to the
importance of a particular word in a given piece of text.
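The definition above does not give a formula. As a sketch only, one common textbook formulation (several variants exist, and the symbols here are assumptions, not part of the Wiktionary entry) multiplies a term's frequency in a document by the log of its inverse document frequency across the collection:

```latex
% Assumed notation: f_{t,d} = count of term t in document d,
% D = the collection of documents, N = |D|.
\[
\mathrm{tf}(t,d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t,D) = \log\frac{N}{\lvert\{\, d \in D : t \in d \,\}\rvert}
\]
\[
\mathrm{tfidf}(t,d,D) = \mathrm{tf}(t,d)\cdot \mathrm{idf}(t,D)
\]
```

Under this variant, a term that appears in every document has an idf of log 1 = 0, so it contributes nothing to the score no matter how often it occurs.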
Examples
-
If something like “bebo” was never searched for before this year and then reasonably high in the top searches this year, TF-IDF would rank it very highly.
Google Top Searches: Based on Everything and Nothing Michael Arrington 2005
-
If you have a collection of documents (or popular search terms for different years), TF-IDF gives you the terms that help to best differentiate one document from the rest.
Google Top Searches: Based on Everything and Nothing Michael Arrington 2005
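The two quotations above treat each year's search terms as one "document" and use TF-IDF to find the terms that set one year apart. A minimal sketch of that idea follows, assuming the common tf * log(N / df) weighting. The function name `tf_idf` and the search lists are hypothetical and made up for illustration; they are not real search data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_name: {term: score}} using tf * log(N / df).

    `docs` maps a document name to its list of terms.
    """
    n_docs = len(docs)
    counts = {name: Counter(terms) for name, terms in docs.items()}

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for c in counts.values():
        df.update(c.keys())

    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            term: (freq / total) * math.log(n_docs / df[term])
            for term, freq in c.items()
        }
    return scores


# Hypothetical top-search lists, one "document" per year.
searches_by_year = {
    "2003": ["weather", "maps", "news", "weather"],
    "2004": ["weather", "maps", "news", "maps"],
    "2005": ["weather", "maps", "bebo", "bebo"],
}

scores = tf_idf(searches_by_year)
# The highest-scoring term for 2005 is the one unique to that year.
top_2005 = max(scores["2005"], key=scores["2005"].get)
print(top_2005)
```

Terms searched in every year, such as "weather" in this made-up data, score zero, while a term that appears in only one year rises to the top for that year.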
-
This looks a lot like TF-IDF which is commonly used in data mining to uncover terms that help to best differentiate one “document” from another, which seems to me to be what something like Zeitgeist should be going for.
Google Top Searches: Based on Everything and Nothing Michael Arrington 2005
-
Search Basics • Goal: Identify documents that are similar to the input query • Lucene uses a modified Vector Space Model (VSM): Boolean + VSM, with TF-IDF weighting • The words in the document and the query each define a vector (dj for a document, q for the query) in an n-dimensional space, where w = weight assigned to term • Sim(q1, d1) = cos Θ, where Θ is the angle between the query vector q1 and the document vector d1 • In Lucene, the boolean approach restricts what documents to score
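The cosine similarity named in the example above can be sketched as follows. This is an illustration of the plain vector space model only, not Lucene's actual scoring code; the function name `cosine_similarity` and the term weights are hypothetical.

```python
import math


def cosine_similarity(q, d):
    """Cosine of the angle between two sparse term-weight vectors.

    Each vector is a dict mapping a term to its weight (for example,
    a TF-IDF score); terms missing from a dict have weight 0.
    """
    dot = sum(w * d.get(term, 0.0) for term, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # An all-zero vector has no direction to compare.
    return dot / (norm_q * norm_d)


# Hypothetical weights for a query and two documents.
query = {"lucene": 1.0, "search": 1.0}
doc_a = {"lucene": 2.0, "search": 2.0}  # same direction as the query
doc_b = {"cooking": 3.0}                # shares no terms with the query

sim_a = cosine_similarity(query, doc_a)
sim_b = cosine_similarity(query, doc_b)
print(sim_a, sim_b)
```

Because the measure depends only on the angle, doc_a scores close to 1 even though its weights are twice the query's, and doc_b scores 0 because it shares no terms with the query.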
-
Traditional web page search does IR / TF-IDF / page rank stuff pretty well on the Web at large, but if you want to do a specific type of search, for restaurants, images, etc., web search isn't necessarily the best option.