TF-IDF: What Kind of Beast Is That?
For a human brain, it doesn’t take any math to tell what my article is about. It’s about TF-IDF, right? But when relevancy is evaluated (and, most importantly, compared for several articles) by a machine, we need a numeric representation to see that:
- Article A is about TF-IDF (as opposed to, say, link building).
- Article A is more about TF-IDF than article B.
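TF-IDF (term frequency-inverse document frequency) provides that number by multiplying two measures:
- Term Frequency = (count of the term) / (total word count in the document)
- Inverse Document Frequency = log((number of docs) / (docs containing keyword))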
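As a rough illustration of the two formulas above, here is a minimal Python sketch. The tokenizer, the natural-log base, and the three sample documents are my own assumptions for the example, not something any search engine or SEO tool is known to use.

```python
import math

PUNCTUATION = ".,!?;:()\"'"

def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on whitespace, trim punctuation.
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def term_frequency(term, tokens):
    # (count of the term) / (total word count in the document)
    return tokens.count(term) / len(tokens) if tokens else 0.0

def inverse_document_frequency(term, corpus):
    # log((number of docs) / (docs containing the term)); 0 if no doc contains it.
    containing = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / containing) if containing else 0.0

def tf_idf(term, tokens, corpus):
    return term_frequency(term, tokens) * inverse_document_frequency(term, corpus)

# Made-up sample documents, for illustration only.
docs = [
    "TF-IDF helps measure how relevant a term is to a document",
    "Link building is a classic SEO tactic",
    "A document about TF-IDF mentions TF-IDF often",
]
corpus = [tokenize(d) for d in docs]

for text, tokens in zip(docs, corpus):
    print(f"{tf_idf('tf-idf', tokens, corpus):.4f}  {text}")
```

The third document scores highest for "tf-idf" because the term takes up a larger share of its words, while a word that appears in every document (such as "a") gets an IDF of zero and drops out.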
Is Google Using TF-IDF as a Ranking Signal?
The short answer is “no.” TF-IDF is referred to in a number of Google patents as something the search engine may use for stop-word removal, that is, for stripping function words out of search queries and page content. But using this exact mechanism to identify and compare relevancy is very unlikely, simply because TF-IDF, being a lexical search mechanism, is unable to look beyond keywords. The model treats keywords as strings of characters and cannot identify semantic relations between them, unlike the semantic search models most probably used by Google. In other words, TF-IDF itself is not a ranking signal that determines your page’s position. There’s no expected TF-IDF value you need to match for each keyword in your content. And you’d better run from anyone trying to convince you otherwise.
Semantic Search & Co-Occurrences
So, Google has moved to semantic search, trying to match the meaning of a search query to topically relevant content, as opposed to matching query keywords to the same keywords on pages. In practice, this means that instead of counting keywords themselves, Google started counting co-occurrences, using the surrounding context to understand their meaning. For example, let’s say you encounter the following sentences and you have no idea what a trout is:
- Trout is rich in omega-3 fatty acids.
- Trout has tender flesh and a mild, somewhat nutty flavor.
- When choosing trout we pay attention to a clear red-orange color.
- Salmon is a popular type of fish in Western cuisine, which goes well with white wine.
- Tender salmon meat can be added to pasta.
- Salmon skin is super nutrient-dense, so keep it while you cook.
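To make the idea of counting co-occurrences a bit more tangible, here is a toy Python sketch that collects the words appearing in the same sentence as a target term. The stop-word list and the sentence-level window are my assumptions; this is in no way a description of how Google actually does it.

```python
from collections import Counter

# Small hand-picked stop-word list (assumption, for this example only).
STOP_WORDS = {"a", "and", "be", "can", "has", "in", "is", "it", "of", "so",
              "the", "to", "we", "when", "which", "while", "with", "you"}

SENTENCES = [
    "Trout is rich in omega-3 fatty acids.",
    "Trout has tender flesh and a mild, somewhat nutty flavor.",
    "When choosing trout we pay attention to a clear red-orange color.",
    "Salmon is a popular type of fish in Western cuisine, which goes well with white wine.",
    "Tender salmon meat can be added to pasta.",
    "Salmon skin is super nutrient-dense, so keep it while you cook.",
]

def context_counts(target, sentences):
    """Count the non-stop words that share a sentence with `target`."""
    counts = Counter()
    for sentence in sentences:
        words = [w.strip(".,").lower() for w in sentence.split()]
        if target in words:
            counts.update(w for w in words
                          if w and w != target and w not in STOP_WORDS)
    return counts

trout_context = context_counts("trout", SENTENCES)
salmon_context = context_counts("salmon", SENTENCES)
shared = set(trout_context) & set(salmon_context)
print(sorted(shared))
```

With only six sentences the overlap is tiny, which is the point: context becomes a useful signal only when it is gathered from a lot of text.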
How Can TF-IDF Help Your SEO?
Finding co-occurring terms is exactly where TF-IDF comes into play. Sure, we don’t have access to every webpage, as Google does. But why would we need those? To get a whole list of co-occurrence ideas, it is perfectly enough to look at a bunch of pages (say 20 to 30). And the beauty is that using TF-IDF isn’t rocket science. All you have to do fits in three simple steps.
1. Write Your Content
I’m not urging you to make TF-IDF the purpose of your piece of content. In the end, unnatural writing simply won’t convert even if the page ranks high and brings in the needed traffic. So, first of all, you sit down and write about whatever it is that you have on your content plan.
2. Plug in a TF-IDF Tool
Most of the tools I’ve seen work pretty similarly. You enter a URL and the keywords you want to optimize it for. The tool then checks pages that rank on Google for that keyword, parses their content, calculates TF-IDF for all the terms it finds and compares your content stats to those of your competitors. With basic tools, like Seobility, you will get a single-keyword list. If you’re using SEO PowerSuite’s WebSite Auditor, Ryte or Text Tools, you will also have a list of key phrases (or N-grams, if you like a touch of science), which is definitely more informative. (Disclosure: I work for SEO PowerSuite.)
3. Enrich Your Content with TF-IDF Co-Occurrence Suggestions
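Before sifting through a tool’s suggestions, it may help to see roughly where they come from. The Python sketch below ranks words and two-word phrases that competitor pages use but your page lacks, by their average TF-IDF. Everything in it is an assumption made for illustration: the function names, the background pages used for IDF, the add-one smoothing, and the sample texts. Real tools don’t publish their formulas.

```python
import math
from collections import Counter

def terms(text, n_max=2):
    """Lowercase words plus word n-grams up to n_max (naive tokenizer)."""
    words = [w.strip(".,!?;:()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    out = []
    for n in range(1, n_max + 1):
        out.extend(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return out

def suggest_terms(my_text, competitor_texts, background_texts, top=5):
    """Rank terms competitors use but my_text lacks, by average TF-IDF.

    IDF is computed over competitor plus background pages with add-one
    smoothing (an assumption of this sketch).
    """
    competitor_docs = [terms(t) for t in competitor_texts]
    corpus = competitor_docs + [terms(t) for t in background_texts]
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count each term once per document

    mine = set(terms(my_text))
    scores = Counter()
    for doc in competitor_docs:
        if not doc:
            continue
        for term, count in Counter(doc).items():
            if term in mine:
                continue  # already covered on my page
            idf = math.log((1 + len(corpus)) / (1 + doc_freq[term]))
            scores[term] += (count / len(doc)) * idf / len(competitor_docs)
    return [term for term, score in scores.most_common(top) if score > 0]

# Made-up texts, for illustration only.
my_page = "Trout is a tasty fish"
competitors = [
    "Trout is rich in omega-3 fatty acids",
    "Grilled trout is rich in omega-3 and protein",
]
background = [
    "Link building is a classic SEO tactic",
    "Keyword research is the first step in SEO",
]
suggestions = suggest_terms(my_page, competitors, background, top=10)
print(suggestions)
```

The background pages are there so that terms shared by all competitors are not mistaken for everyday words: without some wider corpus, a term every competitor uses would get an IDF of zero.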
Some of the phrases will simply be synonymous with what you already have in your content. If appropriate, try using them along the way. Others will point to new topics that haven’t crossed your mind yet. Sift through the ideas and think of ways to use them in your content (without getting obsessed about them).
TF-IDF for Keyword Research
A little bonus tip. Picking up the most widely used terms from your competitors’ content might also spark new ideas for your keyword research and content planning, especially when you feel the need for out-of-the-box thinking and inspiration.
Conclusion
Many a time, you’ll see TF-IDF used as clickbait: articles either promising that the formula is “Google algorithm reverse-engineered” or “busting the myth of TF-IDF”. But I encourage you to take things for what they are and use the opportunities TF-IDF optimization gives, without betting your entire SEO campaign on it.
Image Credits
Featured Image: Created by author, October 2019
All screenshots taken by author, October 2019