Does SEO Work? We Analyzed 225 Million Keywords (In-depth study)

It's been a while since we last published some SEO data analysis; happy to see you around!

I was inspired by this case study, which pointed out that the higher-ranked pages in the SERPs are affected by SEO.

Many users of web search engines have been complaining in recent years about the supposedly decreasing quality of search results. This is often attributed to an increasing amount of search-engine optimized but low-quality content. 

- from the study, Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines

I teamed up with Jan Kulbiński from Bards.ai, a data scientist who has worked with me on the Topical Map tool for over a year now.

We know each other pretty well by now, and it took just a spark of an idea for us to recreate a similar analysis on a much larger sample.

So, do we, as the SEO community, have such a big impact on search engine results?

At Surfer, we have a database of SERPs that ranges up to 200 million for the US market alone.

A sample that large lets us test the theory that optimizing for search really influences page rankings, by checking whether the two are correlated.

However, the limited information in the SERP allowed us to analyze only a few factors:

  • Keyword in title
  • Keyword in description
  • Keyword in URL

The first one seemed really tempting. The keyword in the title is a super strong ranking factor — if a page has it, we can say it is optimized for search. 

But there is a problem with this factor, and with the keyword in the description: Google rewrites both too often.

We can’t use a sample that is manipulated because we do not know if the keyword in the title comes from Google’s change or if it was the intention of an SEO to add the keyword there to optimize the page for search engines. 

How to determine if a page was optimized for search engines

Remember the keyword in the URL?

Google can’t change this one, so the hypothesis goes like this:

The keywords in the URL indicate that someone with at least a basic understanding of SEO touched the page, even if that only meant changing the default URL structure in WordPress.

Data sample for the analysis

Total number of keywords: 225,618,248 (over two hundred and twenty-five million keywords)

Total number of affiliate-related keywords (best, vs, alternative, review, etc.): 8,351,235

Total number of keywords with monthly search volume > 1000: 15,950,992

Location: US

SERP Positions tracked: top 50
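To make the three segments concrete, here is a minimal sketch of how a keyword could be assigned to them. It is an illustration under assumptions, not the pipeline we ran: the modifier set only covers the words listed above plus their plurals (the "etc." is left out), and the function names and segment labels are made up for this example.

```python
import re

# Assumption: only the modifiers named in the article, plus their plurals.
AFFILIATE_MODIFIERS = {"best", "vs", "alternative", "alternatives", "review", "reviews"}


def is_affiliate(keyword):
    """True if any whole word of the keyword is an affiliate modifier."""
    tokens = re.findall(r"[a-z0-9]+", keyword.lower())
    return any(token in AFFILIATE_MODIFIERS for token in tokens)


def segments(keyword, monthly_volume):
    """Return the segment labels (hypothetical names) a keyword falls into."""
    labels = ["all"]
    if is_affiliate(keyword):
        labels.append("affiliate")
    if monthly_volume > 1000:  # strictly above 1000, as in the sample description
        labels.append("high_volume")
    return labels


print(segments("best running shoes", 5000))
print(segments("weather today", 1000))
```

Matching on whole words keeps a query like "bestiary list" out of the affiliate segment, which a plain substring check would not.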

We looked for exact matches of the keyword in the URL and for partial matches: words from the search query that appear in a different form but are closely related.
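The matching idea can be sketched roughly as follows. This is a simplified illustration, not the code used in the study. The shared-prefix test is a crude stand-in for "closely related" word forms, and the `min_share` threshold, the function names, and the choice to look at both the host and the path are all assumptions.

```python
import re
from urllib.parse import urlparse, unquote


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def related(a, b, min_prefix=4):
    """Crude relatedness test: identical, or sharing a prefix of min_prefix chars."""
    if a == b:
        return True
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common >= min_prefix


def url_match(keyword, url, min_share=0.5):
    """Classify a URL as an 'exact', 'partial' or 'none' match for a keyword."""
    parsed = urlparse(url)
    # Assumption: host and path both count; the query string is ignored.
    url_tokens = tokenize(unquote(parsed.netloc + " " + parsed.path))
    kw_tokens = tokenize(keyword)
    if not kw_tokens or not url_tokens:
        return "none"
    n = len(kw_tokens)
    # Exact: the keyword's words appear in the URL as one contiguous run.
    for i in range(len(url_tokens) - n + 1):
        if url_tokens[i:i + n] == kw_tokens:
            return "exact"
    # Partial: enough keyword words have a related word somewhere in the URL.
    hits = sum(1 for kw in kw_tokens if any(related(kw, u) for u in url_tokens))
    return "partial" if hits / n >= min_share else "none"


print(url_match("running shoes", "https://example.com/best-running-shoes/"))
print(url_match("running shoes", "https://example.com/shoe-for-runners"))
print(url_match("running shoes", "https://example.com/p?id=123"))
```

A real pipeline would more likely use a proper stemmer or lemmatizer than a prefix check, but the exact/partial split works the same way.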

The results speak for themselves; the charts are below.

Correlation between search engine optimization and SERP ranking positions

Other data scientists from the team could not believe the correlation between the affiliate and whole dataset samples was so strong. 

But is it surprising to you? 

To me, it is exactly what I expected. Google promotes pages that are optimized for the search engine; it really does. 

The whole sample shows a strong correlation between having keywords in the URL and higher rankings in the SERPs.

When you look closer at the red chart, which depicts keywords with a search volume above 1,000 searches a month, the top spots correlate less.
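For readers who want to try this on their own data, here is one way such a relationship could be measured. It is a sketch under assumptions: it takes the share of results with the keyword in the URL at each position and then computes a Pearson correlation between position and share. The article does not state which statistic we used, and the toy rows below are invented purely to show the mechanics.

```python
from collections import defaultdict


def share_by_position(rows):
    """rows: iterable of (position, has_keyword_in_url) pairs.

    Returns {position: share of results at that position with the keyword in the URL}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for position, has_keyword in rows:
        totals[position] += 1
        hits[position] += int(has_keyword)
    return {p: hits[p] / totals[p] for p in sorted(totals)}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy data, not from the study: two results per position.
rows = [(1, True), (1, True), (2, True), (2, False), (3, False), (3, False)]
shares = share_by_position(rows)
r = pearson(list(shares.keys()), list(shares.values()))
print(shares, r)
```

Position 1 is the top result, so a negative coefficient here means the share of keyword URLs falls as you move down the page, which is the pattern described above.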

Still strong, but it looks like other ranking factors are applied in the top 3 positions. 

Regarding the affiliate niche - it is the most optimized and has the strongest correlation in the dataset. 

Look at the values on the Y axis: high-volume keywords and affiliate keywords reach 0.65, while the whole dataset peaks at around 0.60.

This means that URLs in the affiliate and high-volume segments contain the keyword more often than URLs in the whole database, so these groups are more optimized for search engines.

Hence, search results for affiliate keywords are the most influenced by SEOs. 

Takeaways

Is Google getting worse because of us, as the original study states? 

I don’t think so. 

Even though our work influences search engine results, Google uses many behavioral signals to determine the overall rank of a page.

Optimized pages would not be there if they were not satisfying the user intent.

Good job, everyone involved!
