The SEO world is going crazy over a recent development: a potential glimpse into the inner workings of Google Search! Our team have spent the last day breaking down the key takeaways and what they mean for SEOs.

The Leak: Thousands of documents, believed to be part of Google’s internal Content API Warehouse, were recently published on GitHub. SEO experts Rand Fishkin and Michael King analysed the documents, offering valuable insights.

Why it Matters: This leak, potentially one of the biggest in SEO history, offers insight into the factors Google might consider when ranking content and into how its ranking algorithm works overall. It helps us better understand the “why” behind Google’s search results.

What We Learned:

  • Ranking Features: The documents showcase 2,596 modules and 14,014 attributes potentially used for ranking! However, the weighting of these factors remains unknown.
  • Twiddlers: Google might employ “twiddlers” – re-ranking functions that can adjust document rankings based on various factors. 
  • Demotions: Content can also be demoted for reasons such as:
    • A link that doesn’t match the site it points to
    • SERP signals that indicate user dissatisfaction
    • Product reviews
    • Location
    • Exact match domains
    • Adult content
  • Change History: Google reportedly maintains a historical record of every indexed webpage, potentially “remembering” all changes. However, Google only uses the last 20 changes of a URL when analysing links. 
  • Links: PageRank for a website’s homepage still holds weight, and links remain crucial for ranking success. Link diversity and relevance are still key, which underlines the importance of high-quality backlinks.
  • Successful Clicks: Google likely uses click data (good clicks, bad clicks, etc.) as a ranking signal. Creating valuable content and positive user experiences are crucial for attracting successful clicks.
  • Brand & Entities: Brand recognition and author credibility are significant ranking factors. Google appears to track author information and attempt to verify content authorship.
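
To make the twiddler and demotion ideas above a little more concrete, here is a minimal Python sketch of a re-ranking pass. Everything in it is hypothetical: the `Doc` fields, the two twiddler functions and their multipliers are our own illustrative assumptions, not names or weights taken from the leaked documents (which, as noted, do not reveal weighting).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    url: str
    score: float                      # hypothetical base relevance score
    bad_click_rate: float = 0.0       # hypothetical user-dissatisfaction signal
    exact_match_domain: bool = False  # hypothetical flag


# In this sketch, a twiddler is any function that maps a document
# to a score multiplier (1.0 means "leave the score alone").
Twiddler = Callable[[Doc], float]


def dissatisfaction_demotion(doc: Doc) -> float:
    # Assumed rule: halve the score when most clicks look unsuccessful.
    return 0.5 if doc.bad_click_rate > 0.5 else 1.0


def exact_match_domain_demotion(doc: Doc) -> float:
    # Assumed rule: apply a mild penalty to exact match domains.
    return 0.8 if doc.exact_match_domain else 1.0


def rerank(docs: List[Doc], twiddlers: List[Twiddler]) -> List[Doc]:
    """Apply every twiddler to every document, then sort by adjusted score."""
    def adjusted(doc: Doc) -> float:
        score = doc.score
        for twiddle in twiddlers:
            score *= twiddle(doc)
        return score

    return sorted(docs, key=adjusted, reverse=True)


results = [
    Doc("https://example.com/a", score=0.9, bad_click_rate=0.7),
    Doc("https://example.com/b", score=0.8),
    Doc("https://best-cheap-widgets.example", score=0.85, exact_match_domain=True),
]
ranked = rerank(results, [dissatisfaction_demotion, exact_match_domain_demotion])
print([doc.url for doc in ranked])
```

In this toy example the page with the highest base score drops to last place once the dissatisfaction demotion is applied, which is the general behaviour a re-ranking layer would produce, whatever the real signals and weights are.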

Other Interesting Findings:

  • Freshness matters: Google considers publication dates in bylines, URLs, and content.
  • Document relevance: Google compares page content to website focus to determine topical relevance.
  • Chrome data: Google might utilise data from Chrome browsing for ranking purposes.
  • Whitelisting: Google may whitelist certain domains (e.g., election-related) to ensure accurate results.
  • Small sites: Google might have mechanisms to adjust rankings for small personal websites.
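
As a rough illustration of the document relevance point above, the sketch below compares one page against the combined vocabulary of a whole site using a simple bag-of-words cosine similarity. This is an assumption on our part about how such a comparison could work in principle. The function names are hypothetical, and Google’s actual method is not described here and is certainly more sophisticated.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    # Lowercased bag-of-words; a deliberately simple stand-in representation.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared terms, divided by the product of vector lengths.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topical_relevance(page_text: str, site_pages: list[str]) -> float:
    """Compare one page against the combined term profile of the whole site."""
    site_profile = Counter()
    for text in site_pages:
        site_profile.update(term_counts(text))
    return cosine_similarity(term_counts(page_text), site_profile)


site = [
    "technical seo audit checklist",
    "link building for seo campaigns",
    "keyword research for seo content",
]
on_topic = topical_relevance("seo keyword research guide", site)
off_topic = topical_relevance("banana bread recipe", site)
print(on_topic > off_topic)
```

The practical takeaway is the same either way: a page that shares little vocabulary with the rest of the site scores as an outlier, which is one reason to keep content close to your site’s core topics.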

A Note on Weighting: While the documents reveal numerous ranking factors, their relative importance remains unclear. Weighting is a crucial puzzle piece missing from this picture.

Google’s Response: Google downplayed the leak, emphasising that the documents lack context. The company maintains its focus on delivering high-quality search results based on a complex algorithm.

Looking Forward: While the exact implications of the leak remain under discussion, the documents offer valuable insights into Google’s potential ranking considerations. By focusing on high-quality content, user experience, and brand building alongside strategic link acquisition, SEOs can continue to improve their search engine optimisation efforts for their clients and businesses.