Google's Helpful Content Updates, Quality Rater Guidelines & The "Information Gain Scores" Patent

   Google makes a moral judgement: that websites "should be created to help people" and demonstrate a "beneficial purpose" (a.k.a. "helpful content"). See the guidance in the Google Quality Raters' Guidelines (https://static.googleusercontent.com/media/guidelines.raterhub.com/en//searchqualityevaluatorguidelines.pdf):


" The purpose of a page is the reason or reasons why the page was created. Every page on the Internet is created for a purpose, or for multiple purposes. Most pages are created to be helpful for people, thus having a beneficial purpose. Some pages are created merely to make money, with little or no effort to help people. Some pages are even created to harm users. The first step in understanding a page is figuring out its purpose. Why is it important to determine the purpose of the page for Page Quality (PQ) rating?


● The goal of PQ rating is to determine how well a page achieves its purpose. In order to assign a rating, you must understand the purpose of the page and sometimes the website.

● By understanding the purpose of the page, you'll better understand what criteria are important to consider when evaluating that particular page.

Websites and pages should be created to help people. If that is not the case, a rating of Lowest may be warranted..." 

   In June 2020 the late Bill Slawski (thanks again, Bill, for all your help) wrote a blog post on a 2018 patent application about ranking search results based on "information gain" scores.

   This Google patent aimed to address the problem that "…when a set of documents is identified that share a topic, many of the documents may include similar information." It appears to relate to Google's Helpful Content updates of 2022 and 2023 ("part of a broader effort to ensure people see more original, helpful content written by people, for people, in search results"). These latest updates appear to be an admission that the Panda update of February 2011, designed to “provide better rankings for high-quality sites - sites with original content and information such as research, in-depth reports, thoughtful analysis and so on”, has not worked well enough.

   To illustrate the problem, the 2018 patent application describes how a Google user may submit a question about a topic and be “provided with multiple documents that include a similar listing of solutions, remedial steps, resources, etc”, e.g. a set of websites that all list the same product on pages based on the manufacturer's product description.

   The patent outlined ways of "determining an information gain score for one or more documents of potential interest to the user and presenting information from one or more of those documents that are selected based on their respective information gain scores."

   An "information gain" score for a document is a measure of the “additional information included by a page beyond the information contained in other pages already presented to the user.”
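
As a rough illustration of the idea, the sketch below scores a candidate page by the share of its distinct words that do not appear in pages the user has already been shown. This is a toy stand-in and not the patent's scoring method: the word-overlap measure and the names `tokenize`, `novelty_score` and `rank_by_novelty` are assumptions made for this example.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty_score(candidate, already_seen):
    """Fraction of the candidate's distinct words absent from pages already seen.

    Hypothetical proxy for "additional information": 0.0 means nothing new,
    1.0 means every distinct word is new to the user.
    """
    candidate_words = tokenize(candidate)
    if not candidate_words:
        return 0.0
    seen_words = set()
    for page in already_seen:
        seen_words |= tokenize(page)
    return len(candidate_words - seen_words) / len(candidate_words)


def rank_by_novelty(candidates, already_seen):
    """Order candidate pages so those adding the most new words come first."""
    return sorted(
        candidates,
        key=lambda page: novelty_score(page, already_seen),
        reverse=True,
    )
```

Under this toy measure, a retailer page that only repeats the manufacturer's product description scores close to zero once the user has seen one such page, while a page adding its own test results or comparisons scores higher.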

   See https://gofishdigital.com/blog/information-gain-scores/ for the detail.

This update looks like a further development of the recent Google Product Reviews update. See this related research:

"An information gain-based approach for recommending useful product reviews" (2010)

https://link.springer.com/article/10.1007/s10115-010-0287-y 

    On the metric of "helpfulness" of online reviews, the study quoted below says:

"According to the text characteristics of online reviews, the model chooses three features of online reviews: the number of words, the helpful value of words, and the number of product features, to construct a model for predicting the helpfulness of online reviews. The helpful value is the information gain of words to distinguish the helpfulness of online reviews ... The experimental results show that with the increase of the number of words, helpful value of words and the number of product features, the helpfulness of reviews increases continuously." See https://www.jsjkx.com/EN/10.11896/jsjkx.190700034 .

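To make the quoted "information gain of words" more concrete, here is a minimal sketch of the standard entropy-based information gain of a word's presence with respect to a helpful/unhelpful label. It is an assumption that the study computes its "helpful value" this way; the function names and the four toy reviews are hypothetical.

```python
import math


def entropy(labels):
    """Binary entropy, in bits, of a list of True/False labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)


def word_information_gain(word, reviews):
    """Information gain of `word` being present, for predicting helpfulness.

    `reviews` is a list of (text, is_helpful) pairs. The result is
    H(helpful) - H(helpful | word present or absent).
    """
    if not reviews:
        return 0.0
    labels = [helpful for _, helpful in reviews]
    present = [h for text, h in reviews if word in text.lower().split()]
    absent = [h for text, h in reviews if word not in text.lower().split()]
    n = len(reviews)
    conditional = (
        len(present) / n * entropy(present)
        + len(absent) / n * entropy(absent)
    )
    return entropy(labels) - conditional


# Hypothetical toy data: (review text, was the review rated helpful?)
toy_reviews = [
    ("battery lasts long", True),
    ("battery died early", True),
    ("great", False),
    ("ok", False),
]
```

In the toy data, "battery" appears only in helpful reviews and in all of them, so its presence separates the two classes perfectly and its information gain is the full one bit of label entropy. A word that appears in only some reviews of one class separates them less cleanly and scores lower.
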
   Here's an archive of Bill's site for his analyses of other Google patents etc: www.seobythesea.com