This is part 2 of a series of posts I’m writing to explore what we can learn from the recently discovered Google API reference documentation.
We started yesterday by learning more about the files that were found, which possibly discuss important factors in Google’s ranking systems.
We learned that these files do not tell us the details of Google’s ranking system, but rather list attributes that could possibly be used in ranking. These files are documentation intended to help developers working with Google’s Cloud Platform API. I suspect that the attributes mentioned are information that could be used in Google’s machine learning systems, which we’ll hopefully learn a lot more about by the end of this series!
Attributes
Attributes are a way to structure information in a consistent form that can be used across different programs. In our case, each attribute is information that can be used in some way through Google’s APIs. These APIs are tools that developers can use to programmatically access resources available on Google’s cloud platform. For example, if you build an app that uses the Gemini model to chat with users, you would interact with the Gemini API.
These APIs use a common language and structure to describe data that can be used in different applications. Any information called an attribute can be used across multiple APIs and machine learning models accessible via the cloud.
For example, if I were developing with Google’s APIs and wanted to use the attribute named Location, I would find that it has a clearly defined structure containing a string. In the documentation, this attribute defaults to “The World”, but could be set by the user to indicate a specific location.
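To make the idea of an attribute more concrete, here is a minimal sketch in Python of what "a clearly defined structure containing a string" might look like. This is not Google's code or API: the class layout and the field name `value` are hypothetical, and only the attribute name Location and the “The World” default come from the description above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Location:
    """Hypothetical stand-in for an attribute: a named structure holding a string."""

    # "value" is an invented field name; the default mirrors the one
    # described in the post.
    value: str = "The World"


# With no argument, the attribute falls back to its default.
default_location = Location()

# A user can set it to indicate a specific location.
custom_location = Location(value="Ottawa, Canada")

# Because the structure is consistent, any program can read it the same way.
print(asdict(default_location))
print(asdict(custom_location))
```

The point is only that an attribute has a fixed shape, so different programs can pass it around and interpret it consistently.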
Some of the attributes listed in these APIs could be used in search algorithms
It is very likely that the attributes mentioned in these developer APIs are attributes that can be used by Google’s search systems.
For each of the attributes below, we know that this is information that Google can store. What we don’t know is whether it is actually used in ranking and, if so, what role it plays.
Here are some examples of attributes that would be very interesting if they are used in Google’s search systems.
- Contents: Stores content mapping to give credit to the content. “This information is used in ranking to promote the attributed page.”
- QualityTravelGoodSitesData: Stores data about good travel sites.
- IndexingMobileInterstitialsProDesktopInterstitials: a signal related to interstitials.
- SpamBrainData: “This contains SpamBrain values.” Also, SpamBrainScore.
- QualityTimebasedLastSignificantUpdateAdjustments (although a note in the code says this is deprecated).
- FatcatCompactTaxonomicClassificationCategory – The probability that a document belongs to a certain category.
- CompressedQualitySignals – These look interesting! This attribute contains information that can be used for Mustang and TeraGoogle. (Both are topics that I may dig into more in this blog post series. Here’s an interesting article that mentions TeraGoogle as a massive search index launched in 2006. Gemini told me Mustang is Google’s primary web search index, the workhorse behind our everyday Google searches.)
How would attributes be used in search?
Attributes represent certain characteristics. These features can be used on their own as so-called signals, or they can be used in algorithms that produce a signal.
What is a signal? Signals are incredibly important in Google’s algorithms.
In Google’s documentation on how Search works, they tell us that “Search algorithms take many factors and signals into account.” These signals include the words in your query, the relevance and usability of pages, the expertise of sources, and your location and preferences. They say, “The weight applied to each factor varies depending on the nature of your request.”
A signal here is something that can be used in the algorithms and systems that perform calculations to help determine rankings.
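To illustrate the difference between an attribute used directly as a signal and a signal produced from attributes, here is a toy sketch in Python. Everything in it is invented for illustration: the attribute names, the formulas, the weights, and the query types are assumptions and are not taken from Google's documentation or systems. It only shows the general shape of "signals combined with weights that vary with the nature of the query."

```python
from typing import Dict


def derive_signals(attributes: Dict[str, float]) -> Dict[str, float]:
    """Turn stored attributes into signals (hypothetical names and formulas)."""
    return {
        # An attribute used on its own as a signal.
        "spam": attributes["spam_score"],
        # A signal produced by a small calculation on an attribute.
        "freshness": 1.0 / (1.0 + attributes["days_since_update"]),
    }


def score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine signals into one number using per-signal weights."""
    return sum(weights[name] * signals[name] for name in weights)


# Invented weights: a news-like query cares more about freshness
# than an evergreen query does.
NEWS_WEIGHTS = {"spam": -1.0, "freshness": 2.0}
EVERGREEN_WEIGHTS = {"spam": -1.0, "freshness": 0.5}

page_attributes = {"spam_score": 0.1, "days_since_update": 3.0}
signals = derive_signals(page_attributes)

news_score = score(signals, NEWS_WEIGHTS)
evergreen_score = score(signals, EVERGREEN_WEIGHTS)
print(news_score, evergreen_score)
```

The same page gets a different score under each set of weights, which is one simple way to picture a weight that depends on the type of query. Real ranking systems are far more complex than a weighted sum.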
Some signals that Google uses for ranking come from what they describe as “aggregated and anonymized interaction data to assess whether search results are relevant to search queries.” They say they convert this data into signals that help their machine learning systems better assess relevance.
I suspect that signals derived from aggregated and anonymized interaction data are likely used by the Navboost system.
For now, let’s end with this:
Attributes are information. These can be used in programs that communicate with Google’s resources. Some or all of these attributes may be used in the systems that determine rankings.
In the next blog post in this series, we’ll talk more about Navboost and look at the attributes listed in these documents, such as NavboostCrapsCrapsClickSignals and NavboostGlueVoterTokenBitmapMessage.
Here is the first episode of this series: What is this leaked Google code? Browse the API docs.
And here’s the next one: Navboost.
Stay tuned for more!
Marie
(Follow me as I document everything I find interesting and important on this and other topics related to ranking and AI in Marie’s notes.)
Join my community, stay informed, and get excited about the future of AI in the Search Bar.
Brainstorm with Marie