Enhancing Sitelinks Patent is About Adding Relevant Text to Sitelinks
I have written about Google sitelinks and the patented processes behind them before. The first was a December 2006 post about Google’s Listings of Internal Sitelinks for Top Search Results. That patent helped answer questions such as “what pages get chosen to get linked to using sitelinks.”
I followed that post with one from June 2015, called How Google May Choose Sitelinks in Search Results Based upon Visual or Functional Significance (Updated). That one told us that it may “include sitelinks corresponding to the most visually and functionally significant hyperlinks within the document.”
A patent on sitelinks granted in September 2020 (this one) takes a different approach, telling us about enhancing sitelinks. It is based on a provisional patent and an earlier published patent application from 2013. Filed in 2017 and granted in 2020, it shows that someone at Google has been thinking about this for more than a minute. This post explains enhancing sitelinks in more detail.
What are the ideas behind enhancing sitelinks? The patent on them starts by telling us:
An online search provider can provide primary search results to the online user. Besides the primary search results, the online search provider may also provide more content to the online user. The online content providers associated with the websites included within the search results may provide this additional content. For example, a first search result may include a first or primary website associated with a first online content provider. The first online content provider may have many potential additional or secondary content items, incorporating different landing pages that are related or otherwise relevant to the search specified by the online user. It may be challenging for processors to provide secondary content due to the number of potential items.
With those thoughts in mind, this patent focuses on surfacing those other sources of content:
This description relates to online content and, more particularly, to a method and system for enhancing sitelinks provided by an online content provider and displayed by an online search provider, wherein the enhancement includes adding relevant creative text to the sitelinks.
So, the enhancement refers to “Adding relevant creative text to the sitelinks.”
How is this relevant text chosen, and how is it added to sitelinks? In other words, how does enhancing sitelinks occur?
When an online user performs an online search, an online search provider will provide preliminary search results to the online user. Besides the primary search results, the online search provider may also offer more content to the online user. The online content providers associated with the websites included within the search results may give this extra content. For example, a first search result may include a first or primary website related to a first online content provider.
The first online content provider may have many potential additional or secondary content items, incorporating different landing pages that get related or otherwise relevant to the search specified by the online user.
To make the extra content available to the user, the content provider may choose to provide links, known as sitelinks, presented to the user besides the first or primary results of the search. With sitelinks, the user, who may be more interested in the extra or secondary content items, can view that content with a single click. Sitelinks have proven to be very successful in providing such other or secondary content to online users. But for some kinds of online searches, in which the search terms are comprehensive, the sitelinks by themselves may not provide enough information, or the most relevant information, to prompt the online user to click on the sitelink.
As a result, opportunities for presenting more or secondary content to online users may get missed. The online user may not click on any of the sitelinks provided or may not reach the sitelink most relevant to the search specified by the user with a single click. As click counts are essential to online content providers, the loss of clicks is an undesirable outcome. Besides, failure to provide the online user with the most relevant search result is also an undesirable outcome.
It would be desirable to provide a method for matching creative texts to sitelinks to extra or secondary content for presentation to an online user so that the resulting combinations of sitelinks and creative texts are more relevant to the search specified by the online user.
Matching A Sitelink With A Creative (Advertising + Organic SEO)
The patent tells us this about this process as well:
A computer-implemented method for matching a sitelink with a creative is provided. The method is implemented using a computing device coupled to a memory device. The method includes storing within the memory device many creatives, each creative associated with a uniform resource locator (URL). The method further comprises canonicalizing each URL associated with each of the creatives. The method also includes clustering the canonicalized URLs into creative clusters, where each cluster consists of clustered creatives, each with a similar canonicalized URL.
For many years, Google spokespeople have told us about a Chinese wall that separates paid search from organic search, and that the use of paid search does not affect the organic rankings of a site. Since paid search knows the URLs of sites that advertise and the terms they advertise with, and Google also knows what terms draw organic traffic to a site, combining information learned about a URL or a site from both paid search and organic search can reveal a lot about that URL or site. Using information from paid search to enhance sitelinks for a site is a way of combining that information usefully, and it does not influence the rankings of the pages of a site in organic search.
Enhancing Sitelinks By Associating A Selected Creative From The Candidate Set Of Creatives With The Received Sitelink
The method further includes receiving, at the computing device, a sitelink having a sitelink URL associated with it. The method further comprises canonicalizing the received sitelink URL. The method further includes matching the canonicalized sitelink URL with one of the creative clusters to generate a candidate set of creatives for the received sitelink. The method also comprises associating a selected creative from the candidate set of creatives with the received sitelink based on at least one of filter rules and a scoring method.
In another aspect, a computer system is provided. The computer system includes a processor and a computer-readable storage device having encoded computer-readable instructions that the processor executes. The computer-executable instructions cause the processor to store within the memory device many creatives, each creative associated with a uniform resource locator (URL).
The computer-executable instructions further cause the processor to canonicalize each URL associated with each of the creatives. The computer-executable instructions further cause the processor to cluster the canonicalized URLs into creative clusters. Each creative cluster includes many clustered creatives, each having a similar canonicalized URL associated with it. The computer-executable instructions further cause the processor to receive, at the computing device, a sitelink having a sitelink URL associated with it.
The computer-executable instructions further cause the processor to canonicalize the received sitelink URL. The computer-executable instructions further cause the processor to match the canonicalized sitelink URL with one of the creative clusters to generate a candidate set of creatives for the received sitelink. The computer-executable instructions further cause the processor to associate a selected creative from the candidate set of creatives with the received sitelink based on at least one of filter rules and a scoring method.
In another aspect, computer-readable storage media having computer-executable instructions embodied thereon are provided. When executed by at least one processor associated with a first computing device and a memory device, the computer-executable instructions cause the processor to store within the memory device many creatives, each creative associated with a uniform resource locator (URL). The computer-executable instructions further cause the processor to canonicalize each URL associated with each of the creatives.
The computer-executable instructions further cause the processor to cluster the canonicalized URLs into creative clusters. Each creative cluster includes many clustered creatives, each having a similar canonicalized URL associated with it. The computer-executable instructions further cause the processor to receive, at the computing device, a sitelink having a sitelink URL associated with it.
The computer-executable instructions further cause the processor to canonicalize the received sitelink URL. The computer-executable instructions further cause the processor to match the canonicalized sitelink URL with one of the creative clusters to generate a candidate set of creatives for the received sitelink. The computer-executable instructions further cause the processor to associate a selected creative from the candidate set of creatives with the received sitelink based on at least one of filter rules and a scoring method.
In another aspect, a system for matching a sitelink with a creative is provided. The system includes means for storing within a memory device many creatives, each creative associated with a uniform resource locator (URL).
Pruning The Candidate Set of Creatives
The system further includes means for pruning the candidate set of creatives by removing at least one of the duplicate creatives and redundant creatives.
Associating a selected creative from the candidate set of creatives with the received sitelink based on filter rules further includes means for applying the filter rules, including at least one of demographic rules, language rules, geographic rules, user device rules, platform rules, and advertiser campaign rules.
Canonicalizing the received sitelink URL further includes means for crawling the sitelink URL with and without a URL parameter associated with the sitelink URL, comparing the landing pages, and removing the parameter from the sitelink URL when the landing pages match.
Receiving a sitelink having a sitelink URL associated with it further includes means for receiving many sitelinks, each having at least one URL associated with it.
Canonicalizing the received sitelink URL further includes means for processing webmaster supplied rules indicating the relevance of URL parameters.
Canonicalizing each URL associated with each of the creatives further includes means for comparing contents of landing pages related to the creative-associated URLs to identify similarities amongst the respective landing pages.
Canonicalizing each URL associated with each of the creatives further includes means for processing webmaster-supplied rules indicating the relevance of URL parameters.
Associating a selected creative from the candidate set of creatives with the received sitelink further includes means for associating the creative chosen based on an algorithm in which the received sitelink gets matched with an as-yet unmatched creative having the highest matching score from amongst unmatched creatives.
Associating a selected creative from the candidate set of creatives with the received sitelink further includes means for associating the creative chosen based on an optimal algorithm configured to maximize total matching scores amongst many received sitelinks and as-yet unmatched creatives.
Associating a selected creative from the candidate set of creatives with the received sitelink based on a scoring method further includes means for associating the creative chosen based on a scoring process, including determining an impression score that indicates the number of impressions related to the selected creative.
It also includes means for determining an inverse-document-frequency (IDF) score indicating the similarity of terms between a sitelink and a creative text.
The Enhancing Sitelinks Patent
The enhancing sitelinks patent can be found at:
Canonicalized online document sitelink generation
Inventors: Vaibhav Vaish, Venky Ramachandran, David Philip Sisson, Ramakrishnan Kandhan, Pramod Adiddam, Vinod Ramachandran Marur, and Gaurav Garg
Assignee: Google LLC
US Patent: 10,776,435
Granted: September 15, 2020
Filed: April 19, 2017
Abstract:
Methods and systems for improved processor efficiency via reductions in repeated calculations get provided. Many candidate sitelinks get identified in response to a search for online content.
Each sitelink has associated with it many candidate creatives with which the sitelink may get presented to the user.
The creatives get canonicalized to form clusters of candidate creatives.
The sitelinks get canonicalized.
The creatives get matched to the candidate canonicalized sitelinks to provide enhanced sitelinks having increased relevance to the user search.
How Enhancing Sitelinks Happens
The subject matter described here generally relates to online content and online advertising.
The methods and systems herein enable relevant items of creative text (“creatives”) stored in a content provider database to be matched with specific sitelinks. The resulting presentation to an online user, referred to as an “enhanced sitelink,” provides more relevant information about the sitelink.
A typical content provider/advertiser may have provided a content-providing network or system with hundreds or thousands of creatives associated with keywords, geographies, or languages.
Some of these creatives may be relevant to a set of sitelinks that the content provider may choose to add to a campaign. The sitelinks do not need to originate from or belong to the same ad campaign, ad group, or other entity as the creatives with which the sitelinks get matched, as long as the sitelinks and creatives get associated with the same content provider. From a content provider standpoint, managing and duplicating the creatives for sitelinks can get burdensome.
Manual management of creative and sitelink matching may also create consistency issues when one of the creatives or campaigns needs to get paused or changed.
The methods and systems described herein may be implemented using computer software, firmware, hardware, or any combination of these, wherein the effect may be achieved by performing at least one of the following steps:
- Storing within a memory device many creatives, each creative associated with a uniform resource locator (URL)
- Canonicalizing each URL associated with each of the creatives
- Clustering the canonicalized URLs into creative clusters, wherein each creative cluster includes many clustered creatives, each having a similar canonicalized URL associated with it
- Receiving, at the computing device, a sitelink having a sitelink URL associated with it
- Canonicalizing the received sitelink URL
- Matching the canonicalized sitelink URL with one of the creative clusters to generate a candidate set of creatives for the received sitelink
- Associating a selected creative from the candidate set of creatives with the received sitelink based on at least one of filter rules and a scoring method
- Pruning the candidate set of creatives by removing at least one of duplicate creatives and redundant creatives
- Crawling the sitelink URL with and without a URL parameter associated with the sitelink URL, comparing the landing pages, and removing the parameter from the sitelink URL when the landing pages match
- Receiving many sitelinks, each having at least one URL associated with them
- Processing webmaster supplied rules indicating the relevance of URL parameters
- Comparing contents of landing pages associated with the creative-associated URLs to identify similarities amongst the respective landing pages
- Determining an impression score that indicates the number of impressions associated with the selected creative
- Determining an inverse-document-frequency (IDF) score that indicates the similarity of terms between a sitelink and a creative text
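To make the first few steps in that list more concrete, here is a minimal Python sketch of canonicalizing URLs, clustering creatives by canonical URL, and looking up a candidate set for a sitelink. It is only an illustration under stated assumptions: the function names, the sample creatives, and the `IGNORABLE_PARAMS` list are all made up for this post, and the patent does not spell out its canonicalization rules at this level.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: a hand-picked list of parameters that do not change the landing page.
IGNORABLE_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid"}


def canonicalize(url):
    """Reduce a URL to one standard form (illustrative rules, not the patent's)."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORABLE_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, drop the fragment, keep only meaningful parameters.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), ""))


def cluster_creatives(creatives):
    """Group creative texts under the canonical form of their URL."""
    clusters = defaultdict(list)
    for text, url in creatives:
        clusters[canonicalize(url)].append(text)
    return clusters


def candidate_set(sitelink_url, clusters):
    """Return the creatives whose canonical URL matches the sitelink's."""
    return clusters.get(canonicalize(sitelink_url), [])


# Hypothetical sample data.
creatives = [
    ("Free shipping on boots", "https://Example.com/boots/?utm_source=ads"),
    ("Winter boots sale", "https://example.com/boots"),
    ("New sandals", "https://example.com/sandals"),
]
clusters = cluster_creatives(creatives)
candidates = candidate_set("https://example.com/boots?gclid=abc", clusters)
print(candidates)  # ['Free shipping on boots', 'Winter boots sale']
```

Because the clustering happens once, ahead of any search, the lookup at query time is a single dictionary access, which fits the "reductions in repeated calculations" framing of the abstract.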
An Advertisement Management System (AMS), And User Access Devices Used By Users
This patent goes into considerable detail about how information flows through the Advertising Management System, and if you work in paid search, you may find some value in reading through the patent to learn more about how it collects data and assigns content to slots.
Representative Search Results Arising From A Search For ‘Items’ Specified By A User
As described above, when a user is performing an online search, an advertising system (such as AMS) associated with the search provider will provide more content, in the form of advertisements, to be presented to the user. For example, a search specified by a user may yield a primary result, which is a link to an advertiser-specified landing page.
This patent tells us that such content can appear in a sitelink for the advertised landing page:
To supplement the primary result, the content provider and AMS provide several extra or secondary results, each of which includes a sitelink. But the texts appearing in sitelinks alone may not contain sufficient information to enable a user to decide whether to click on any of the sitelinks.
According to the present disclosure, an example screenshot shows representative search results arising from a search for “items” specified by a user. A search specified by the user yields a primary result, which is a link to an advertiser-specified landing page.
The primary result includes a link, as well as a descriptive creative. Besides the primary result, the search may also yield several more or secondary results, including sitelinks. Appearing with sitelinks are explanatory texts (referred to as “creatives” or “creative texts”).
For example, each of the secondary results includes a sitelink and a creative. AMS attempts to provide the most relevant advertisements to users.
A content provider may have many landing pages with associated URLs (forming different or secondary results, for example) that are relevant to the search specified by the user. Each landing page (and associated URL) has creative texts related to it.
These creative texts may also be associated with other landing pages (and URLs) within the advertiser website. In an example embodiment of the present disclosure, AMS matches creatives with URLs to increase the relevance of the combined creative and URL to the search conducted by the user.
Enhancing Sitelink With Creative Content
The method is described in the context of a search from a user, with a primary result and many more secondary results (also referred to as the “extension”). However, some of the steps, such as the storing and canonicalization of creatives and sitelinks, may be performed before a search by a user. As used here, “canonicalization” refers to a process for converting data that has more than one possible representation into a “standard,” “normal,” or canonical form. This process can compare different representations for equivalence, count the number of distinct data structures, improve the efficiency of various algorithms by eliminating repeated calculations, or make it possible to impose a meaningful sorting order.
AMS generates and stores in a creatives database a candidate set of creatives associated with each advertiser-specified sitelink. Each sitelink has a URL associated with it, as does each creative.
In one embodiment of the present disclosure, AMS associates all creatives with the same URL as the sitelink as part of a candidate set. In another example embodiment, AMS canonicalizes the creative URLs to identify characteristics of the URLs that will enable the creative URLs to be grouped into creative clusters. Canonicalization of creative-associated URLs may be accomplished through various schemes, including analyzing landing pages associated with the sitelinks to compare the contents of the respective landing pages and identify significant similarities amongst them, wherein significant similarities get determined using predefined rules or parameters.
In an alternative scheme, canonicalization of creative-associated URLs includes crawling the creative-associated URL with and without a URL parameter associated with the creative-associated URL, comparing the landing pages, and removing the parameter from the creative-associated URL when the landing pages match. In addition, canonicalizing creative-associated URLs may include applying webmaster-supplied rules that state the relevance of URL parameters. After canonicalization, AMS forms creative clusters, such that creatives within a cluster share the same canonical URL. AMS then saves the clusters in a cluster lookup table.
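The "crawl with and without the parameter" idea can be sketched in a few lines of Python. This is a hedged illustration, not the patented implementation: `drop_irrelevant_params` and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in for a real crawler that pretends only the `color` parameter changes the page. A real system would also have to cope with pages whose content changes between fetches.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_irrelevant_params(url, fetch):
    """Remove each query parameter whose absence leaves the landing page unchanged.

    `fetch` is a hypothetical callable that returns the page content for a URL.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    baseline = fetch(url)
    kept = list(params)
    for param in params:
        trial = [p for p in kept if p != param]
        trial_url = urlunsplit(parts._replace(query=urlencode(trial)))
        if fetch(trial_url) == baseline:
            kept = trial  # same landing page without it, so the parameter is dropped
    return urlunsplit(parts._replace(query=urlencode(kept)))


def fake_fetch(url):
    """Stand-in crawler: the page depends on the path and the `color` parameter only."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    return f"{parts.path}|{query.get('color', '')}"


cleaned = drop_irrelevant_params("https://example.com/shoes?color=red&sessionid=42", fake_fetch)
print(cleaned)  # https://example.com/shoes?color=red
```

Passing the fetcher in as an argument keeps the sketch testable without network access; the comparison step could just as easily use a content hash or a similarity threshold instead of strict equality.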
AMS stores data about advertising campaigns in a campaign database or repository. The campaign data includes sitelinks associated with various landing pages. Before matching sitelink URLs to creative URLs, sitelink URLs get canonicalized to identify and account for advertiser redirects and other parameters in the URLs used for recording and reporting site activity that is otherwise inconsequential to directing a user to the final landing page associated with each URL.
In one embodiment of a canonicalization scheme, AMS identifies link URL parameters that are not needed to identify a corresponding landing page by crawling the link URL with and without the parameter and then comparing the landing pages in each iteration for matches.
Where landing pages match after crawling, the parameter gets removed. In an embodiment, AMS applies webmaster-supplied rules about the relevance of URL parameters for landing page purposes. Following a user search, AMS identifies many relevant sitelinks and refers to the saved creative cluster lookup table for clusters of candidate creatives to match the specified sitelinks.
In addition to, or as an alternative to, associating sitelinks with ad campaigns, sitelinks may also be associated with other entities, such as ad groups.
Once a candidate set of creatives has gotten identified, AMS applies specific filter rules and policy checks related to the suitability of the creative for the particular sitelink, to drop matches of creatives to sitelinks that get excluded, or get found inappropriate, relative to the search specified by the user. Examples of filter rules or policy checks include demographic, geographic, and language checks (to ensure that the creative is appropriate to user device location, for example), user device and platform rules, as well as checks to ensure that the candidate creatives are compatible with the status of the ad campaign.
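A filter step like this one might be sketched as below. The field names (`language`, `regions`, `devices`, `campaign_active`) and the sample data are assumptions chosen to mirror the rule types the patent lists; the patent itself does not describe a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creative:
    text: str
    language: str
    regions: frozenset
    devices: frozenset
    campaign_active: bool


@dataclass(frozen=True)
class SearchContext:
    language: str
    region: str
    device: str


def passes_filters(creative, ctx):
    """Apply campaign-status, language, geographic, and device rules in turn."""
    return (
        creative.campaign_active
        and creative.language == ctx.language
        and ctx.region in creative.regions
        and ctx.device in creative.devices
    )


def filter_candidates(candidates, ctx):
    """Keep only the creatives that are suitable for this search context."""
    return [c for c in candidates if passes_filters(c, ctx)]


# Hypothetical sample data.
everywhere = frozenset({"desktop", "mobile"})
candidates = [
    Creative("Free shipping on boots", "en", frozenset({"US", "CA"}), everywhere, True),
    Creative("Livraison gratuite", "fr", frozenset({"CA"}), everywhere, True),
    Creative("Summer clearance", "en", frozenset({"US"}), everywhere, False),  # paused campaign
]
ctx = SearchContext(language="en", region="US", device="mobile")
allowed = filter_candidates(candidates, ctx)
print([c.text for c in allowed])  # ['Free shipping on boots']
```

The paused-campaign case is the one that matters most for the consistency problem mentioned earlier: if the creative's campaign is paused, the creative simply stops being eligible, with no manual cleanup of sitelinks needed.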
In another example embodiment, AMS prunes the set of candidate creatives by performing deduping to remove redundancies and duplication between creatives found for a specific sitelink and by applying other policy checks such as the size of the available candidate set and estimated measures of the improvement to ad CTR (“click-through-rate”). After a set of candidate creatives for possible matching with a specified sitelink has been identified, AMS performs creative matching. After a creative has been matched to a specified sitelink to create an enhanced sitelink, AMS saves data representing the enhanced sitelink, to be served to the user.
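The patent does not say how duplicates and redundant creatives get detected, so the following is only one plausible way to sketch the deduping step. It treats two creatives as redundant when their word sets overlap heavily (Jaccard similarity); the `prune` function and the 0.8 threshold are assumptions for illustration.

```python
import re


def _tokens(text):
    """Lowercased set of alphanumeric words in a creative text."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def prune(creatives, max_overlap=0.8):
    """Drop creatives whose word overlap with an already kept creative is too high.

    Exact duplicates have an overlap of 1.0, so they are removed as well.
    """
    kept, kept_tokens = [], []
    for text in creatives:
        toks = _tokens(text)
        redundant = any(
            len(toks & seen) / len(toks | seen) >= max_overlap
            for seen in kept_tokens
            if toks | seen
        )
        if not redundant:
            kept.append(text)
            kept_tokens.append(toks)
    return kept


pruned = prune([
    "Free shipping on all boots",
    "free shipping on all boots!",       # duplicate apart from case and punctuation
    "Free shipping on all boots today",  # near duplicate: 5 of 6 distinct words shared
    "Hiking boots from $49",
])
print(pruned)  # ['Free shipping on all boots', 'Hiking boots from $49']
```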
Often, there can be several creative variants that could be matched with a given sitelink. In an example embodiment, AMS performs creative matching by generating permutations of matches between specified sitelinks and corresponding candidate creatives, and assigning a score to each permutation that reflects the relative value of the “fit” of each proposed match.
The score may be based on various signals such as an impression score, which is a measure of how many times that creative got shown over a recent timeframe such as a week, and an IDF-score, which is a measure of similarity of terms between a sitelink text and a creative text.
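Here is one way those two signals could be sketched in Python. The patent names the signals but gives no formulas, so everything below is an assumption: the IDF score is taken as the sum of inverse-document-frequency weights of the terms a sitelink and a creative share, impressions are dampened with a logarithm, and the two are combined with hypothetical weights `w_impr` and `w_idf`.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_idf(corpus):
    """Inverse document frequency per term: rarer terms get a higher weight."""
    n = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in tokens(doc))
    return {term: math.log(n / count) for term, count in doc_freq.items()}


def idf_score(sitelink_text, creative_text, idf):
    """Sum the IDF weights of terms shared by the sitelink and the creative."""
    shared = tokens(sitelink_text) & tokens(creative_text)
    return sum(idf.get(term, 0.0) for term in shared)


def match_score(sitelink_text, creative_text, impressions, idf, w_impr=1.0, w_idf=1.0):
    """Combine both signals; the weights and the log dampening are assumptions."""
    return w_impr * math.log1p(impressions) + w_idf * idf_score(sitelink_text, creative_text, idf)


# Hypothetical creative texts for one advertiser.
corpus = [
    "Free shipping on boots",
    "Free returns on sandals",
    "Waterproof hiking boots",
]
idf = build_idf(corpus)
strong = idf_score("Hiking Boots", "Waterproof hiking boots", idf)
weak = idf_score("Hiking Boots", "Free shipping on boots", idf)
print(strong > weak)  # True: "hiking" is rare in the corpus, so sharing it counts for more
```

The point of the IDF weighting is that sharing a common word such as "boots" says less about relevance than sharing a rarer word such as "hiking".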
In one embodiment, in which two or more sitelinks are being matched with two or more creatives, all applicable creatives that get associated with each candidate sitelink get ordered, based on impression count.
As used herein, “impression count” refers to the number of times that an item, such as a creative, has gotten presented to online users. Then, each sitelink gets matched with a yet unmatched creative with the highest score.
The remaining sitelinks are matched with the next highest scoring creatives until all sitelinks are matched with creatives. Such a method may maximize an individual match at the expense of optimal matching of a group of sitelinks to a set of candidate creatives. In one embodiment, if the total match score determined using this method is below a predefined threshold, AMS uses a more globally optimal matching algorithm.
The globally optimal matching algorithm can be illustrated as follows. For example, two sitelinks S1 and S2 are to be provided to a user in response to a search. Assume S1 can be matched to two (non-duplicative) creatives C1 and C2, having match scores of 10 and 8. Likewise, S2 can also be matched to C1 and C2, but with corresponding match scores of 8 and 2. In a matching scheme in which the first creative match score gets optimized, the resulting association is S1-C1 and S2-C2 with a total score of 12. A globally optimal matching would instead associate S1-C2 and S2-C1, for a total score of 16.
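The S1/S2 example can be worked through in code. The sketch below uses the match scores from the example and compares a greedy pass with a brute-force search over every assignment. The function names are made up, and the brute-force search is just the simplest way to show the idea on a tiny input; the patent does not say which optimal algorithm is used, and for larger sets an assignment algorithm such as the Hungarian method would be the usual choice.

```python
from itertools import permutations

# Match scores from the example: scores[sitelink][creative].
SCORES = {
    "S1": {"C1": 10, "C2": 8},
    "S2": {"C1": 8, "C2": 2},
}


def greedy_match(scores):
    """Each sitelink in turn takes its best as-yet unmatched creative."""
    used, pairs = set(), {}
    for sitelink, options in scores.items():
        free = {c: s for c, s in options.items() if c not in used}
        if free:
            best = max(free, key=free.get)
            pairs[sitelink] = best
            used.add(best)
    return pairs


def optimal_match(scores):
    """Brute force: try every one-to-one assignment and keep the highest total.

    Returns an empty dict when there are fewer creatives than sitelinks.
    """
    sitelinks = list(scores)
    creatives = sorted({c for options in scores.values() for c in options})
    best_pairs, best_total = {}, float("-inf")
    for perm in permutations(creatives, len(sitelinks)):
        if all(c in scores[s] for s, c in zip(sitelinks, perm)):
            current = sum(scores[s][c] for s, c in zip(sitelinks, perm))
            if current > best_total:
                best_pairs, best_total = dict(zip(sitelinks, perm)), current
    return best_pairs


def total_score(scores, pairs):
    """Sum of match scores over all sitelink-creative pairs."""
    return sum(scores[s][c] for s, c in pairs.items())


greedy = greedy_match(SCORES)
optimal = optimal_match(SCORES)
print(greedy, total_score(SCORES, greedy))    # {'S1': 'C1', 'S2': 'C2'} 12
print(optimal, total_score(SCORES, optimal))  # {'S1': 'C2', 'S2': 'C1'} 16
```

The greedy pass gives S1 its best creative and leaves S2 with a poor fit, while the exhaustive search accepts a slightly weaker match for S1 so that the pair as a whole scores higher.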
Enhancing Sitelinks Conclusion
The patent also looks at link analysis and bipartite graphs, and focuses more on how advertised content can be used as a creative for an enhanced sitelink. The combination of information from advertisements and organic search results to provide richer sitelinks is interesting, and something that had eluded me, because it felt like the wall between organic and paid search would prevent such information from being shared. According to this patent, and one that I looked at earlier, How Google May Create Augmented Content, the sharing of information between the two is possible and likely can happen.