Should you block Google-Extended in robots.txt?

Google-Extended is a control mechanism you can use in your robots.txt file to tell Google that it may not use your content to train future Gemini models. It also prevents Google from using your content for grounding in Gemini conversations. However, it is important to know that a Google-Extended disallow in your robots.txt does not stop Google from using your content in AI Overviews.

To block Google-Extended, you can add something like this to your robots.txt file:

user-agent: Google-Extended
Disallow: /

This tells Google that it should not use your pages to train its Gemini models or for grounding in Gemini.

Or,

user-agent: Google-Extended
Disallow: /directory1


This tells Google not to use a specific directory for training its models or for grounding.
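To sanity-check rules like the two above before publishing them, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It is only a stand-in for reading the rules, not Google's own parser, and the helper name `is_allowed` and the `example.com` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent use url.

    Hypothetical helper: it relies on Python's generic robots.txt parser,
    which approximates (but is not) Google's own matching logic.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The site-wide block from the article.
SITE_WIDE = """\
User-agent: Google-Extended
Disallow: /
"""

# The single-directory block from the article.
ONE_DIRECTORY = """\
User-agent: Google-Extended
Disallow: /directory1
"""

if __name__ == "__main__":
    page = "https://example.com/directory1/page"  # placeholder URL
    for name, rules in (("site-wide", SITE_WIDE), ("one directory", ONE_DIRECTORY)):
        for agent in ("Google-Extended", "Googlebot"):
            print(name, agent, is_allowed(rules, agent, page))
```

In both cases the rules name only Google-Extended, so a check for Googlebot still comes back as allowed. That matches the point of the token: it targets Gemini training and grounding, not Search crawling.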

Will using Google-Extended in robots.txt prevent my content from being used in AI Overviews?

Surprisingly, the answer is no. AI Overviews are considered part of the core Google Search experience. Google-Extended will not keep Google from using your website in AI Overviews.

You can block your website, or parts of it, from being used in AI Overviews with the “nosnippet” meta tag. However, it is important to know that this also stops the content from being shown as a snippet in regular Search results.

Will using Google-Extended harm my presence in Search?

Google says, “Google-Extended does not impact a site's inclusion in Google Search nor is it used as a ranking signal in Google Search.” You are simply preventing your website from being used for future training of Gemini models, and for grounding in the Gemini app and in apps built with Vertex AI that use grounding.

What is grounding?

When a user asks the Gemini app a question, the model sometimes pulls pages from Search and reads passages to either verify or enrich its answer. Those pages are then displayed as references within the app.

If you use the Google-Extended robots.txt block, your pages will not be displayed as recommended links in the Gemini app. For example, if Search Engine Land (#3 below) had used the Google-Extended token, Gemini would not have recommended it as a source:

Screenshot: websites shown as grounding references in Gemini

Grounding with Search can also be used by developers who build AI products with Google’s Vertex AI. Using Google-Extended prevents your content from appearing in Vertex AI apps that use Grounding with Google Search.

Does Google-Extended remove existing content from Gemini's training?

No. If Gemini has already trained on your content, it is already baked into its parameters. The block excludes your pages from new Gemini training runs, so you can stop any new use of your content. However, it cannot undo knowledge that is already embedded in current Gemini versions.

Does Google-Extended stop your content from being recommended in the new AI Mode?

No. AI Mode is a Search Labs experiment powered by Gemini. It is a Search product, so it follows the same search preview controls as AI Overviews, not Google-Extended. If you really want to keep your text out of AI Mode (and out of Search snippets), add a nosnippet meta tag or set max-snippet:0. Google-Extended only governs training and grounding for Gemini. It does not affect how Search features show your pages. Therefore, it should have no influence on your inclusion in AI Mode answers.
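If you want to confirm which pages already carry the snippet controls mentioned above, here is a rough sketch that scans a page's HTML for a robots meta tag containing `nosnippet` or `max-snippet:0`. The function name `blocks_snippets` is made up for this example. It looks only at meta tags named `robots` or `googlebot`, so it will miss controls sent through the `X-Robots-Tag` HTTP header or the `data-nosnippet` attribute.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noarchive, max-snippet:0".
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def blocks_snippets(html: str) -> bool:
    """Hypothetical helper: True if the page opts out of text snippets."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any(
        d == "nosnippet" or d.replace(" ", "") == "max-snippet:0"
        for d in parser.directives
    )


if __name__ == "__main__":
    sample = '<head><meta name="robots" content="max-snippet:0"></head>'
    print(blocks_snippets(sample))
```

Keep in mind that these controls apply to Search as a whole, so a page that passes this check has also given up its regular text snippet.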

Which websites should use Google-Extended?

Use the Google-Extended directive if you do not want your publicly crawlable pages fed into Gemini, either for future training or for the grounding step that produces the citations in Gemini chat, but you still want Googlebot to index your content and serve it in Search. Again, AI Overviews are part of Search, so Google-Extended does not prevent Google from using snippets of your content in an AI Overview.

There are some situations where I think Google-Extended makes sense. For example, if you have licensed, paid or premium content, Search can show a short excerpt and you can charge for the rest. You do not want this content used for AI training or grounding, so you can block it with Google-Extended.

Another reason to use Google-Extended could be if your IP is in your words. For example, if you sell essays, fiction or paywalled research, you may not want AI to train on it.

Who is currently using Google-Extended?

Many large news publishers use it. According to Reuters, 24% of the most widely used news websites were blocking Google’s AI crawler by the end of 2023. Including:

I spot-checked the robots.txt files of several large, well-known websites at random.

These websites use Google-Extended in their robots.txt to block Google from using their content for AI training or for grounding Gemini answers:

None of these have added Google-Extended to their robots.txt:

Marie's thoughts

I know many site owners whose first instinct is: “I don’t want Google to use my content to train its AI! I'm going to block it!” I definitely understand this reaction. However, if your website's livelihood depends on being found, blocking Google-Extended may do more harm than good.

More and more people will use Gemini as a chat interface to get answers. Google-Extended will prevent your website from being used as a source in Deep Research. Google Assistant on phones, wearables, TVs and cars is being upgraded to use Gemini. I personally believe that for many people, Google Assistant will become the primary way they search as AI becomes our everyday assistant. By blocking, you make it so that these sources cannot cite you.

I think it is a difficult call for some websites whether or not to block Google’s ability to train on their content. In some cases, allowing it could work against you if your value lies in your unique words and insights.

However, take a business like mine. I wrote the article you are reading now because clients asked me questions that were difficult for me to answer. I have written up my thoughts on how agents will change the web, how AI Mode works, and the important changes in Google's Quality Rater Guidelines. I am thrilled when I see my work cited by Gemini or ChatGPT. It gives me more exposure, highlights my expertise to the world, and may bring me new newsletter subscribers and clients.

Marie referenced by Gemini

In short, I monetize the trust that my writing creates, not the writing itself, so I would rather keep my content open for AI to use.

Where it gets tricky is when we see Google’s AI products cite content that comes from your website. It is important to note that using Google-Extended does not prevent your content from being used in AI Overviews or AI Mode answers. Those are governed by snippet controls, as part of Search, not by Google-Extended.

I really think that for most websites it makes more sense to allow Google’s AI to train on their information and use it for grounding than to block it with Google-Extended.

Still not sure whether you should use a Google-Extended block in robots.txt? Try this GPT

Should I use the Google-Extended robots token to prevent Google from using my content for AI training and grounding? – A GPT created by Marie Haynes

I did a fair amount of research across a range of websites for this article. I have added that research to the knowledge base of a GPT. If you are not sure whether you should use a Google-Extended robots.txt block, this should really help!

Google-Extended GPT

Here is a sample conversation:

Google-Extended diagram

What Google-Extended does

Google-Extended decision

I hope that helps!

If you liked this, you will love my newsletter …

Marie
