Google has announced a new crawler token named Google-Extended that you can use to control whether your content can help improve Bard and Vertex AI generative APIs or future Google AI products. So if you want to disallow Bard from using your content, you specify so in your robots.txt with the user agent Google-Extended.
Google won't crawl from Google-Extended; Google will still crawl from its normal Googlebot or other bots. But using Google-Extended will tell Google not to use that content for Bard or other Google AI projects. A Google spokesperson told me, “Google-Extended will tell Google not to use the site’s content for Bard and Vertex AI generative APIs.” “For Search, website administrators should continue to use the Googlebot user agent through robots.txt and the NOINDEX meta tag to manage their content in search results, including experiments like Search Generative Experience,” Google added.
Essentially, this allows you to let Google Search crawl, index and rank your website while disallowing Bard or other Google AI projects from using your content.
This comes after Bing offered controls to block Bing Chat AI from using your site a week ago.
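As a rough illustration of that split, here is a minimal sketch that checks a sample robots.txt with Python's standard `urllib.robotparser`. The rules, the `is_allowed` helper and the example.com URL are hypothetical, and Python's parser is only a stand-in for reasoning about the file; it is not Google's own matching logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, leave all
# other crawlers (including Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain used only for illustration.
PAGE = "https://example.com/article"

extended_ok = is_allowed(ROBOTS_TXT, "Google-Extended", PAGE)
googlebot_ok = is_allowed(ROBOTS_TXT, "Googlebot", PAGE)

print("Google-Extended allowed:", extended_ok)
print("Googlebot allowed:", googlebot_ok)
```

With rules like these, the Google-Extended group blocks the whole site for that token, while Googlebot falls through to the wildcard group and stays allowed.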
“Today we’re announcing Google-Extended, a new control that web publishers can use to manage whether their sites help improve Bard and Vertex AI generative APIs, including future generations of models that power those products. By using Google-Extended to control access to content on a site, a website administrator can choose whether to help these AI models become more accurate and capable over time,” Google wrote.
Google-Extended is a “standalone product token that web publishers can use to manage whether their sites help improve Bard and Vertex AI generative APIs, including future generations of models that power those products,” Google explained.
The user agent token is Google-Extended.
“Google-Extended doesn’t have a separate HTTP request user agent string. Crawling is done with existing Google user agent strings; the robots.txt user-agent token is used in a control capacity,” Google added.
I’m not sure if this is the alternative approach to robots.txt for AI…
Big news on the AI front. You can implement via robots.txt -> Announcing Google-Extended, a new control that web publishers can use to manage whether their sites help improve Bard & Vertex AI generative APIs, including future generations of models https://t.co/L73rm6mwzM pic.twitter.com/BtcQ5kaATP
— Glenn Gabe (@glenngabe) September 28, 2023
Note, the Google News bot also works in a similar way, where it doesn’t crawl but uses the directive to decide whether that content is used in Google News:
Yes, like Googlebot-News.
— John, aka “a complete bell cheese” (@JohnMu) September 29, 2023
Forum discussion at X.