So… you want to add some rules to block AI crawlers from continually scraping your data. Easy enough with a WordPress plugin, and Squarespace has a toggle, but what about Shopify?
Shopify’s help center document on the robots.txt file can seem daunting, especially with that big ol’ warning about losing all of your traffic. But I assure you, it’s actually very easy to do, and just as easy to revert if you get it wrong.
The Big Warning
Yes, messing with the robots.txt file can cost you all of your non-direct traffic. If you accidentally disallow all bots from your site, search engines can deindex your website, resulting in lost traffic and a degraded presence in search. But you’d have to royally mess up for that to happen, and checking that you haven’t accidentally blocked Google entirely is very easy once you’re done. Let’s get started, shall we?
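For context, the doomsday scenario that warning describes is a wildcard disallow. These two lines tell every crawler, Googlebot included, to stay away from your entire site, so this is the one thing you don’t want to paste in:

User-agent: *
Disallow: /

As long as your rules name specific AI user agents instead of *, search engines won’t be affected.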
Editing your robots.txt file
Shopify’s help says the following:
You can use Liquid to add or remove directives from the robots.txt.liquid template. This method preserves Shopify’s ability to keep the file updated automatically in the future, and is recommended. For a full guide on editing this file, refer to Shopify’s Developer page Customize robots.txt.liquid.
This is a great idea, so we’re going to use this method; that way, we won’t interfere with anything Shopify adds or removes automatically.
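For reference, the default robots.txt.liquid template looks roughly like this (check Shopify’s Customize robots.txt.liquid page for the current version, as yours may differ slightly). It loops over Shopify’s default groups and prints each group’s user agent, rules, and sitemap:

{% for group in robots.default_groups %}
  {{- group.user_agent }}
  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}
  {%- if group.sitemap != blank %}
    {{ group.sitemap }}
  {%- endif %}
{% endfor %}

Everything we add will go after that closing {% endfor %}.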
These steps are easy and are detailed in Shopify’s help document:
- From your Shopify admin, click Settings > Apps and sales channels.
- From the Apps and sales channels page, click Online store.
- Click Open sales channel.
- Click Themes.
- Click the … button, and then click Edit Code.
- Click Add a new template, and then select robots.
- Click Create template.
- Make the changes that you want to make to the default template.
Here’s where you might get hung up: what changes should you make, and how do you make sure everything still works properly afterward?
It’s very easy, actually: you add a custom rule. In robots.txt.liquid, manually enter the rule you want, placing it after the Liquid default rules at the very end of the file. For instance, this rule blocks GPTBot (OpenAI’s crawler, which feeds ChatGPT) from all pages on your site:
User-agent: GPTBot
Disallow: /
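In context, the tail end of your robots.txt.liquid would look something like this, with the custom rule sitting after the {% endfor %} that closes Shopify’s default loop:

{% endfor %}

User-agent: GPTBot
Disallow: /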
To block all of the major AI crawlers known at the time of writing, it’s going to look more like this:
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: GoogleOther
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Omgilibot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
Go ahead and paste that in, then save it.
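Alternatively, if you’d rather keep things tidy, Liquid can generate those rules for you. This sketch (the ai_bots variable name is my own invention) renders the same output as the block above, and adding a new crawler later is just a matter of appending its name to the comma-separated list:

{% comment %} Block each listed AI crawler from the whole site {% endcomment %}
{% assign ai_bots = "GPTBot,ChatGPT-User,FacebookBot,GoogleOther,Google-Extended,Omgilibot,CCBot,anthropic-ai" | split: "," %}
{% for bot in ai_bots %}
User-agent: {{ bot }}
Disallow: /
{% endfor %}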
Testing your robots.txt
It’s always a great idea to check that you did it right! There are multiple robots.txt checkers out there you can use, like this one, and they’re very easy to use. Grab your site’s URL and paste it into the URL field. Click the user-agent dropdown and select the crawler you’d like to pretend to be (start with one you blocked, like GPTBot, and then test Googlebot to be sure you’re disallowing one and allowing the other). Then click “live”, and click the red TEST button. Your results show in the lower right corner: a green “allowed” if the bot can visit your site, or a red “disallowed” (with the matching rule highlighted) if it can’t.
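You can also eyeball the rendered file yourself. Shopify serves the output of robots.txt.liquid at your store’s /robots.txt path, so open it in a browser (swapping your own domain in for the placeholder below) and confirm your new rules appear after Shopify’s defaults:

https://your-store.com/robots.txt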
Messed up somewhere along the way? Just go back and delete everything you added after the Liquid defaults, and the file reverts to the way it was.
That’s it. Simple, right?
If you have questions, let me know in the comments below!
