robots.txt for AI crawlers — a practical guide

Allow or block ChatGPT, Claude, Perplexity, Gemini, and other AI bots from your site.

Updated 5/4/2026

What is robots.txt?

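robots.txt is a plain-text file at your site root that tells crawlers which paths they may and may not fetch. Rules come in groups: each group starts with one or more User-agent lines naming the crawlers it applies to, followed by Allow and Disallow lines that list path prefixes. Compliance is voluntary, so the file is a request to well-behaved crawlers, not an access control.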
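
As a minimal sketch of that structure, the snippet below parses a small robots.txt with Python's standard urllib.robotparser module and checks a few URLs against it. The file contents, the example.com URLs, and the /private/ path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for all crawlers, one for GPTBot only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Crawlers without their own group fall back to the "*" group.
googlebot_blog = parser.can_fetch("Googlebot", "https://example.com/blog/")
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/x")
# GPTBot has its own group, which disallows every path.
gptbot_blog = parser.can_fetch("GPTBot", "https://example.com/blog/")

print(googlebot_blog, googlebot_private, gptbot_blog)
```

Python's parser applies the first matching rule in a group, while some crawlers pick the longest matching path, so treat this as a quick sanity check rather than a guarantee of how a particular bot will behave.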

AI vs search crawlers

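AI platforms use their own user-agent tokens, separate from Googlebot or Bingbot. Most are crawlers in their own right; Google-Extended and Applebot-Extended are control tokens for AI training use that the existing Googlebot and Applebot crawlers honor, not separate bots. The big ones today: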

OAI-SearchBot — ChatGPT search
ChatGPT-User — ChatGPT browsing mode
GPTBot — OpenAI training data
PerplexityBot — Perplexity AI
ClaudeBot — Anthropic Claude
Google-Extended — Gemini training data
Applebot-Extended — Apple AI / Siri
Bytespider — ByteDance / TikTok
Meta-ExternalAgent — Meta AI

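If you want to appear in AI answers, you generally want these allowed, and at minimum the search and browsing agents (OAI-SearchBot, ChatGPT-User, PerplexityBot). The training tokens (GPTBot, Google-Extended) govern whether your content is used to train models, so you can decide on those separately.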
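
One way to see how a given file treats these bots is to test each token against it. The sketch below reuses the tokens from the list above; the robots.txt contents (opt out of two training tokens, allow everything else), the function name ai_bot_access, and the example.com URL are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the list above.
AI_BOTS = [
    "OAI-SearchBot", "ChatGPT-User", "GPTBot", "PerplexityBot", "ClaudeBot",
    "Google-Extended", "Applebot-Extended", "Bytespider", "Meta-ExternalAgent",
]

# Hypothetical policy: opt out of two training tokens, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def ai_bot_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each AI bot token to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}


access = ai_bot_access(ROBOTS_TXT, "https://example.com/pricing")
blocked = [bot for bot, allowed in access.items() if not allowed]
print("Blocked:", ", ".join(blocked))
```

Running a check like this before deploying a change makes it easier to confirm that a rule aimed at training bots has not also shut out the search agents.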

Common mistakes

Blocking everything by default. A blanket Disallow: / for AI bots also shuts out the search and browsing agents, which removes you from AI search results without any warning.
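Forgetting Google-Extended. This token controls whether Gemini can train on your content. It is separate from Googlebot, so rules written for Googlebot do not cover it, and blocking it does not affect your Google Search listing.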
Putting robots.txt in the wrong place. It must live at the root: https://yoursite.com/robots.txt. Crawlers do not look anywhere else, so a file at /blog/robots.txt is ignored.
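
To make the location rule concrete, this sketch derives the one URL a crawler will check from any page URL on the site. The function name robots_url_for is hypothetical, and yoursite.com is the same placeholder domain used above.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


root_robots = robots_url_for("https://yoursite.com/blog/post?id=1")
print(root_robots)
```

Because the host is part of the result, a subdomain such as a hypothetical shop.yoursite.com resolves to its own robots.txt and needs its own file.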

Use the free generator

Our robots.txt Generator gives you per-bot toggles for every major AI and search crawler, lets you append a sitemap URL and custom rules, and outputs a valid file.
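
To give a feel for what such a generator does, here is a rough sketch. It is not the tool's actual implementation: the function name build_robots_txt, its parameters, and the example URLs are all assumptions.

```python
from typing import Iterable, Optional


def build_robots_txt(
    toggles: dict[str, bool],
    sitemap_url: Optional[str] = None,
    custom_rules: Iterable[str] = (),
) -> str:
    """Build a robots.txt string from per-bot allow/block toggles."""
    lines: list[str] = []
    for bot, allowed in toggles.items():
        # One group per bot: allow or disallow the whole site.
        lines.append(f"User-agent: {bot}")
        lines.append("Allow: /" if allowed else "Disallow: /")
        lines.append("")  # blank line separates groups
    custom = list(custom_rules)
    if custom:
        lines.extend(custom)
        lines.append("")
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
    # Trim trailing blank lines and end with exactly one newline.
    return "\n".join(lines).rstrip("\n") + "\n"


robots_txt = build_robots_txt(
    {"GPTBot": False, "OAI-SearchBot": True},
    sitemap_url="https://example.com/sitemap.xml",
)
print(robots_txt)
```

Each toggle becomes its own User-agent group, which keeps the output easy to read and lets you change one bot without touching the others.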
