Robots.txt Generator
Create a perfect robots.txt file in seconds. Visual builder with one-click presets for blocking AI bots, search engines, and custom crawlers. Everything runs locally in your browser.
What is robots.txt?
A robots.txt file is a plain text file placed at the root of your website that instructs search engine crawlers and bots which URLs they are allowed to access. It follows the Robots Exclusion Protocol, a standard respected by all major search engines including Google, Bing, Yahoo, and DuckDuckGo.
Every website should have a robots.txt file. Without one, crawlers assume they may fetch every URL they discover, which can waste your server's bandwidth and surface pages you never intended to appear in search results.
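A safe starting point is a permissive file that allows everything and advertises your sitemap. A minimal example (yourdomain.com is a placeholder):

```
User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
```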
How to Block AI Bots with robots.txt
With the rise of AI models like ChatGPT, Claude, and Gemini, many website owners want to prevent their content from being used for AI training. Here are the user-agents you need to block:
| Bot | Company | User-Agent | Purpose |
|---|---|---|---|
| GPTBot | OpenAI | GPTBot | Training data collection |
| ChatGPT Browser | OpenAI | ChatGPT-User | Live web browsing |
| ClaudeBot | Anthropic | ClaudeBot | Training data collection |
| Google-Extended | Google | Google-Extended | Gemini training |
| Common Crawl | Common Crawl | CCBot | Open dataset |
| ByteSpider | ByteDance/TikTok | Bytespider | Training data |
| Meta AI | Meta | FacebookBot | AI training |
Use our "Block AI Bots" preset above to generate the correct robots.txt rules instantly.
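If you prefer to write the file by hand, the preset's output is equivalent to rules like these, one block per user-agent from the table above:

```
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: FacebookBot
Disallow: /
```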
robots.txt Syntax Guide
The robots.txt format uses a simple syntax with these key directives:
- `User-agent: *` — Applies the rules that follow to all bots. Replace `*` with a specific bot name to target just that bot.
- `Disallow: /path/` — Blocks bots from accessing the specified path and its subpages.
- `Allow: /path/` — Explicitly allows access to a path (useful as an exception within a broader Disallow).
- `Sitemap: URL` — Points crawlers to your XML sitemap for better discovery.
- `Crawl-delay: N` — Requests that bots wait N seconds between requests (not supported by Googlebot).
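Here is a short file that combines these directives; the paths and sitemap URL are placeholders:

```
# Applies to every crawler
User-agent: *
Disallow: /private/
Allow: /private/public-page.html
Crawl-delay: 10

# Sitemap location (must be an absolute URL)
Sitemap: https://yourdomain.com/sitemap.xml
```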
Frequently Asked Questions
Does robots.txt actually stop crawlers?
Robots.txt is a voluntary protocol. Legitimate bots from Google, Bing, OpenAI, and Anthropic will respect it. Malicious scrapers and bots may ignore it. For truly sensitive content, use server-side access controls like authentication or IP-based blocking instead.
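For hard enforcement, you can reject the offending user-agents at the web server instead. A minimal sketch for nginx, placed inside a server block (the bot list here is just an example):

```
# Deny selected AI crawlers regardless of robots.txt
if ($http_user_agent ~* "(GPTBot|ClaudeBot|Bytespider)") {
    return 403;
}
```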
Where do I place my robots.txt file?
Upload it to the root directory of your website so it's accessible at https://yourdomain.com/robots.txt. It must be at this exact URL — files in subdirectories like /blog/robots.txt will be ignored by crawlers.
Can robots.txt hurt my SEO?
Yes, if misconfigured. Blocking Googlebot from important pages will prevent them from appearing in search results. Never disallow your homepage, key content pages, CSS files, or JavaScript that Google needs to render your pages. Always test with Google's robots.txt tester.
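The classic misconfiguration is a single overly broad rule. This file removes your entire site from every compliant crawler:

```
# Blocks ALL bots from ALL pages - almost never what you want
User-agent: *
Disallow: /
```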
What is the difference between robots.txt and meta robots tags?
Robots.txt controls crawler access — whether a bot can fetch a page at all. Meta robots tags (<meta name="robots" content="noindex">) control indexing — whether a fetched page appears in search results. For best results, use both together.
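For example, to keep a page out of search results while still letting bots fetch it, leave the page crawlable in robots.txt and add this tag to its head:

```
<meta name="robots" content="noindex, follow">
```

Note that if robots.txt disallows a page, crawlers never fetch it and therefore never see its noindex tag, so the URL can still appear in search results.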
Should I block /wp-admin/ in WordPress?
Yes. Add Disallow: /wp-admin/ but keep Allow: /wp-admin/admin-ajax.php since some themes and plugins need AJAX to function properly. Our WordPress preset handles this automatically.
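Written out, those WordPress rules look like this:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```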
How often do search engines check robots.txt?
Google typically caches your robots.txt for up to 24 hours. Bing may cache it longer. After making changes, you can request a re-crawl via Google Search Console. For urgent changes, use the robots.txt tester in Search Console to verify your new file.
Built your robots.txt? Now check your full AI readiness.
See how AI bots interact with your site — robots.txt rules, llms.txt, structured data, and more.
Check AI Readiness →