How to Block PerplexityBot in robots.txt
Perplexity AI has become one of the fastest-growing AI search engines, and its crawler, PerplexityBot, has shown up in almost every access log by now. If you want your content out of Perplexity's answer engine, here's exactly how to do it, what the two Perplexity user-agents mean, and the one trade-off most guides skip.
The Two Perplexity User-Agents
Perplexity runs two crawlers, and the difference matters. Blocking only one of them is a common mistake.
- PerplexityBot (user-agent: PerplexityBot) — The traditional crawler. It discovers and indexes pages so Perplexity can cite them in answers. Blocking this stops your content from entering Perplexity's index.
- Perplexity-User (user-agent: Perplexity-User) — Fetches pages in real time when a user's question requires fresh content or when a user clicks a citation. Blocking this stops Perplexity from loading your page during a live query.
If you only block PerplexityBot, Perplexity can still pull your page on demand through Perplexity-User. If you want a complete block, you need both.
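You can see why blocking only one agent is not enough by checking a rules file with Python's built-in robots.txt parser. This is a minimal sketch: the rules string is an inline example, not fetched from a live site, and example.com stands in for your domain.

```python
# Sketch: disallowing only PerplexityBot still leaves Perplexity-User
# free to fetch, because no rule group applies to it.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: PerplexityBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "https://example.com/article"
print(parser.can_fetch("PerplexityBot", url))    # False: blocked
print(parser.can_fetch("Perplexity-User", url))  # True: no rule applies
```

Adding a second `User-agent: Perplexity-User` group with `Disallow: /` flips the second check to blocked.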
The robots.txt Rules
Drop this into the robots.txt at the root of your domain.
Block PerplexityBot completely
User-agent: PerplexityBot
Disallow: /
Block both Perplexity crawlers
User-agent: PerplexityBot
Disallow: /
User-agent: Perplexity-User
Disallow: /
Block Perplexity from specific sections only
Keep your public docs in Perplexity's answers, but keep premium or internal content out:
User-agent: PerplexityBot
Disallow: /premium/
Disallow: /members/
Disallow: /drafts/
Allow: /docs/
Allow: /blog/
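Section-level rules can be sanity-checked the same way with the standard library's parser. The paths below mirror the example rules above; the specific URLs are illustrative.

```python
# Sketch: verify which paths the section-level rules block for
# PerplexityBot, using Python's built-in robots.txt parser.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: PerplexityBot
Disallow: /premium/
Disallow: /members/
Disallow: /drafts/
Allow: /docs/
Allow: /blog/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

for path in ("/premium/report", "/docs/setup", "/blog/post"):
    ok = parser.can_fetch("PerplexityBot", f"https://example.com{path}")
    print(path, "->", "allowed" if ok else "blocked")
```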
Complete robots.txt example
# Allow search engines
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Allow: /
# Block Perplexity (both user-agents)
User-agent: PerplexityBot
Disallow: /
User-agent: Perplexity-User
Disallow: /
# Default: allow everything else
User-agent: *
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml
Where the File Has to Live
robots.txt must be at the root of your domain: https://yourdomain.com/robots.txt. Not in a subfolder, not under a different name.
- Nginx/Apache — Drop the file into your web root (for example /var/www/html/).
- WordPress — Use Yoast SEO, Rank Math, or create robots.txt in the WordPress root directory directly.
- Next.js/Vercel — Place robots.txt in the public/ directory, or export it through app/robots.ts.
- Cloudflare Pages/Netlify — Include robots.txt in your build output directory (public/ or dist/).
- Shopify — Edit robots.txt.liquid in your theme. Shopify exposes it as an editable template.
Verify the Block Actually Works
- Open https://yourdomain.com/robots.txt in a browser and confirm the rules are visible.
- Run the AI Readiness Checker. It scans for PerplexityBot, Perplexity-User, GPTBot, ClaudeBot, and 8 other AI crawlers and tells you which are blocked.
- Check your access logs for requests matching PerplexityBot or Perplexity-User. After the block, you should see them hit /robots.txt and stop there.
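For the log check, a short script can flag requests from either Perplexity agent. The log lines below are fabricated samples in the common combined log format; real Perplexity user-agent strings differ in detail, but they contain these crawler tokens.

```python
import re

# Fabricated sample access-log lines (combined log format).
log_lines = [
    '1.2.3.4 - - [10/Jan/2026:12:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 310 "-" "ExampleUA (compatible; PerplexityBot/1.0)"',
    '1.2.3.4 - - [10/Jan/2026:12:00:05 +0000] "GET /article HTTP/1.1" 200 5120 "-" "ExampleUA (compatible; Perplexity-User/1.0)"',
    '5.6.7.8 - - [10/Jan/2026:12:01:00 +0000] "GET /article HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

# Capture the requested path and which Perplexity agent requested it.
pattern = re.compile(r'"GET (\S+)[^"]*".*?(PerplexityBot|Perplexity-User)')
for line in log_lines:
    match = pattern.search(line)
    if match:
        print(f"{match.group(2)} requested {match.group(1)}")
```

After a working block, the only paths you should see from these agents are /robots.txt fetches.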
The Trade-Off Most Guides Skip
Perplexity is one of the few AI tools that actively sends users to source pages. Every cited answer in Perplexity links back to the publisher, and in 2026 those clicks are among the only growing AI referral sources. Blocking PerplexityBot means:
- Your content will not appear as a cited source in Perplexity answers.
- You lose any future referral clicks Perplexity might have sent.
- Over time, as Perplexity usage grows, the opportunity cost grows with it.
For pages where the goal is reach and traffic, the right move is usually to keep Perplexity allowed. For pages where the goal is revenue protection (paywalled articles, premium research, proprietary datasets), blocking makes sense.
Rule of thumb: Block Perplexity from the content you monetize directly. Allow it on everything you use to drive audience.
Things People Get Wrong
Blocking doesn't retroactively remove you
If Perplexity has already indexed your content, the block takes effect only on future crawls. Old citations can persist until the next re-crawl. There is no removal API at the time of writing.
robots.txt is voluntary
Well-behaved crawlers respect it, but nothing technically enforces it. For guaranteed blocking, combine robots.txt with firewall-level rules against Perplexity's published crawler IP ranges or server-side user-agent filtering.
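Here is a minimal sketch of that server-side user-agent filtering, written as a WSGI app. The blocklist tokens and responses are illustrative; in production this check usually lives in your nginx/Apache config, CDN, or firewall rather than application code.

```python
# Tokens to reject, matched case-insensitively as substrings.
BLOCKED_TOKENS = ("perplexitybot", "perplexity-user")

def is_blocked_agent(user_agent: str) -> bool:
    """Return True if the user-agent contains a blocked crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_TOKENS)

def app(environ, start_response):
    # Reject blocked crawlers before any page rendering happens.
    if is_blocked_agent(environ.get("HTTP_USER_AGENT", "")):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world"]
```

Substring matching on the user-agent is easy to spoof, which is why pairing it with IP-range rules gives stronger guarantees.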
Different subdomains need their own robots.txt
A rule in yourdomain.com/robots.txt does not cover blog.yourdomain.com. Each subdomain needs its own file.
Case matters less than you think
Perplexity matches user-agent tokens case-insensitively, but the canonical strings are PerplexityBot and Perplexity-User. Use those exactly.
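Python's standard-library parser illustrates the usual robots.txt convention here: agent tokens match case-insensitively, so odd casings still hit the canonical rule. This demonstrates the general convention, not Perplexity's implementation specifically.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: PerplexityBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Both casings resolve to the same rule group and are blocked.
print(parser.can_fetch("perplexitybot", "https://example.com/x"))  # False
print(parser.can_fetch("PERPLEXITYBOT", "https://example.com/x"))  # False
```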
Frequently Asked Questions
What is PerplexityBot and what does it do?
PerplexityBot is the web crawler operated by Perplexity AI. It indexes web pages so they can appear as sources in Perplexity's AI-powered answer engine. When a user asks Perplexity a question, it cites pages it has crawled, and PerplexityBot is what collected them. Perplexity also runs a second user-agent called Perplexity-User for real-time fetches when a user clicks through an answer.
Does blocking PerplexityBot remove me from Perplexity answers?
Blocking PerplexityBot prevents future indexing, but content already crawled may still appear until Perplexity re-crawls. To also stop real-time fetching during user queries, you need to block Perplexity-User as well. Be aware that if you block both, you will likely lose any citation traffic from Perplexity, which has been one of the few growing referral sources for many publishers in 2026.
Does PerplexityBot actually obey robots.txt?
Perplexity states publicly that PerplexityBot respects robots.txt. There have been reports in mid-2024 alleging that requests from Perplexity ignored robots.txt in certain cases, which Perplexity attributed to third-party services. If you want certainty, combine a robots.txt block with server-side IP or user-agent filtering for Perplexity's published crawler IPs.