Free Robots.txt Checker
Test Your Robots.txt Rules & Find Blocked Pages

Validate your robots.txt file, test URL access rules, and discover pages accidentally blocked from search engines. A single bad rule can tank your SEO — make sure yours is correct.

We'll fetch and analyze your robots.txt file for errors, warnings, and best practices.

This tool is coming soon. For now, try our Index Status Checker which reads your robots.txt as part of the sitemap discovery process.

What the Robots.txt Checker Validates

Parse & Validate Robots.txt

Fetch and parse your robots.txt file. Check for syntax errors, malformed directives, and conflicting rules that could confuse search engine crawlers.
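Until the checker is live, a rough version of this syntax check can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the function name lint_robots and the list of accepted directives are assumptions chosen for the example, and real crawlers tolerate more variations than this flags.

```python
# Directives this sketch accepts; anything else is reported as unknown.
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}

def lint_robots(text):
    """Return a list of (line_number, message) pairs for suspicious lines."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_DIRECTIVES:
            problems.append((number, f"unknown directive '{field}'"))
        elif field == "user-agent":
            seen_user_agent = True
        elif field in ("allow", "disallow") and not seen_user_agent:
            problems.append((number, f"'{field}' appears before any User-agent line"))
        elif field in ("allow", "disallow") and value and not value.startswith(("/", "*")):
            problems.append((number, "path should start with '/' or '*'"))
    return problems

sample = "User-agent: *\nDisalow: /private\nDisallow private\nDisallow: admin/\n"
for line_number, message in lint_robots(sample):
    print(f"line {line_number}: {message}")
```

The sample file contains a misspelled directive, a missing colon, and a path without a leading slash, so each of the three bad lines is reported.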

Test URL Access Rules

Enter any URL and instantly see whether it's allowed or blocked by your robots.txt rules. Test against different user-agents (Googlebot, Bingbot, etc.).
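You can approximate this kind of per-crawler test with Python's standard library. The sketch below assumes a small made-up robots.txt (ROBOTS_TXT) and a hypothetical helper named is_allowed. Note that urllib.robotparser applies the first matching rule and does not understand * or $ wildcards, so its answers can differ from Googlebot's on more complex files.

```python
from urllib.robotparser import RobotFileParser

# Example rules, invented for illustration.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow: /private/
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may crawl url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/drafts/post"))  # False
print(is_allowed(ROBOTS_TXT, "Bingbot", "https://example.com/drafts/post"))    # True
print(is_allowed(ROBOTS_TXT, "Bingbot", "https://example.com/private/page"))   # False
```

A crawler uses only the group that names it, which is why Googlebot is not affected by the /private/ rule in the * group here.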

Find Accidentally Blocked Pages

Discover important pages that are accidentally blocked from crawling. A single misplaced Disallow rule can prevent Google from crawling your best content.

Check Sitemap Declaration

Verify that your robots.txt includes a Sitemap directive with the full (absolute) URL of your XML sitemap. This helps search engines discover all your important pages.
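A quick way to see which sitemaps a robots.txt declares, sketched with the standard library (RobotFileParser.site_maps() requires Python 3.8 or later). The helper name declared_sitemaps and the example URL are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

def declared_sitemaps(robots_txt):
    """Return the Sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when nothing is declared

with_sitemap = "User-agent: *\nDisallow:\nSitemap: https://example.com/sitemap.xml\n"
without_sitemap = "User-agent: *\nDisallow: /private/\n"

print(declared_sitemaps(with_sitemap))     # ['https://example.com/sitemap.xml']
print(declared_sitemaps(without_sitemap))  # []
```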

Common Robots.txt Rules & Their Impact

Disallow: /

Blocks every URL on the site for the crawlers the rule applies to (all of them under User-agent: *). Accidentally shipping this on a live site is catastrophic for SEO.

Disallow: /wp-admin/

Correct: blocks crawling of admin pages. If your theme or plugins call /wp-admin/admin-ajax.php from the front end, add Allow: /wp-admin/admin-ajax.php so that endpoint stays crawlable.

Disallow: /*?

Blocks all URLs with query parameters. This can accidentally block faceted navigation, search results, and paginated content.

No Sitemap: directive

Without a Sitemap directive, search engines may not find your sitemap unless it's submitted via Search Console.

Disallow: /cdn-cgi/

Cloudflare-specific rule. Generally fine, but verify no important pages live under this path.

Disallow: /search

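Blocks internal search pages. Usually correct, since you don't want search result pages indexed. Keep in mind that rules are prefix matches, so this also blocks any other path that starts with /search, such as /search-tips.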
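To see how rules like these interact, here is a simplified matcher sketched in Python. It assumes the matching behavior described in the Robots Exclusion Protocol (RFC 9309) and in Google's documentation: * matches any run of characters, a trailing $ anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The names pattern_to_regex, path_allowed, and RULES are illustrative, and the sketch handles a single user-agent group only.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def path_allowed(rules, path):
    """rules: (directive, pattern) pairs for one group; path may include a query string."""
    best_length, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty value matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_length or (len(pattern) == best_length and is_allow):
                best_length, allowed = len(pattern), is_allow
    return allowed

RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
    ("Disallow", "/*?"),
    ("Disallow", "/search"),
]

for path in ["/wp-admin/options.php", "/wp-admin/admin-ajax.php",
             "/shoes?color=red", "/shoes", "/search-tips"]:
    print(path, "allowed" if path_allowed(RULES, path) else "blocked")
```

With these rules, /wp-admin/admin-ajax.php stays crawlable because the Allow pattern is longer than the Disallow pattern, while /shoes?color=red and /search-tips are both blocked.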

Robots.txt Best Practices

Don't block CSS and JavaScript files
Include a Sitemap directive
Use specific paths, not wildcards
Don't use robots.txt to hide sensitive pages
Test changes before deploying
Keep it simple and readable

Frequently Asked Questions

What is a robots.txt file?

A robots.txt file is a text file at the root of your website (e.g., example.com/robots.txt) that tells search engine crawlers which pages they can and cannot crawl. It's the first file crawlers check when they visit your site.

Can robots.txt prevent pages from being indexed?

Robots.txt blocks crawling, not indexing. If a page is blocked by robots.txt, Google can't crawl it, but if other pages link to it, Google might still index the URL without any content. To prevent indexing, use a noindex meta tag instead, and leave the page crawlable: Google has to fetch the page to see the tag.
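If you want to confirm that a page actually carries a noindex meta tag, a small standard-library check could look like the sketch below. The class and function names are assumptions for the example, and it only inspects meta tags named robots or googlebot in the HTML you pass in.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> style tags."""
    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            for token in attributes.get("content", "").split(","):
                self.directives.add(token.strip().lower())

def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))  # True
```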

What happens if I don't have a robots.txt file?

If there's no robots.txt file, search engines assume they can crawl everything on your site. This is fine for most sites, but adding a robots.txt file with a Sitemap directive helps search engines discover your sitemap faster.

How do I fix a robots.txt error?

Edit the robots.txt file on your server (usually in the root directory of your website). Common fixes: remove overly broad Disallow rules, add missing Allow directives for important resources, and include a Sitemap directive. Test changes before deploying.

Should I block CSS and JavaScript files in robots.txt?

No. Google needs to crawl your CSS and JavaScript to render your pages correctly. Blocking these resources can hurt your SEO because Google can't see how your page actually looks and functions.
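One way to spot blocked resources is to collect the script and stylesheet URLs a page references and test each one against your robots.txt. The sketch below uses only the standard library. AssetCollector and blocked_assets are hypothetical names, the sample rules and HTML are made up, and the same caveat applies as with any urllib.robotparser check: it does not handle wildcard rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

class AssetCollector(HTMLParser):
    """Collect script and stylesheet URLs referenced by a page."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("src"):
            self.assets.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "link" and "stylesheet" in (attributes.get("rel") or "").lower().split():
            if attributes.get("href"):
                self.assets.append(urljoin(self.base_url, attributes["href"]))

def blocked_assets(robots_txt, page_url, html, user_agent="Googlebot"):
    """Return same-site asset URLs that robots_txt blocks for user_agent.
    Assets on other hosts are governed by those hosts' own robots.txt files."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    collector = AssetCollector(page_url)
    collector.feed(html)
    return [url for url in collector.assets if not robots.can_fetch(user_agent, url)]

robots_txt = "User-agent: *\nDisallow: /assets/js/\n"
html = '<link rel="stylesheet" href="/assets/css/site.css"><script src="/assets/js/app.js"></script>'
print(blocked_assets(robots_txt, "https://example.com/", html))
# ['https://example.com/assets/js/app.js']
```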

Check your robots.txt and index status together

IndexLens reads your robots.txt during sitemap discovery and shows which pages are blocked from crawling. Combine with index checking for a complete SEO health picture. Free for 50 checks/day.

Try the Index Checker