Free Robots.txt Tester & Validator

Test any website's robots.txt file to see which pages search engines can and cannot crawl. Debug your SEO by verifying your robots directives are working correctly.

Last updated: May 2026

Robots.txt Validator

Enter a domain name (without http://)

What We Test

  • All User-agent groups
  • Allow & Disallow rules
  • Sitemap declarations
  • Crawl-delay directives
  • Google longest-match precedence

Robots.txt Analysis

Want to improve your website's SEO?

Use UseClick to create short links that help drive traffic to your website. Track clicks, analyze performance, and optimize your marketing campaigns.

What Is a Robots.txt File?

A robots.txt file is a plain-text file placed at the root of your website that tells search engine crawlers which URLs they may and may not crawl. It's a crucial part of technical SEO because it controls what gets crawled, although it does not by itself keep a URL out of the index. Our tester lets you check any website's robots.txt to understand its crawling rules.

About the Robots.txt Tester

The Robots.txt Tester fetches a site's robots.txt and shows which paths each user-agent can and cannot crawl, so you can confirm your directives behave as intended.

1. Controls Crawl Budget

Search engines allocate a finite crawl budget to every domain. Disallowing low-value URLs (faceted navigation, internal search, duplicate parameters) frees Googlebot to spend that budget on the pages that actually rank and convert.

Up to 40% of crawl budget is wasted on duplicate URLs (Botify, 2024)

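2. Keeps Crawlers Out of Private Pages

Admin panels, staging environments, checkout funnels, and member-only areas should never appear in search results. robots.txt keeps compliant crawlers away from these paths, but it is not access control, and a blocked URL can still be indexed if other pages link to it. For pages that must stay out of results, use authentication, or a noindex tag on a page that crawlers are allowed to fetch.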

7% of sites unintentionally expose private URLs in search

3. Must Be at the Domain Root

robots.txt only works when served from the exact path /robots.txt at the root of the host. A file at /folder/robots.txt is completely ignored by crawlers. Each subdomain needs its own file, and HTTPS and HTTP versions are treated separately.

Must return HTTP 200 from /robots.txt at the root host
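
As a rough illustration of the root-path rule, the sketch below derives the robots.txt location that governs a given page. The function name robots_url and the example URLs are made up for this example; it assumes only that the scheme, host, and port are kept and everything else is discarded.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://shop.example.com/folder/page.html?x=1"))
# https://shop.example.com/robots.txt
print(robots_url("http://example.com/"))
# http://example.com/robots.txt
```

Note that the subdomain and the scheme both carry through, which is why each host and protocol variant ends up with its own file.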

Common robots.txt Mistakes

Real-world errors we see every week. Test your file with the robots.txt checker above to catch these before they hurt rankings.

Avoid These Critical Errors

Disallow: /
Under User-agent: *, blocks your entire site from every compliant crawler. Usually leaked from staging to production.
Blocking CSS and JS
Disallowing /assets/ or /static/ can prevent Google from rendering pages fully, which can hurt rankings.
Wildcard Misuse
Disallow: *.pdf is not a valid path pattern because it lacks the leading slash, and crawlers handle it inconsistently. Use Disallow: /*.pdf$ to block PDFs site-wide.
Wrong File Path
Placed at /robots/robots.txt instead of /robots.txt. Crawlers will never find it.
Blocking + Noindex Combo
Disallowing a page prevents Google from fetching it and seeing its noindex tag, so the URL can stay indexed.
Case-Sensitive Paths
Disallow: /Admin does not block /admin. Paths are case-sensitive.
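
A couple of the mistakes above can be caught mechanically. The following is a minimal sketch of such a check, not how this tester is implemented; lint_robots and its warning messages are hypothetical, and it only covers the site-wide block and the missing leading slash.

```python
def lint_robots(text: str) -> list[str]:
    """Flag two common robots.txt mistakes (illustrative, not exhaustive)."""
    warnings = []
    agents, in_rules = [], False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if not sep:
            continue
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:  # a User-agent after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                warnings.append(f"line {number}: blocks the whole site for all crawlers")
            if value and not value.startswith("/"):
                warnings.append(f"line {number}: pattern should start with '/'")
    return warnings


sample = "User-agent: *\nDisallow: /\nDisallow: *.pdf\n"
for warning in lint_robots(sample):
    print(warning)
```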

robots.txt Syntax Reference

Every directive you need, with working examples you can copy into your own robots.txt today.

1. User-agent

Declares which crawler the following rules apply to. Use * for all bots, or name a specific product token like Googlebot, Bingbot, GPTBot, or ClaudeBot. Multiple User-agent lines can stack to share one rule block.

User-agent: *
User-agent: Googlebot
Disallow: /private/
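
To make the stacking behavior concrete, here is a small sketch of how a parser might collect User-agent lines into groups. parse_groups is a hypothetical helper written for this page, and it deliberately ignores everything except User-agent, Allow, and Disallow.

```python
def parse_groups(text: str) -> list[tuple[list[str], list[tuple[str, str]]]]:
    """Split robots.txt text into (user_agents, rules) groups (simplified sketch)."""
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # rules already seen, so this line opens a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value)  # consecutive User-agent lines share one group
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


sample = """\
User-agent: *
User-agent: Googlebot
Disallow: /private/

User-agent: Bingbot
Disallow: /tmp/
"""
for agents, rules in parse_groups(sample):
    print(agents, rules)
```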

2. Disallow

Tells the matched crawler not to fetch any URL beginning with the given path. An empty Disallow means allow everything. Use $ to anchor the end of the URL and * for wildcards.

Disallow: /admin/
Disallow: /*.pdf$
Disallow: /search?
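
The prefix, * and $ behavior described above can be approximated by translating a rule into a regular expression. This is a sketch under that assumption rather than any crawler's actual matcher; pattern_to_regex and rule_matches are names invented here, and callers would need to skip empty patterns, since an empty Disallow blocks nothing.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex (sketch)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let every * stand for any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, which gives the "URL beginning with" behavior.
    return pattern_to_regex(pattern).match(path) is not None


print(rule_matches("/admin/", "/admin/users"))                 # True
print(rule_matches("/*.pdf$", "/docs/report.pdf"))             # True
print(rule_matches("/*.pdf$", "/docs/report.pdf?download=1"))  # False
print(rule_matches("/search?", "/search?q=shoes"))             # True
```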

3. Allow

Overrides a broader Disallow rule for specific paths. The most specific (longest) pattern wins, and when Allow and Disallow tie in length, Allow takes precedence per Google’s spec.

User-agent: *
Disallow: /private/
Allow: /private/public-page.html
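
One way to read the precedence rule above is as a small function: among the rules that match a path, the longest pattern decides, and Allow wins a tie. The sketch below assumes rules are given as (kind, pattern) pairs; is_allowed and _matches are hypothetical names, and real crawlers may differ in edge cases such as percent-encoding.

```python
import re


def _matches(pattern: str, path: str) -> bool:
    """Prefix match with * wildcards and an optional trailing $ anchor."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(p) for p in core.split("*")) + ("$" if anchored else "")
    return re.match(regex, path) is not None


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply longest-match precedence; Allow wins when lengths tie."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern or not _matches(pattern, path):
            continue  # an empty pattern matches nothing
        if len(pattern) > best_len or (len(pattern) == best_len and kind == "allow"):
            best_len, allowed = len(pattern), kind == "allow"
    return allowed


rules = [("disallow", "/private/"), ("allow", "/private/public-page.html")]
print(is_allowed(rules, "/private/public-page.html"))  # True
print(is_allowed(rules, "/private/secret.html"))       # False
print(is_allowed(rules, "/index.html"))                # True
```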

4. Sitemap

Points crawlers to your XML sitemap. Sitemap is a top-level directive (not bound to any User-agent group) and you can list multiple sitemaps. Always use the absolute URL.

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-images.xml
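
Because Sitemap lines sit outside any User-agent group, they can be collected with a single pass over the file. The helper below, sitemap_urls, is a hypothetical sketch for illustration; it treats the field name as case-insensitive and does not validate the URLs.

```python
def sitemap_urls(text: str) -> list[str]:
    """Collect every Sitemap URL declared anywhere in a robots.txt file (sketch)."""
    urls = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Split on the first colon only, so the URL's own "https:" stays intact.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


sample = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-images.xml
"""
print(sitemap_urls(sample))
# ['https://example.com/sitemap.xml', 'https://example.com/sitemap-images.xml']
```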

5. Crawl-delay

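Requests a minimum number of seconds between successive crawler requests. Bing respects this directive; Google ignores it, and support among other crawlers varies, so check each crawler's documentation. Google also retired the Search Console crawl rate setting in January 2024, so it is no longer an alternative.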

User-agent: Bingbot
Crawl-delay: 10
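
For a crawler you write yourself, Python's standard library can read this value. The snippet below is a minimal sketch using urllib.robotparser with the example file above supplied inline; in practice the text would come from fetching /robots.txt.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay returns the delay for the matching group, or None if none applies.
print(parser.crawl_delay("Bingbot"))    # 10
print(parser.crawl_delay("Googlebot"))  # None
```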

6. # Comments

Any text after a # on a line is treated as a comment and ignored by crawlers. Use comments to document why a rule exists so future maintainers do not delete it accidentally.

# Block AI crawlers from training pages
User-agent: GPTBot
Disallow: / # full site block

Frequently Asked Questions

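What is the Robots.txt Tester?

The Robots.txt Tester is a free tool that fetches a site's robots.txt file and shows which pages search engines can and cannot crawl.

Is the Robots.txt Tester free?

Yes, this tool is completely free with no signup required.

How do I use it?

Enter a domain name (without http://) and the tool will analyze its robots.txt and show you the results.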

Track Your Links with UseClick

Create a free UseClick account to shorten, brand, and track every link you share.

Crawler-Friendly

Short links honor your site's robots and SEO rules

Branded Domains

Use your domain instead of bit.ly

Full API Access

Automate links on every plan, including free

Get Started Free
No credit card required

Ready to track smarter?

UseClick.io makes link management effortless. Create branded short links that are clean, memorable, and built to strengthen your brand identity.