sitemap.xml best practices
Sitemap rules every site owner should follow — plus a free verifier.
Updated 5/4/2026
What is a sitemap?
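A sitemap.xml is an XML file that lists the URLs you want crawlers to discover. Major search engines such as Google and Bing support the format, and other crawlers, including AI crawlers, may use it as a starting point. A sitemap is a hint, not a guarantee: listing a URL does not ensure that it gets crawled or indexed.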
Format and limits
- Maximum 50,000 URLs per file
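- Maximum 50 MB (52,428,800 bytes) uncompressed per file
- For larger sites, use a sitemap index that points to multiple sitemap files (an index can itself list up to 50,000 sitemaps)
- URLs must be absolute, including the protocol, and shorter than 2,048 characters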
- Avoid duplicates — they waste crawl budget
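The limits above can be sketched in code. The following is a minimal illustration, not a definitive generator: the function names (split_urls, build_sitemap, build_index) are hypothetical, and it assumes the URLs are already absolute and that each sitemap file will be published at a URL you supply.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000
MAX_BYTES_PER_FILE = 50 * 1024 * 1024  # 50 MB, uncompressed


def split_urls(urls, chunk_size=MAX_URLS_PER_FILE):
    """Drop duplicates (keeping first-seen order) and split into chunks."""
    unique = list(dict.fromkeys(urls))
    return [unique[i:i + chunk_size] for i in range(0, len(unique), chunk_size)]


def build_sitemap(urls):
    """Serialize one <urlset> document and enforce the per-file limits."""
    if len(urls) > MAX_URLS_PER_FILE:
        raise ValueError("more than 50,000 URLs in one sitemap file")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    data = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    if len(data) > MAX_BYTES_PER_FILE:
        raise ValueError("sitemap file is larger than 50 MB uncompressed")
    return data


def build_index(sitemap_urls):
    """Serialize a <sitemapindex> that points to each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

For a site with 120,000 unique URLs, split_urls would return three chunks; each chunk goes through build_sitemap, and the resulting file URLs go into build_index.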
Use lastmod properly
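The lastmod element tells crawlers when a page last changed, written as a W3C Datetime value such as 2026-05-04 or 2026-05-04T09:30:00+00:00. Setting it accurately helps bots prioritize fresh content. Only update it when the content actually changes; bumping it on every deploy makes it useless, because crawlers can stop trusting a date that is always new.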
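One way to keep lastmod honest is to tie it to a fingerprint of the page content rather than to the deploy time. The sketch below assumes a hypothetical per-page record (a dict with "hash" and "lastmod" keys) that you store between builds; the function names are illustrative, not part of any library.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def content_fingerprint(body: str) -> str:
    """Hash the rendered page content so changes can be detected."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def update_lastmod(record: dict, body: str, now: Optional[datetime] = None) -> dict:
    """Return the record to store for this page.

    The lastmod value only moves forward when the content hash differs
    from the one stored in the previous record. `now` should be a
    timezone-aware datetime; it defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    digest = content_fingerprint(body)
    if record.get("hash") == digest:
        return record  # content unchanged: keep the old lastmod
    stamp = now.astimezone(timezone.utc).isoformat(timespec="seconds")
    return {"hash": digest, "lastmod": stamp}
```

With this approach a redeploy that produces identical content leaves lastmod untouched. What counts as "content" is a judgment call: hashing the main body rather than the full HTML avoids false changes from rotating footers or build IDs.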
Verify your sitemap
Our Sitemap Verifier fetches your sitemap, parses it, and reports common issues: malformed URLs, duplicates, missing lastmod, oversize files, and more. Free, no signup.
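To make those checks concrete, here is a rough sketch of how such checks could be written. It is an illustration under stated assumptions, not the Sitemap Verifier's actual implementation: check_sitemap is a hypothetical function, it takes the already-downloaded file as bytes, and it only handles urlset files, not sitemap indexes.

```python
from urllib.parse import urlsplit
from xml.etree import ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024
MAX_URL_LENGTH = 2_048


def is_absolute_http_url(loc):
    """True if loc has an http(s) scheme and a host."""
    try:
        parts = urlsplit(loc)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_sitemap(xml_bytes):
    """Return a list of human-readable issues; an empty list means no issues found."""
    issues = []
    if len(xml_bytes) > MAX_BYTES:
        issues.append("file exceeds 50 MB uncompressed")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return issues + [f"not well-formed XML: {exc}"]

    entries = root.findall("sm:url", NS)
    if not entries:
        issues.append("no url entries found")
    if len(entries) > MAX_URLS:
        issues.append(f"{len(entries)} URLs; the limit is 50,000 per file")

    seen = set()
    for entry in entries:
        loc = (entry.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        if not is_absolute_http_url(loc):
            issues.append(f"{loc!r}: not an absolute http(s) URL")
        if len(loc) >= MAX_URL_LENGTH:
            issues.append(f"{loc!r}: {len(loc)} characters; must be under 2,048")
        if loc in seen:
            issues.append(f"{loc!r}: duplicate URL")
        seen.add(loc)
        if entry.find("sm:lastmod", NS) is None:
            issues.append(f"{loc!r}: missing lastmod")
    return issues
```

To run it against a live site you could fetch the file first, for example with urllib.request.urlopen from the standard library, and pass the response body to check_sitemap.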