What Is A Sitemap and Why Is It Important?
A sitemap is a list of the indexable pages on a website. The most common type of sitemap that is typically referred to in SEO is formatted for search engines to help their web crawlers find all URLs on a domain. HTML sitemaps can also exist as pages on a site and are typically aimed at assisting human users by providing a listing of pages to access all from one location.
Sitemaps are among the most foundational elements of a healthy website, but unless you built one and uploaded it to your server yourself, you may not be sure whether you have one or, if so, where it’s located. While sitemaps aren’t required, they can help ensure that Google discovers all of your URLs. They are especially helpful if your site has a poor hierarchy or weak internal linking.
What Are The Sitemap Formats?
The standard for sitemaps is maintained at sitemaps.org. Google supports several sitemap formats, including:
- XML
- RSS, mRSS, and Atom 1.0
- Text
- Google Sites
Additionally, there are many extensions available that provide more information about the content in your sitemap file. These include special protocols for images, video, and news content.
In addition to URL sitemaps, you can also create a sitemap index file, which is essentially a sitemap of sitemaps. By linking to other sitemap files from an index, you can organize your URLs hierarchically by site section, or list more URLs than the limit of 50,000 per sitemap file that Google enforces.
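To make the index idea concrete, here is a minimal Python sketch that splits a long URL list into groups of at most 50,000 and builds a sitemap index pointing at one child sitemap per group. The page URLs and the child filenames (sitemap-1.xml and so on) are hypothetical, and the helper names are ours rather than part of any standard tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into groups no larger than `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (as a string) listing each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(root, encoding="unicode")


# Hypothetical data: 120,000 page URLs on an example domain.
pages = [f"https://www.websitedomain.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(pages)

# Hypothetical child sitemap filenames, one per chunk.
children = [
    f"https://www.websitedomain.com/sitemap-{i + 1}.xml"
    for i in range(len(chunks))
]
index_xml = build_sitemap_index(children)
```

With 120,000 URLs this yields three groups, so the index lists three child sitemaps. Each group would still need to be written out as its own sitemap file.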
Why Would You Want to Find a Sitemap?
Some of the common reasons you might want to find a rogue sitemap include:
- Find old sitemaps on your domain that may be outdated
- Get a list of all pages on a website
- Leverage for competitive analysis (see how competitors are structuring their sitemap indexes or site directories)
- Find your sitemap URL to submit to web crawlers (especially if your sitemap is auto-generated by a CMS)
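As an illustration of getting a list of all pages from a sitemap, the sketch below pulls every <loc> value out of sitemap XML using Python's standard library. The sample document is made up for the example, and list_sitemap_urls is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap (or sitemap index)."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]


# Made-up sample sitemap with two pages.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.websitedomain.com/</loc></url>
  <url><loc>https://www.websitedomain.com/about</loc></url>
</urlset>"""

urls = list_sitemap_urls(sample)
for url in urls:
    print(url)
```

If the file is a sitemap index, the values returned are the child sitemap URLs rather than page URLs, so each one would need to be fetched and parsed in turn.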
How to Find Your Sitemap
1. Check Common Locations
The sitemap XML file is typically located in the root directory of your domain (ex: https://www.websitedomain.com/sitemap.xml). However, the filename can be anything the webmaster chooses, and the file can live anywhere that is publicly accessible on the website’s domain. Sitemaps are sometimes placed in a sub-folder to hide them from competitors seeking an easy way to discover all of the URLs on the domain.
If this is your domain, you can access your website’s file directory through FTP to look for the sitemap XML file. If you don’t have direct access to your site’s files, you can try typing some common sitemap naming conventions into your browser to see if any active files come up, for example by appending /sitemap.xml or /sitemap_index.xml to your domain.
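Checking several locations by hand gets tedious, so a short script can do the guessing. The sketch below is one possible approach in Python: it builds candidate URLs from a list of paths and offers a helper that sends a HEAD request to each. The path list is our own assumption about common conventions, not an official or exhaustive list, and some servers reject HEAD requests, so a miss is not proof that no sitemap exists.

```python
import urllib.request
from urllib.parse import urljoin

# Assumed, non-exhaustive list of commonly used sitemap paths.
COMMON_SITEMAP_PATHS = [
    "/sitemap.xml",
    "/sitemap_index.xml",
    "/sitemap-index.xml",
    "/sitemap.xml.gz",
    "/sitemap/sitemap.xml",
]


def candidate_sitemap_urls(domain_root):
    """Combine a site root with each common sitemap path."""
    return [urljoin(domain_root, path) for path in COMMON_SITEMAP_PATHS]


def url_exists(url, timeout=10):
    """Return True if a HEAD request to `url` answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers HTTP errors, DNS failures, and timeouts
        return False


candidates = candidate_sitemap_urls("https://www.websitedomain.com")

# To probe a real site (this makes network requests):
# found = [url for url in candidates if url_exists(url)]
```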
2. Check Robots.txt
All sites should have a robots.txt file to provide directives to web crawlers and bots. This file typically includes a link to the sitemap so that search engines can locate it quickly and start crawling. The standard location for robots.txt is directly under the main site directory, i.e.: https://www.websitedomain.com/robots.txt
You can try this for any domain and if a sitemap is declared, you will see a line entry such as:
- Sitemap: https://www.websitedomain.com/sitemap.xml
Check out our guide if you are unfamiliar with how to read a robots.txt file.
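If you would rather not scan the file by eye, the Sitemap lines can be picked out with a few lines of Python. This is a minimal sketch that works on robots.txt content you have already downloaded; the sample content and the function name are hypothetical.

```python
def sitemaps_from_robots(robots_text):
    """Return the URLs declared on Sitemap: lines in robots.txt content."""
    sitemaps = []
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        # The field name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Made-up robots.txt content for illustration.
sample_robots = """User-agent: *
Disallow: /admin/

Sitemap: https://www.websitedomain.com/sitemap.xml
"""

found = sitemaps_from_robots(sample_robots)
print(found)
```

Python 3.8 and later also ship urllib.robotparser.RobotFileParser, whose site_maps() method returns the same information after the file has been read.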
3. Use Advanced Search Operators
There are a number of advanced search operators that can help refine a search in Google. If a domain has a sitemap that isn’t in the standard location or declared in robots.txt, this is your best bet for locating the sitemaps that Google has discovered and indexed.
There are two ways to search a domain for XML sitemaps, both using a site: search restricted to XML file types. Try typing one of these into Google to see if any results are returned:
site:websitedomain.com filetype:xml
site:websitedomain.com ext:xml
If this returns many pages of irrelevant files, you can narrow the search further by adding inurl:sitemap:
site:websitedomain.com filetype:xml inurl:sitemap
site:websitedomain.com ext:xml inurl:sitemap
This will look for XML files on your domain with the word “sitemap” included in the filename or sub-folder directory.
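If you check many domains, the queries can be generated rather than typed. The following Python sketch builds both query variants for a domain and turns a query into a Google search URL. The function names are hypothetical, and the search URL format is an assumption based on how Google search links commonly look.

```python
from urllib.parse import urlencode


def sitemap_search_queries(domain):
    """Build the filetype: and ext: variants of the sitemap search."""
    return [f"site:{domain} {operator}:xml inurl:sitemap"
            for operator in ("filetype", "ext")]


def google_search_url(query):
    """Turn a query string into a URL that can be opened in a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


queries = sitemap_search_queries("websitedomain.com")
search_urls = [google_search_url(query) for query in queries]
for search_url in search_urls:
    print(search_url)
```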
4. Use a Tool (ex: Google Search Console)
If you own the website in question (perhaps it’s a new client or you’re new to the team) and a Google Search Console property is already set up for the site, log in to see whether a sitemap has been declared that Google is already crawling.
Additionally, other services crawl the web and offer tools for checking domains for sitemaps. SEO Site Checkup has a sitemap tool that is very easy to use: just type in your domain and it will let you know whether it has found a sitemap file (note: a result is not guaranteed).
5. Check Your CMS
If you are using a common CMS, it may be generating a sitemap for you automatically. Check your CMS’s documentation for details on sitemaps. We’ve included information about some of the most common CMSs below:
How to Find your Sitemap on WordPress
On WordPress, sitemaps are often generated by an SEO plugin. Plugins are not installed by default, so this would not be relevant for new websites, only for existing sites or sites that were set up by a developer who may have installed plugins for additional functionality. To find which plugins are installed on your WordPress site, click the “Plugins” link in the left nav of the WordPress admin section. If “Yoast SEO” is an active plugin, you’ll have an “SEO” link in the left sidebar and you’ll find Sitemaps under the “XML Sitemaps” sublink.
Unless these plugins are specifically configured to place sitemaps in specific directories you will typically find them in the standard locations/filenames covered in tip #1.
In addition to plugins, you can also check any custom theme settings, “Tools”, and “Settings” and look for any XML or Sitemap settings. When in doubt, check your documentation.
How to Find your Sitemap on Squarespace, Shopify or Wix
All three of these platforms automatically generate an XML sitemap and place it on your domain as a sitemap.xml file. You should be able to find your sitemap by appending /sitemap.xml to the end of your domain in your browser.
While these services do not allow you to alter the sitemap files directly, the files do update automatically when new URLs are created, unless you explicitly specify at the page level that a URL should be excluded.
What Do I Do Next With My Sitemap?
Once you’ve identified or created your sitemap, you’ll want to verify that the file is valid. You can use a tool such as the validator from XML-Sitemaps.com. If you have a valid sitemap file, follow these steps for the most impact on your SEO:
- Make sure your sitemap is up-to-date and URLs are valid: In addition to a valid file format, your sitemap should only contain accurate URLs. Any old or incorrect sitemaps should be removed to avoid confusion for bots.
- Add a sitemap declaration to your robots.txt file (optional): This is a good idea for search engines other than Google and Bing. It is not required, though, so you can skip it if you are concerned about competitors obtaining valuable information through your sitemap.
- Submit your sitemap to Google through Google Search Console: GSC provides valuable insights into how Google is processing your sitemap file.
- Submit your sitemap to Bing through Bing Webmaster Tools: Bing also provides a tool for sitemap submission.
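Before submitting, a few basic checks can be scripted. The sketch below is a rough Python illustration, not a replacement for a full validator: it confirms the file is well-formed XML, that the root element is a urlset or sitemapindex in the sitemaps.org namespace, that it lists no more than 50,000 entries, and that each entry is an absolute URL. The function name and the exact set of checks are our own choices.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]

    problems = []
    allowed_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in allowed_roots:
        problems.append(f"unexpected root element: {root.tag}")

    locs = [loc.text.strip()
            for loc in root.iter(f"{{{SITEMAP_NS}}}loc") if loc.text]
    if len(locs) > MAX_URLS:
        problems.append(f"{len(locs)} entries exceeds the limit of {MAX_URLS}")
    for loc in locs:
        if not loc.startswith(("http://", "https://")):
            problems.append(f"not an absolute URL: {loc}")
    return problems


# Made-up sample sitemap that should pass the checks.
valid_sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.websitedomain.com/</loc></url>
</urlset>"""

print(check_sitemap(valid_sample))
```

This does not confirm that each URL actually resolves. Checking for old or broken URLs would require requesting every entry.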
What If I Can’t Find My Sitemap?
If you’ve tried all of these methods, it’s possible that you do not have a current sitemap. To create a sitemap, you can leverage a plugin on your CMS, create one manually, or use a crawler such as Screaming Frog to crawl your site and export a sitemap file for upload.
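For a small site, creating a sitemap manually can be as simple as the Python sketch below, which turns a list of page URLs into a minimal XML sitemap. The page list is hypothetical, and only the required <loc> element is written. Optional fields such as <lastmod> are left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) for the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical list of pages to include.
pages = [
    "https://www.websitedomain.com/",
    "https://www.websitedomain.com/contact",
]
sitemap_xml = build_sitemap(pages)

# To save the result for upload:
# with open("sitemap.xml", "w", encoding="utf-8") as handle:
#     handle.write(sitemap_xml)
```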
For more information on sitemaps, check out Google’s Webmaster Guidelines for Sitemaps.
Learn More: How to Create and Submit an XML Sitemap