Saving sitemap configuration and advanced settings are available to our contributors.
This is where we'll start to spider your website. Make sure you choose the correct version of your URL, for example with or without "www" and HTTP vs HTTPS. This ensures your sitemap matches your canonical URLs and avoids confusing search engines.
Sitemaps can include a modified date. You can set a fixed date or use the server's file modified date. No date or the server's modified date is usually the best option. Manually setting a date can confuse search engines about when your pages were actually updated.
While you can include a change frequency, unless a single value accurately represents all of your site's pages this is best left as "None" to avoid confusing search engines, especially if the update frequency of pages varies considerably.
This setting indicates the relative priority of pages within your sitemap. Because it is a general setting that applies to every page in your sitemap, it often makes most sense to set it to "None".
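To illustrate what leaving these settings at "None" means in the output, here is a minimal sketch of building a single sitemap entry in which the modified date, change frequency and priority are simply omitted when unset. The function name `url_entry` and the example URL are hypothetical, and this is not necessarily how the generator itself builds its files.

```python
import xml.etree.ElementTree as ET

def url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one sitemap <url> element; fields left as None are omitted."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    optional = (("lastmod", lastmod), ("changefreq", changefreq), ("priority", priority))
    for tag, value in optional:
        if value is not None:
            ET.SubElement(url, tag).text = str(value)
    return url

# Only the location and a modified date are set; the other tags are left out.
entry = url_entry("https://www.example.com/", lastmod="2024-01-15")
print(ET.tostring(entry, encoding="unicode"))
```

In the sitemap protocol only the location is required, so an entry with no date, frequency or priority is still valid.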
RSS sitemaps include the page title. You can choose which HTML page element we take the title from.
We will process all link extensions and attempt to detect the content type. If we detect text/html we will include the link. We recommend you leave this setting checked in most cases; otherwise you will have to specify extensions manually.
Some pages don't have a file extension. We will attempt to detect the content type and include them if they are text/html. We recommend you leave this setting checked in most cases.
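As a rough sketch of content-type detection, the check below decides whether a response is HTML from its Content-Type header rather than from the file extension. How the spider actually detects content types is not described here, so the header-based approach and the name `is_html` are assumptions for illustration.

```python
def is_html(content_type_header):
    """Return True if a Content-Type header value denotes text/html."""
    if not content_type_header:
        return False
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == "text/html"

print(is_html("text/html; charset=utf-8"))
print(is_html("application/pdf"))
```

A check like this works for pages with no extension at all, which is why leaving the option checked usually saves you from listing extensions by hand.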
This allows you to configure the extensions you want to include, including non-HTML links such as PDF and Word documents. Note that if you do not check the "Include all file extensions" option you should list all your HTML extensions here too.
Sometimes you may only want to follow the URL without any query string (the part after "?"). You can uncheck this option to list just the page without the query string.
If you are following query strings, you may want to exclude certain parameters, such as those used for session tracking. You can list these here.
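The two query string options can be pictured with a small sketch using Python's standard `urllib.parse`. The function `clean_url`, its parameter names and the `sessionid` parameter are hypothetical and only illustrate the idea; the sketch also drops any "#fragment", which is an extra assumption.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def clean_url(url, follow_query=True, excluded_params=()):
    """Strip the whole query string, or only the excluded parameters."""
    parts = urlsplit(url)
    if follow_query:
        pairs = parse_qsl(parts.query, keep_blank_values=True)
        query = urlencode([(k, v) for k, v in pairs if k not in excluded_params])
    else:
        query = ""
    # The fragment is dropped here as well (an assumption of this sketch).
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

print(clean_url("https://example.com/page?id=7&sessionid=abc", excluded_params={"sessionid"}))
print(clean_url("https://example.com/page?id=7", follow_query=False))
```

Excluding session parameters this way keeps the same page from being listed many times under different tracking values.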
We can check your canonical URL and redirect accordingly. However, you should take care that your start URL is correct and that all your internal links either use the same host and scheme or are relative, to avoid URLs being dropped.
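One way to see why mismatched links can be dropped is a check that compares each link's scheme and host with the start URL. The sketch below assumes a strict comparison and uses the hypothetical name `same_site`; the spider's real rules may differ.

```python
from urllib.parse import urljoin, urlsplit

def same_site(start_url, link):
    """True if link resolves to the same scheme and host as start_url."""
    start = urlsplit(start_url)
    # Relative links are resolved against the start URL first.
    target = urlsplit(urljoin(start_url, link))
    return (target.scheme, target.netloc.lower()) == (start.scheme, start.netloc.lower())

start = "https://www.example.com/"
print(same_site(start, "/about"))                        # relative link
print(same_site(start, "http://www.example.com/about"))  # different scheme
print(same_site(start, "https://example.com/about"))     # different host
```

Relative links always pass such a check, which is why they are the safest choice for internal navigation.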
If you use the noindex meta tag, the spider can obey it and ignore pages where it is set.
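For illustration, a page opts out with a tag such as `<meta name="robots" content="noindex">`. The sketch below detects that tag with Python's standard `html.parser`; the class and function names are hypothetical, and it only looks at the generic "robots" meta name, not crawler-specific names or HTTP headers.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Record whether a <meta name="robots"> tag contains the noindex directive."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        directives = [d.strip() for d in content.split(",")]
        if name == "robots" and "noindex" in directives:
            self.noindex = True

def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex

print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))
print(has_noindex('<head><meta name="robots" content="index, follow"></head>'))
```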
You can filter out pages using regular expressions. You should enter one regular expression per line.
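A minimal sketch of a one-pattern-per-line filter follows. It assumes a pattern may match anywhere in the URL (search semantics), which may not be how the service applies your expressions; the function names and sample patterns are hypothetical.

```python
import re

def compile_patterns(text):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]

def is_excluded(url, patterns):
    """True if any pattern matches somewhere in the URL."""
    return any(p.search(url) for p in patterns)

rules = r"""
/private/
\.pdf$
"""
patterns = compile_patterns(rules)
print(is_excluded("https://example.com/private/a.html", patterns))
print(is_excluded("https://example.com/index.html", patterns))
```

Remember to escape characters such as "." and "?" when you mean them literally, since they have special meaning in regular expressions.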
You can choose to include images based on a number of different criteria.
You can specify the types of images to include by using their filename extensions.
You can filter out images using regular expressions. You should enter one regular expression per line.
You can control the crawl rate of our spider and how many pages it will spider concurrently. Use this setting carefully, as it could overload your web server or look like a denial-of-service attack and get blocked.
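The relationship between concurrency and crawl rate can be sketched with a small thread pool in which each worker pauses after every request. The names `crawl`, `max_workers` and `delay_seconds` are assumptions for illustration, and `fetch` stands in for whatever function actually downloads a page.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def crawl(urls, fetch, max_workers=2, delay_seconds=0.5):
    """Fetch urls with at most max_workers requests in flight at once."""
    def polite_fetch(url):
        result = fetch(url)
        time.sleep(delay_seconds)  # each worker rests before taking its next URL
        return result

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(polite_fetch, urls))

# A stand-in fetch function so the sketch runs without network access.
pages = crawl(["/a", "/b", "/c"], fetch=lambda u: "fetched " + u, delay_seconds=0.1)
print(pages)
```

Raising the worker count or lowering the delay increases the load on the server roughly in proportion, which is why small values are the safer starting point.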