
Basic HTML concepts

Links and query string

The XmlSitemapGenerator follows all standard links, such as the example below.

<a href="/mylink/page.html">this is a link</a>

You can tell the spider to follow only links with a certain file extension, such as .htm, although normally you would want to include all extensions.

<a href="/mylink/page.html">this is a link</a>

Some URLs, such as the one below, have no file extension. Again, normally you would want the spider to follow all URLs.

<a href="/help/">this is a link</a>
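The extension filter described above can be pictured with a short Python sketch. This is only an illustration of the idea, not the XmlSitemapGenerator's own code: the function name has_allowed_extension and the allowed parameter are hypothetical, and the choice to represent URLs without an extension by the empty string is an assumption.

```python
from posixpath import splitext
from urllib.parse import urlsplit


def has_allowed_extension(url, allowed=None):
    """Return True if a spider using this filter would follow `url`.

    `allowed` is a hypothetical set of lowercase extensions such as
    {".htm", ".html"}. None means "follow every extension", which is
    the setting you would normally want.
    """
    if allowed is None:
        return True
    # Look only at the path, so a query string does not hide the extension.
    path = urlsplit(url).path
    extension = splitext(path)[1].lower()
    # URLs such as /help/ have no extension; they match only if "" is allowed.
    return extension in allowed
```

With this sketch, has_allowed_extension("/help/", {".htm"}) is False, while adding "" to the set lets extensionless URLs through. That is one reason a restrictive extension list can silently skip parts of a site.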

It will also follow links that include a query string, such as

<a href="/mylink/page.html?pageid=121">this is a link</a>

Some websites include a unique ID or session ID for each user in the query string. If your website does this, make sure the parameter is added to the ignore list; otherwise the same page can be spidered repeatedly under different URLs. We have added some of the common ones to the default values.

<a href="/mylink/page.html?SessionId=8765434567&pageid=121">this is a link</a>
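To show what ignoring a query string parameter amounts to, here is a minimal Python sketch. It is an assumption about the general technique, not the tool's actual implementation: the function name strip_ignored_params and the contents of IGNORED_PARAMS are hypothetical and are not the tool's real default values.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical ignore list; the tool's real defaults are not shown here.
IGNORED_PARAMS = {"sessionid", "sid", "phpsessid"}


def strip_ignored_params(url, ignored=IGNORED_PARAMS):
    """Return `url` with every ignored query string parameter removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        # Compare case-insensitively so SessionId and sessionid both match.
        if name.lower() not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Applied to the link above, the sketch reduces /mylink/page.html?SessionId=8765434567&pageid=121 to /mylink/page.html?pageid=121, so every visitor's copy of the page maps to one URL.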

The same rules are applied to all links and urls no matter where they are, for example in image maps and framesets.

Image maps

When you add an image to a webpage you can define clickable hotspots on it using an image map. The spider will follow these links by default.

<img src="planets.gif" width="145" height="126" alt="Planets" usemap="#planetmap">
<map name="planetmap">
<area shape="rect" coords="0,0,82,126" href="sun.htm" alt="Sun">
<area shape="circle" coords="90,58,3" href="mercury.htm" alt="Mercury">
</map>


Framesets and iFrames

Framesets allow you to bring together a number of separate pages displayed as one.

<frameset cols="25%,*">
<frame src="frame_a.htm">
<frame src="frame_b.htm">
</frameset>

A similar concept is the iFrame that allows you to embed another page within a page.

<iframe src="/test/myframe.htm"></iframe>

As with other HTML elements all frame formats will be spidered by default.
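The elements covered so far (a, area, frame and iframe) all carry their target URL in either an href or a src attribute. The Python sketch below shows one way a spider might collect them using the standard library's html.parser. It is a hedged illustration only: the names LINK_ATTRS, LinkCollector and collect_links are hypothetical and do not describe how the XmlSitemapGenerator is built.

```python
from html.parser import HTMLParser

# Element name -> attribute holding the URL, for the elements described above.
LINK_ATTRS = {"a": "href", "area": "href", "frame": "src", "iframe": "src"}


class LinkCollector(HTMLParser):
    """Collect link targets from anchors, image maps, frames and iframes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        wanted = LINK_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def collect_links(html):
    """Return the URLs found in `html`, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Each collected URL would then go through the same extension and query string rules as any other link.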


Images

The XmlSitemapGenerator can optionally include images in your sitemap. You can include all images or select them based on whether or not their alt and/or title attribute is populated.

<img src="/images/text.gif" title="My title here" alt="My alt caption here" />

As well as filtering on the alt and title attributes, you can select images by image type / file extension in the same way that you can for URLs.

<img src="/images/text.gif" title="My title here" alt="My alt caption here" />
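The image selection rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's real behaviour: the class ImageCollector, the helper select_images and the options require_alt, require_title and extensions are all hypothetical names, and the sketch treats an attribute containing only whitespace as not populated.

```python
from html.parser import HTMLParser
from posixpath import splitext
from urllib.parse import urlsplit


class ImageCollector(HTMLParser):
    """Collect img src values that pass the (hypothetical) selection options."""

    def __init__(self, require_alt=False, require_title=False, extensions=None):
        super().__init__()
        self.require_alt = require_alt
        self.require_title = require_title
        self.extensions = extensions  # e.g. {".gif", ".png"}; None means any type
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            return
        # An attribute without a value arrives as None, so fall back to "".
        if self.require_alt and not (attributes.get("alt") or "").strip():
            return
        if self.require_title and not (attributes.get("title") or "").strip():
            return
        if self.extensions is not None:
            extension = splitext(urlsplit(src).path)[1].lower()
            if extension not in self.extensions:
                return
        self.images.append(src)


def select_images(html, **options):
    """Return the src of every image in `html` that passes the options."""
    parser = ImageCollector(**options)
    parser.feed(html)
    parser.close()
    return parser.images
```

Setting both require_alt and require_title models the "and" case; an "or" rule would need a small change to the two checks.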
