Overdue online update

We’ve been working on an update for some time to address bugs and performance issues reported since our last release. This led to some pretty big rabbit holes, major reworks of code and a re-architecting of how we use Microsoft Azure infrastructure. It has consumed many evenings and weekends, but finally there is light at the end of the tunnel.

Most of the changes in this release are technical, under-the-hood fixes that many users won’t notice directly, although the service is now running faster and more smoothly. That said, there may be teething issues as we migrate users on to the new release.

Key fixes and changes include:

  • Prioritization of sitemap jobs for contributors.
  • Contributors can configure more concurrent requests for faster processing.
  • Longer timeout periods for larger sitemaps.
  • “My sitemaps” page performance improved.
  • Fixed the transient download link issues.

Please note: the more concurrent requests you configure, the more pressure our spider will put on your website, so please use this setting carefully.
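To make that trade-off concrete, here is a minimal Python sketch of how a concurrency limit shapes the load a crawler puts on a site. It is not our spider code: `crawl`, `FakeSite` and `peak_load` are hypothetical names invented for this illustration, and the "website" is simulated so nothing touches the network.

```python
import asyncio


async def crawl(urls, max_concurrent, fetch):
    """Fetch every URL, never running more than max_concurrent fetches at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url):
        # Each request must hold a semaphore slot while it is in flight.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(*(fetch_one(url) for url in urls))


class FakeSite:
    """Simulated website that records the peak number of simultaneous requests."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def fetch(self, url):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # pretend the server takes time to respond
        self.in_flight -= 1
        return url, 200


def peak_load(page_count, max_concurrent):
    """Crawl a simulated site and report the highest load it experienced."""
    site = FakeSite()
    urls = [f"https://example.com/page/{i}" for i in range(page_count)]
    asyncio.run(crawl(urls, max_concurrent, site.fetch))
    return site.peak
```

In this simulation the peak load on the site equals the configured limit, so raising the limit from 2 to 8 means the site has to serve four times as many requests at the same moment.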

Technical stuff.

Most of the time went into key technical changes which aren’t worth writing up in huge detail but, for those interested:

  • Re-writing the spider HTTP stack, in particular moving to the new .NET SocketsHttpHandler to improve the performance of requests.
  • Changing from exclusively in-memory processing to streaming spider results to disk, which allows for more concurrent processing and larger sitemaps without running into memory issues.
  • Migrating the entire spider process to Azure Functions to allow spider sessions to run for longer without timing out and dropping pages.
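The move from in-memory processing to streaming results to disk can be sketched roughly as follows. This is an illustration in Python rather than our actual .NET code, and the function names and the JSON-lines file format are assumptions made for the example.

```python
import json


def write_results(results, path):
    """Write each crawl result to disk as one JSON line as soon as it is produced.

    `results` can be a generator, so only one result is held in memory at a time.
    Returns the number of results written.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for result in results:
            out.write(json.dumps(result) + "\n")
            count += 1
    return count


def read_results(path):
    """Lazily read results back, one line at a time, e.g. to build the sitemap."""
    with open(path, encoding="utf-8") as src:
        for line in src:
            yield json.loads(line)


def fake_crawl(page_count):
    """Hypothetical stand-in for the spider: yields one result per page."""
    for i in range(page_count):
        yield {"url": f"https://example.com/page/{i}", "status": 200}
```

Because both the writer and the reader work one record at a time, memory use stays roughly flat however many pages are crawled, which is the property that makes larger sitemaps and more concurrent sessions possible.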

These key changes have led to a more performant and stable spider engine, but there was quite a learning curve in some areas, which meant a lot of time invested in research and bug resolution.

Once we had stabilized the new setup, we were able to resolve a number of additional bugs and issues that people had reported but that had been transient or hard to reproduce because of the earlier stability problems.

One issue we have not been able to resolve is the performance of external link validation. We lost a lot of time to this one, and despite re-writing our HTTP stack we were unable to resolve it. After many hours we discovered that the problem relates to an issue (feature) of Microsoft Azure. Microsoft is throttling DNS requests, so no matter how performant our code is, their DNS server is a serious bottleneck.

We are looking at a number of strategies to resolve this, but it may take some time. Unfortunately, in the meantime we have disabled this feature to prevent it from impacting other areas of the service.
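For readers wondering what reducing DNS traffic can look like, one generic technique is to cache lookups in-process so that each hostname is resolved at most once per time window. The Python sketch below illustrates only that general idea; it is not necessarily one of the strategies under consideration, and the `DnsCache` class and its parameters are hypothetical.

```python
import socket
import time


class DnsCache:
    """Tiny in-process DNS cache: repeat lookups within the TTL skip the resolver."""

    def __init__(self, resolve=socket.gethostbyname, ttl_seconds=300.0,
                 clock=time.monotonic):
        self._resolve = resolve    # function mapping a hostname to an IP string
        self._ttl = ttl_seconds    # how long a cached answer is trusted
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # hostname -> (address, expiry time)

    def lookup(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: no DNS request is sent
        address = self._resolve(host)
        self._entries[host] = (address, now + self._ttl)
        return address
```

Link validation tends to hit the same handful of external hosts many times, so even a simple cache like this can cut the number of DNS requests considerably. The trade-off is that an address change is not noticed until the cached entry expires.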

We haven’t forgotten G-Mapper and WordPress.

In better news, we will be adding a few new features and at some point hope to migrate this code over to G-Mapper so that our Windows users can benefit from these improvements. In the meantime, we’re aware that our WordPress users have been waiting on some bug fixes for some time, so this will be our next priority.

Infrastructure upgrade

One thing that has become clear is that we need to move to beefier hosting and database plans, which will almost double our costs. We hope that by gaining more contributors we can do this before the end of the year.

Please support us by becoming a contributor or supporting us in other ways.

Join our community

To help with managing support we’ve created an online community. Please get involved: it will be a real help in maintaining the project, and it is a good way to get the latest news and updates and to talk with other users about our services, SEO and other related matters.