I use .htaccess to disable directory browsing on my sites (Options -Indexes), i.e. when you request the address http://www.mysite.com/directory1 and there is no index.html file, you get a "Forbidden" message from the Apache server.
I am using A1 Sitemap Generator 3.0.4. In older versions I had no problems with these directories, but now I get an R.Code 403 and an R.Desc Forbidden for them when I analyze a website. Note: There are other files in the directory which A1 handles correctly.
This is fine, but I do not want these error URLs included in the sitemaps, because they cause errors when search engines crawl the site using the generated sitemap.xml.
How do I tell A1 Sitemap Generator to leave these error URLs out of the generated sitemap?
While A1 Sitemap Generator will show you URLs that gave 403 (Forbidden) responses, it won't include them in the generated sitemap files. See: http://www.microsystools.com/products/sitemap-generator/help/sitemap-website-scan-errors/
There was/is a special corner case when creating HTML sitemaps with errored-directory-URLs that contain non-errored-URLs. But I don't think that is what you are referring to?
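To illustrate the behavior described above, here is a minimal Python sketch of filtering crawl results by response code before writing a sitemap. It is not A1 Sitemap Generator's actual code; the function name `build_sitemap` and the `(url, status_code)` input format are assumptions made for this example.

```python
from xml.sax.saxutils import escape


def build_sitemap(crawl_results):
    """Return sitemap XML containing only URLs that answered 200 OK.

    crawl_results: iterable of (url, status_code) pairs (hypothetical format).
    """
    # URLs with 403 (or any other non-200 code) are dropped here.
    included = [url for url, status in crawl_results if status == 200]

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in included:
        # escape() protects characters such as & in query strings.
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


results = [
    ("http://www.mysite.com/directory1/", 403),           # listing disabled
    ("http://www.mysite.com/directory1/page.html", 200),  # normal page
]
sitemap_xml = build_sitemap(results)
```

With this input, the forbidden directory URL is left out while the page inside it is still listed.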
Thanks for your prompt reply:
"it won't actually include them in the actual generated sitemap files"
This is my problem: the generated sitemap does include the error URLs.
And have you recrawled your website and re-created the sitemap file after you made the URLs respond with 403?
If so, please email me (http://www.microsystools.com/home/contact.php) your project file + URL example of 403 that you believe A1SG includes in generated sitemap files, and I will be happy to take a look today/tomorrow! :)
Thank you. This issue has been SOLVED!
I have recrawled the website (after I checked 'Always scan directories that contain linked URLs') and re-created the sitemap file. (I made the directory URLs without an index.html file respond with 403 a long time ago.)
Good to hear the problem has been resolved! :)
By the way, why don't you 301-redirect example.com/dir/index.html to example.com/dir/ ?
I think that is more common than the other way around!
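As a rough sketch of the mapping such a redirect would apply, here is the rule written as a small Python function. The name `redirect_target` and the default `index_names` are assumptions for illustration; on a real site the rule would live in the server configuration, not in Python.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url, index_names=("index.html",)):
    """Return the directory URL an index-file URL should 301 to, else None."""
    parts = urlsplit(url)
    # Split the path into its directory part and the final file name.
    directory, _, filename = parts.path.rpartition("/")
    if filename in index_names:
        # Keep scheme, host and query; replace the path with the directory.
        return urlunsplit(parts._replace(path=directory + "/"))
    return None  # already a directory URL, or some other file


target = redirect_target("http://example.com/dir/index.html")
```

Here `target` is the directory URL `http://example.com/dir/`, while a URL that already ends in `/` returns `None` and needs no redirect.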
Happy New Year!
Happy New Year to you too!
I appreciate your advice!