My site has many page references that I wish to exclude from the crawl, such as http://mysite.com/mypage.asp?page=2.
So I would like to exclude any link that contains ?page= . I've tried it a number of ways but can't manage to get it to work.
I can't imagine that this would involve anything to do with regex. Just a simple string match.
Thanks in advance.
Randy Browning
Hi,
You probably already have, but check the output filters help :: Do Not List URLs That Match Paths / Strings / Regex (http://www.microsystools.com/products/sitemap-generator/help/website-crawler-output-filters/)
(Also make sure you disable easy sitemap generator mode (http://www.microsystools.com/products/sitemap-generator/help/easy-sitemap-generator-mode/) so you can see all the options.)
Have you tried adding:
:mypage.asp?page=
(with the colon in front for a "path match")?
Actually, just
?page=
should work as well, since that makes a "string/text match".
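To illustrate the difference between the two filter styles, here is a rough Python sketch. It is not the program's actual code. It assumes that a filter starting with a colon is compared against the start of the relative path (path plus query string), and that a filter without a colon matches anywhere in that relative path. The names relative_path and is_excluded are made up for this example.

```python
from typing import Iterable
from urllib.parse import urlsplit


def relative_path(url: str) -> str:
    """Return path plus query string, without the leading slash."""
    parts = urlsplit(url)
    rel = parts.path.lstrip("/")
    if parts.query:
        rel += "?" + parts.query
    return rel


def is_excluded(url: str, filters: Iterable[str]) -> bool:
    """Hypothetical model of the exclude filters discussed in this thread.

    Assumption: ":text" is a path match (relative path starts with text),
    plain "text" is a string match (relative path contains text).
    """
    rel = relative_path(url)
    for f in filters:
        if f.startswith(":"):
            # Path match: compare against the start of the relative path.
            if rel.startswith(f[1:]):
                return True
        elif f in rel:
            # String/text match: the filter may appear anywhere.
            return True
    return False


url = "http://mysite.com/mypage.asp?page=2"
print(is_excluded(url, ["?page="]))             # string/text match
print(is_excluded(url, [":mypage.asp?page="]))  # path match
```

Under those assumptions, the plain ?page= filter is the broader one, since it would also catch paginated links on pages other than mypage.asp.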
If it does not work, please email me (http://www.microsystools.com/home/contact.php) with your website address, and I will be happy to create a project file for you!
Easy mode is off. I see where it should be entered. The syntax just never seemed to be right or something. Let me give your suggestion a try.
Thanks very much.
Randy Browning
In the article I linked earlier, there's also a demo project you can try that demonstrates output exclude filters using regular expressions, string/text match and path match! :)
Saw the file but haven't tried it. I'll load it and take a look.
Thanks again.
Randy Browning