Hey,
I've just started using A1 sitemap generator. But I'm having trouble filtering out certain results.
How do I filter out pages that all share a common word?
For example:
www.mydomain.com/path/forum/index%7bnyhg
www.mydomain.com/path/forum/index#4gfjir
www.mydomain.com/path/forum/index)gjiueriii
What regex would I use to filter out the strings containing the word "index"? I've tried a lot of combinations, but none of them work for me. I can filter the URLs out if the word exists on its own in the path name. But when it's at the end of a string and has other random characters after it on the same line, I can't seem to mass filter them.
Any help would be appreciated.
Great software.
Are you sure it is not a case sensitivity issue?
If you want to filter on a word, all you should need to do is write it as a plain string, e.g. "example"
(no regular expressions, such as "::example", are necessary).
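To illustrate why a plain string is enough here, below is a minimal Python sketch of substring-style exclusion applied to the URLs from the question. It is only an illustration of the idea, not how A1 Sitemap Generator is implemented; the function name `is_excluded`, its `ignore_case` parameter, and the last URL in the list are made up for the example.

```python
# Hypothetical illustration of plain-string (substring) filtering.
# This is NOT the tool's own code; names and the last URL are assumptions.

urls = [
    "www.mydomain.com/path/forum/index%7bnyhg",
    "www.mydomain.com/path/forum/index#4gfjir",
    "www.mydomain.com/path/forum/index)gjiueriii",
    "www.mydomain.com/path/forum/topic-42",  # made-up URL that should be kept
]


def is_excluded(url, word="index", ignore_case=True):
    """Return True if `word` occurs anywhere in `url`."""
    if ignore_case:
        # Lowercasing both sides rules out a case sensitivity mismatch.
        return word.lower() in url.lower()
    return word in url


# Keep only the URLs that do not contain the word.
kept = [u for u in urls if not is_excluded(u)]
print(kept)
```

Because the match is on any occurrence of the word, the random characters that follow "index" do not matter.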
Are you sure it's not because you are only excluding URLs in analysis filters (http://www.microsystools.com/products/sitemap-generator/help/website-crawler-scanner-filters/) and not also in output filters (http://www.microsystools.com/products/sitemap-generator/help/website-crawler-output-filters/)?
(By the way, # anchor fragments and #! Ajax URLs have special handling. Chances are you will not need to do any exclusions on such URLs when using default options.)
If you still have problems, email the project file to support, along with full example URLs you want excluded:
http://www.microsystools.com/home/contact.php
Yea that did the trick. I was only adding them to the analyzing filter. Out of curiosity, what is the difference between the two filters? If I want every page and link from a domain crawled but I don't want certain results kept, do I just use the output filter? When does the output filter kick in? After the scan is completed or during the scan?
Quote from: o239666 on January 16, 2013, 06:34:51 AM
Yea that did the trick. I was only adding them to the analyzing filter.
:)
Quote from: o239666 on January 16, 2013, 06:34:51 AM
Out of curiosity, what is the difference between the two filters? If I want every page and link from a domain crawled but I don't want certain results kept, do I just use the output filter?
Yes
Quote from: o239666 on January 16, 2013, 06:34:51 AM
When does the output filter kick in? After the scan is completed or during the scan?
Yes
...
(One small note: If there is a URL you don't want at all, you will usually gain a crawl speed improvement by excluding it in both types of filters.)
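For anyone trying to picture the two filter types, here is a rough toy model in Python. It is a sketch under stated assumptions, not the actual behavior of A1 Sitemap Generator: the in-memory `SITE` map, the `crawl` function, and its `analysis_exclude` / `output_exclude` parameters are all invented for illustration. The assumption is that an analysis filter stops a URL from being fetched and searched for links, while an output filter only decides whether a discovered URL is kept in the results.

```python
from collections import deque

# Hypothetical in-memory "site": each URL maps to the links found on that page.
SITE = {
    "/": ["/forum/index", "/about"],
    "/forum/index": ["/forum/index?p=2", "/forum/topic-1"],
    "/forum/index?p=2": ["/forum/topic-2"],
    "/forum/topic-1": [],
    "/forum/topic-2": [],
    "/about": [],
}


def matches(url, words):
    """True if any of the filter words occurs in the URL."""
    return any(word in url for word in words)


def crawl(start, analysis_exclude=(), output_exclude=()):
    """Toy crawl. Returns (fetched pages, URLs kept in the output)."""
    discovered = [start]
    queue = deque([start])
    fetched = []
    while queue:
        url = queue.popleft()
        if matches(url, analysis_exclude):
            continue  # assumed: not fetched, so its links are never followed
        fetched.append(url)
        for link in SITE.get(url, []):
            if link not in discovered:
                discovered.append(link)
                queue.append(link)
    # Assumed: the output filter only decides what is kept in the results.
    output = [u for u in discovered if not matches(u, output_exclude)]
    return fetched, output


# Output filter only: everything is crawled, "index" URLs are dropped from results.
print(crawl("/", output_exclude=["index"]))
# Analysis filter only: fewer fetches, but "/forum/index" still shows up in results.
print(crawl("/", analysis_exclude=["index"]))
# Both filters: fewer fetches and no "index" URLs in results.
print(crawl("/", analysis_exclude=["index"], output_exclude=["index"]))
```

In this toy model, using both filters fetches fewer pages, which is in line with the crawl speed note above, but pages that are only reachable through an excluded URL are never found. That is why the output filter alone is the choice when every page and link should still be crawled.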