Webmaster Forums - Website and SEO Help

Microsys Products and Webmaster Tools => A1 Sitemap Generator => Topic started by: eran_more on July 07, 2014, 01:08:40 AM

Title: thank-you pages unnecessary exposed?
Post by: eran_more on July 07, 2014, 01:08:40 AM
Hi,
I have thank-you pages which should not be 'exposed' to crawlers.
But the Microsys crawler still 'finds' them.
Can you please explain why?
Here is one example: " http://www.example.com/thank-you-payment-one-day "
By the way, it's a WordPress website. Are there 'hidden directories' or back doors of some kind?
Thanks,
Eran.
Title: Re: thank-you pages unnecessary exposed?
Post by: Webhelpforums on July 08, 2014, 04:44:36 AM
Before you crawl your website, switch off "easy mode":
http://www.microsystools.com/products/sitemap-generator/help/easy-sitemap-generator-mode/

Then in "Scan website | Crawler options" uncheck
"Apply webmaster and output filters after website scan stops"

After the scan, you can then see that:

is used by:

To investigate further see help page:
http://www.microsystools.com/products/sitemap-generator/help/sitemaps-generator-analyze-links/
and be sure to check "linked-by", "used-by" and "redirected-by" for each URL.

For reference, note that e.g. http://www.example.com/thank-you-payment-mini has code:
<meta name="robots" content="noindex,follow,noarchive,noodp,noydir" />

That is why A1SG removes the URL after the scan when using default settings, but still follows its links/references.
Among other things, those URLs use prev/next link tags. A1SG considers a link tag a kind of "use" rather than a "link", which is why you will find such references in the "uses" and "used-by" tabs when analyzing internal linking.
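If you want to check a page by hand, here is a rough Python sketch (standard library only) that pulls the robots meta directives and any prev/next link tags out of a page's HTML. This is not how A1SG works internally, just an illustration of the two signals discussed above. The class and function names are made up for this example, and the sample HTML is hypothetical: the meta tag is the one quoted above, while the rel="next" link is only an assumed example of how one thank-you page could be referenced from another.

```python
from html.parser import HTMLParser


class RobotsAndRelParser(HTMLParser):
    """Collects robots meta directives and <link rel="prev|next"> targets."""

    def __init__(self):
        super().__init__()
        self.robots = set()   # e.g. {"noindex", "follow"}
        self.rel_links = {}   # e.g. {"next": "http://..."}

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None for valueless attributes.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            for part in attrs.get("content", "").split(","):
                if part.strip():
                    self.robots.add(part.strip().lower())
        elif tag == "link":
            for rel in attrs.get("rel", "").lower().split():
                if rel in ("prev", "next"):
                    self.rel_links[rel] = attrs.get("href", "")


def inspect_page(html):
    """Return (robots directives, prev/next link targets) found in html."""
    parser = RobotsAndRelParser()
    parser.feed(html)
    return parser.robots, parser.rel_links


# Hypothetical sample page, for illustration only.
sample = """<html><head>
<meta name="robots" content="noindex,follow,noarchive,noodp,noydir" />
<link rel="next" href="http://www.example.com/thank-you-payment-one-day" />
</head><body></body></html>"""

robots, rel_links = inspect_page(sample)
print(sorted(robots))
print(rel_links)
```

A page with "noindex" plus "follow" can be left out of the sitemap while its references, including prev/next link tags, are still followed, which matches the behavior described above.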

...

Alternatively, you can enable logging in "Scan website | Data collection"
and then reduce worker threads to one in "Scan website | Crawler engine"

That will slow the scan down a lot, but you can search the results afterwards in a text file.

...

And to answer your question, no, A1 Sitemap Generator does not utilize any "hidden" techniques to uncover URLs. It simply follows links and references to URLs :)
Title: Re: thank-you pages unnecessary exposed?
Post by: eran_more on July 08, 2014, 11:07:04 AM
Thank you for your quick reply!
It is probably the 'rel prev' and 'rel next' tags that linked those pages.