thank-you pages unnecessarily exposed?


eran_more

  • Newbie
« on: July 07, 2014, 01:08:40 AM »
Hi,
I have thank-you pages which should not be 'exposed' to crawlers.
But still, the Microsys crawler 'finds' them.
Can you please explain why?
Here is one example: " http://www.example.com/thank-you-payment-one-day "
By the way, it's a WordPress website. Are there any 'hidden directories' or back doors that could expose them?
Thanks,
Eran.
« Last Edit: July 08, 2014, 11:38:14 AM by Webhelpforums »

*

Webhelpforums

  • Administrator
  • Hero Member
  • Shared between Microsys, WebHelpForums and helpers
Re: thank-you pages unnecessarily exposed?
« Reply #1 on: July 08, 2014, 04:44:36 AM »
Before you crawl your website, switch off "easy mode":
http://www.microsystools.com/products/sitemap-generator/help/easy-sitemap-generator-mode/

Then in "Scan website | Crawler options" uncheck
"Apply webmaster and output filters after website scan stops"

After the scan, you can then see that:
  • http://www.example.com/thank-you-payment-one-day

is used by:
  • http://www.example.com/thank-you-payment-mini
  • http://www.example.com/thank-you-payment-one-time

To investigate further see help page:
http://www.microsystools.com/products/sitemap-generator/help/sitemaps-generator-analyze-links/
and be sure to check "linked-by", "used-by" and "redirected-by" of each.
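
As a rough illustration of what "used-by" means, it can be thought of as a reverse index over the references recorded during the scan. Below is a minimal sketch in Python. The `references` dictionary and the `build_used_by` helper are hypothetical and only mirror the example URLs above; they are not A1 Sitemap Generator's actual data model.

```python
from collections import defaultdict


def build_used_by(references):
    """Invert a page -> referenced-URLs mapping into URL -> referencing pages."""
    used_by = defaultdict(set)
    for page, targets in references.items():
        for target in targets:
            used_by[target].add(page)
    # Sort the referencing pages so the output is stable and easy to read.
    return {url: sorted(pages) for url, pages in used_by.items()}


# Hypothetical scan data mirroring the example in this post.
references = {
    "http://www.example.com/thank-you-payment-mini": [
        "http://www.example.com/thank-you-payment-one-day",
    ],
    "http://www.example.com/thank-you-payment-one-time": [
        "http://www.example.com/thank-you-payment-one-day",
    ],
}

used_by = build_used_by(references)
for page in used_by["http://www.example.com/thank-you-payment-one-day"]:
    print(page)
```

If a page you expect to be hidden appears as a key in such an index, the listed pages are where the crawler picked it up.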

For reference, note that e.g. http://www.example.com/thank-you-payment-mini has code:
Code: [Select]
<meta name="robots" content="noindex,follow,noarchive,noodp,noydir" />
This is why A1SG (A1 Sitemap Generator) removes the URL after the scan when using default settings, but still follows its links and references.
Among other things, those URLs use prev/next, i.e. a link tag, which A1SG considers a kind of "use" rather than a "link". That is why you will find such references in the "uses" and "used-by" tabs when analyzing internal linking.
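
To check a page yourself, the robots meta tag and any prev/next link tags can be read with Python's standard library. This is only a sketch: the `inspect_page` helper and the sample HTML are assumptions for illustration (the real pages may differ), and it does not reproduce how A1SG parses pages internally.

```python
from html.parser import HTMLParser


class RobotsAndLinkParser(HTMLParser):
    """Collect robots directives and <link rel="prev|next"> references."""

    def __init__(self):
        super().__init__()
        self.robots = set()  # directives such as "noindex", "follow"
        self.uses = []       # (rel, href) pairs from <link> tags

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.robots.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "link" and attributes.get("href"):
            for rel in (attributes.get("rel") or "").lower().split():
                if rel in ("prev", "next"):
                    self.uses.append((rel, attributes["href"]))


def inspect_page(html):
    """Summarize whether a page may be indexed and what it references."""
    parser = RobotsAndLinkParser()
    parser.feed(html)
    return {
        "indexable": "noindex" not in parser.robots,
        "follow": "nofollow" not in parser.robots,
        "uses": parser.uses,
    }


# Hypothetical page source; the rel value and target URL are assumed.
SAMPLE = """<html><head>
<meta name="robots" content="noindex,follow,noarchive,noodp,noydir" />
<link rel="next" href="http://www.example.com/thank-you-payment-one-day" />
</head><body></body></html>"""

print(inspect_page(SAMPLE))
```

A page that reports `indexable` as false but `follow` as true is left out of the results, while the URLs it references are still crawled.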

...

Alternatively, you can enable logging in "Scan website | Data collection"
and then reduce worker threads to one in "Scan website | Crawler engine"

That will slow the scan a lot though, but you can search the results afterwards in a text file.
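
If the log gets large, a few lines of Python can do the searching. This is a minimal sketch: the `find_mentions` helper, the file name `scan-log.txt` and the sample lines are made up for illustration, and the real log format may look different.

```python
def find_mentions(lines, url):
    """Return (line_number, text) pairs for every line that contains url."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if url in line
    ]


# Invented sample lines; a real log will not look exactly like this.
sample_log = [
    "fetch http://www.example.com/\n",
    "found http://www.example.com/thank-you-payment-one-day\n",
    "fetch http://www.example.com/contact\n",
]

for number, text in find_mentions(sample_log, "thank-you-payment-one-day"):
    print(number, text)

# With a real log file (the file name is an assumption):
# with open("scan-log.txt", encoding="utf-8", errors="replace") as log:
#     for number, text in find_mentions(log, "thank-you-payment-one-day"):
#         print(number, text)
```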

...

And to answer your question, no, A1 Sitemap Generator does not utilize any "hidden" techniques to uncover URLs. It simply follows links and references to URLs :)
« Last Edit: July 08, 2014, 11:37:58 AM by Webhelpforums »

*

eran_more

  • Newbie
Re: thank-you pages unnecessarily exposed?
« Reply #2 on: July 08, 2014, 11:07:04 AM »
Thank you for your quick reply!
It is probably the rel="prev" and rel="next" link tags that referenced those pages.

 



