Hi folks,
I'm trying to set up a sitemap to encourage Google to index the content of a very large (1 million+ records) ASP.NET site, particularly with an eye toward having Google notice that we have information on the various items listed.
A typical URL would be something like:
MySite/app/ItemDetail?ItemID=12345
(assume that the ItemID values will run to 1 million+ records)
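(If I'm reading the sitemaps.org protocol right, each sitemap file caps out at 50,000 URLs, so a list that size would mean 20+ sitemap files plus an index file, with each sitemap file holding entries like the following -- the URL is just the example pattern above:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://MySite/app/ItemDetail?ItemID=12345</loc>
  </url>
  <!-- ...up to 50,000 <url> entries per file... -->
</urlset>
)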
Is my best strategy to let the crawler index the site for a day or two, then submit the entire list of matched URLs? Or is there a better way to go?
Also: What's the best way to automatically have the crawler exclude any URLs that would terminate in a login prompt? Much of the site is intentionally inaccessible unless you're logged in, and we don't care if those parts are crawled or returned in search results.
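(I assume a plain robots.txt rule would cover the simple case if the protected pages all shared a common path prefix -- the /app/secure/ path below is purely hypothetical:

User-agent: *
Disallow: /app/secure/

...but I'm not sure that's the right approach here, hence the question.)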
Thanks!
-Pete
You may want to consider having A1SG crawl a localhost version of your site:
https://www.microsystools.com/products/sitemap-generator/help/xml-sitemap-generator-localhost/
You can also check this page about crawling large websites:
https://www.microsystools.com/products/sitemap-generator/help/creating-sitemaps-large-websites/
If you are unhappy about certain URLs being included in the *final* XML sitemaps (after a scan, A1SG may show URLs it already knows not to include in the sitemaps), you can consider using "output filters", which are very powerful to configure:
https://www.microsystools.com/products/sitemap-generator/help/website-crawler-output-filters/
Regarding submitting: just submit the generated sitemap index file (and upload all the generated sitemap files):
https://www.microsystools.com/products/sitemap-generator/help/xml-sitemaps-page-limit/
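For reference, the sitemap index file is just a list of your individual sitemap files per the sitemaps.org protocol (the file names below are only placeholders; use whatever names your generated files actually have):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://MySite/sitemap-1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://MySite/sitemap-2.xml</loc>
  </sitemap>
</sitemapindex>

Submit the index file's URL in Google Search Console and make sure all the sitemap files it references are uploaded alongside it.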