HELP!!! Spent 3 DAYS analyzing website for sitemap it is REBOOTED!!!

*

camaro92

  • Newbie
1. I tried to make a sitemap using the Sitemap generator. It's advertised as taking as little as 20 seconds to make a sitemap, which is NOT true. What you are not told, and what I have now discovered, is that you first have to analyze the website, which takes a bit over 20 seconds.. More like 3 DAYS!!

2. I ran the program to analyze the website. It ran for 3 DAYS, then it popped up an error, and a second later, before I could read it, the computer just REBOOTED. Now I return to the program and see NO way to resume?!?! Are you telling me that I just spent 3 DAYS (computer left on for 3 days straight) just to have your program crash, and now I have LOST IT ALL!?!?

*

Webhelpforums

  • Administrator
  • Hero Member
  • Shared between Microsys, WebHelpForums and helpers
« Reply #1 on: February 15, 2011, 03:53:13 PM »
I don't think I am "advertising" it as 20 seconds anywhere, although that is certainly a good estimate/example for small/normal websites. And it's certainly no secret that A1 Sitemap Generator has to crawl your website... How else would it find your pages? (It does not have direct access to a database containing all URLs.)

A1 Sitemap Generator has to crawl your website. So if your website is either buggy and creates endless new URLs, or you simply have e.g. a million-page website, then it should be no surprise that it will take some time...

E.g. scanning http://www.computergameplayer.com with 31 (default is 5) simultaneous connections took 24 seconds to *analyze content* of 422 URLs. That is about 17.6 pages / second. Without much effort (configuring settings, no longer listening to online radio etc.) I could probably tweak that speed up. (Or maybe set it to 100 simultaneous connections in the registered version. Not something I recommend though.) It also doesn't help that the website is hosted on the other side of the Atlantic. So overall that's a good example :)
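
For a rough sense of scale, the numbers from that test scan can be turned into a back-of-the-envelope estimate. This is only a sketch: the function names are made up for illustration, the 100000-URL site is a hypothetical size, and it assumes the crawl rate stays constant, which a real server under load will not guarantee.

```python
def crawl_rate(urls: int, seconds: float) -> float:
    """Pages analyzed per second."""
    return urls / seconds


def estimated_hours(total_urls: int, rate: float) -> float:
    """Hours needed to crawl total_urls, assuming the rate never changes."""
    return total_urls / rate / 3600


# Figures from the test scan above: 422 URLs in 24 seconds.
rate = crawl_rate(422, 24)
print(f"{rate:.1f} pages/second")

# Hypothetical site size, purely for illustration.
print(f"{estimated_hours(100_000, rate):.1f} hours for 100000 URLs")
```

At that pace even a fairly large site should finish in hours rather than days, which is why a 3-day crawl points at something else going on.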

If you actually had A1SG crawling for 3 days straight, you either have:
1) a buggy website
2) a huge website

Possibly your computer crashed because it ran out of memory, although that is pretty drastic :(

If you wish to use the resume functionality, you should read the documentation:
http://www.microsystools.com/products/sitemap-generator/help/sitemap-generator-resume-scan/

If you wish to increase the number of URLs A1SG can keep in memory while scanning:
http://www.microsystools.com/products/sitemap-generator/help/creating-sitemaps-large-websites/

But if your website is really THAT big... Then you probably need to get a custom solution for your website. (Something that reads your database directly, which is infinitely easier.) Maybe find a plugin?

If you believe your website is mid-size, say e.g. 10000-100000 URLs (or more for that matter), then normally there would not be any problem. A common reason for troubles like yours is a website that generates new URLs dynamically (possibly even returning a "200, All OK" response for those URLs). That makes the crawl continue forever. If you are interested in pursuing this, I will be glad to help you :)
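
To illustrate why dynamically generated URLs are a problem, here is a toy simulation. It is not how A1SG works internally; `fake_links`, the `?page=` parameter and the example.com address are all invented for the sketch. Each page on the fake site links to a "next" page that always exists, so only the URL cap makes the crawl stop.

```python
from collections import deque


def fake_links(url: str) -> list[str]:
    """Stand-in for a buggy site: every page links to a 'next' page that always exists."""
    base, _, n = url.partition("?page=")
    return [f"{base}?page={int(n or 0) + 1}"]


def crawl(start: str, max_urls: int) -> set[str]:
    """Breadth-first crawl that gives up after max_urls distinct URLs."""
    seen = {start}
    queue = deque([start])
    while queue and len(seen) < max_urls:
        for link in fake_links(queue.popleft()):
            if link not in seen and len(seen) < max_urls:
                seen.add(link)
                queue.append(link)
    return seen


# Without the cap this loop would never end, because the queue never empties.
found = crawl("http://example.com/calendar.php", max_urls=1000)
print(len(found))
```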


Otherwise I wish you well in finding another solution more suitable for you! :)
« Last Edit: March 17, 2013, 01:05:25 PM by Webhelpforums »
TechSEO360 | MicrosysTools.com  | A1 Sitemap Generator, A1 Website Analyzer etc.

*

camaro92

« Reply #2 on: February 16, 2011, 11:07:27 AM »
The website itself is fairly small but has a phpBB forum with over 10,000 users and over 130,000 posts.. I have now set it to 31 connections, but it has been going for 5 hours and it's still going..

*

Webhelpforums

« Reply #3 on: February 16, 2011, 01:13:47 PM »
Forum websites are sometimes slow to crawl due to database load (since each page request executes the same SQL queries against the database backend).

However, I recommend you first check how to *best* use resume:
http://www.microsystools.com/products/sitemap-generator/help/sitemap-generator-resume-scan/
Then stop, save your project and then use resume later.

Consider dropping extended data:
http://www.microsystools.com/products/sitemap-generator/help/creating-sitemaps-large-websites/
(saves memory although with "just" e.g. 150000 URLs it *should* most often *not* be necessary)

...

Also, you may be able to cut down on URLs. E.g. if you don't need member pages, you could add URL filters for both *output* and *analysis*. Doing that will speed up your crawl and save memory. (The same goes if you can avoid e.g. duplicate URLs that you don't really need to have *analyzed* or *output* to the sitemap.)

http://www.microsystools.com/products/sitemap-generator/help/website-crawler-output-filters/
http://www.microsystools.com/products/sitemap-generator/help/website-crawler-scanner-filters/

(remember to add the URLs to both)

...

What I do when I need to handle such big websites, which may also create multiple unwanted URLs leading to the same content etc.: I first run a few test scans (e.g. 1000 URLs) and see if there is something I don't want. That avoids situations where a scan takes forever for some unknown reason.
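
One way to review the result of such a test scan is to group the URLs by "shape", i.e. script path plus query parameter names, and see which shapes dominate. The sketch below is a generic illustration and not an A1SG feature; `url_shape`, `shape_counts` and the sample URLs are all made up for the example.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def url_shape(url: str) -> str:
    """Reduce a URL to its path plus sorted query parameter names (values dropped)."""
    parts = urlsplit(url)
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return parts.path + ("?" + "&".join(names) if names else "")


def shape_counts(urls: list[str]) -> Counter:
    """Count how many URLs share each shape."""
    return Counter(url_shape(u) for u in urls)


# Hypothetical URLs standing in for the output of a small test scan.
sample = [
    "http://example.com/viewtopic.php?t=1",
    "http://example.com/viewtopic.php?t=1&view=next",
    "http://example.com/viewtopic.php?t=2&view=previous",
    "http://example.com/viewtopic.php?p=17",
    "http://example.com/profile.php?mode=viewprofile&u=5",
]
for shape, count in shape_counts(sample).most_common():
    print(count, shape)
```

Shapes with a suspiciously high count are good candidates for the exclude filters.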

...

I am considering adding some more presets for common websites, e.g. WordPress, phpBB etc. Maybe I should prioritize getting those done :)

*

camaro92

« Reply #4 on: February 16, 2011, 09:07:28 PM »
One of your competitors has a feature where you select that you are running phpBB, and it automatically inserts the following to bypass the endless URLs that phpBB and other forums like to use.


Quote
Exclude URLs:
p=
mode=
mark=
order=
highlight=
profile.php
privmsg.php
posting.php
view=previous
view=next
search.php


Do not parse URLs: view=print

Can your program do this? Thanks


(Note: I edited your quoted text down to avoid the forum infringing on any possible content/copyright. Just to be safe!)
« Last Edit: February 17, 2011, 02:21:10 AM by Webhelpforums »

*

Webhelpforums

« Reply #5 on: February 16, 2011, 09:52:48 PM »
Sure. You won't find anything more flexible for filtering than A1 Sitemap Generator. Well, my opinion anyways, but please do check out the output+analysis filters documentation links in my earlier post :)

The only thing is that I don't have a preset for phpBB :)
But really, the only thing you need to do in A1SG compared to the default settings is to add the paths listed to the output + analysis filters. That's it :) There's just no preset for it at present.
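
To show the effect of such an exclude list, here is a small sketch that applies the quoted patterns as plain substring matches. That matching rule is an assumption made for the illustration; check the filter documentation linked earlier for how A1SG actually interprets filter entries. `EXCLUDE` and `is_excluded` are names invented for the example.

```python
# Exclude patterns taken from the list quoted in the post above.
EXCLUDE = [
    "p=", "mode=", "mark=", "order=", "highlight=",
    "profile.php", "privmsg.php", "posting.php",
    "view=previous", "view=next", "search.php",
]


def is_excluded(url: str, patterns: list[str] = EXCLUDE) -> bool:
    """True if any pattern occurs anywhere in the URL (plain substring match, an assumption)."""
    return any(pattern in url for pattern in patterns)


# Hypothetical phpBB-style URLs.
urls = [
    "http://example.com/viewtopic.php?t=5",
    "http://example.com/viewtopic.php?p=17",
    "http://example.com/viewtopic.php?t=5&view=next",
    "http://example.com/search.php",
]
kept = [u for u in urls if not is_excluded(u)]
print(kept)
```

Note that plain substring matching is crude: a short pattern like `p=` would also catch an unrelated parameter such as `top=`, so it is worth checking a test scan before trusting the list.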


EDIT:
I will have something in 3.1.2 beta (!) later today :)
« Last Edit: February 17, 2011, 01:44:06 AM by Webhelpforums »

*

Webhelpforums

« Reply #6 on: February 17, 2011, 04:21:01 AM »
You can find the new 3.1.2 beta here:
http://www.microsystools.com/products/sitemap-generator/betas.php

There's now a "phpBB" preset in:
Scan website | Quick presets... button

It seems to work in a (very quick) test I made. If you have problems, feel free to write/PM/email your website address :)

*

camaro92

« Reply #7 on: February 18, 2011, 06:18:03 AM »
I tried the beta version, but I don't see anything new when I select the phpBB quick preset. None of the toggles are changed; it just seems like an unlinked form button that doesn't do anything.

How do we know what was even selected? I don't see anything indicating what was selected or what changes it made, if any.

*

Webhelpforums

« Reply #8 on: February 18, 2011, 07:37:40 AM »
If you click the phpBB preset button,
a lot of items get added to the exclude section in both the analysis + output filters.

That way, you don't need to add them yourself.


All I did was click the phpBB item in the Presets... menu.
As you can see, the output filters got lots of new excluded items added.
I also tested that it actually works on a phpBB forum, but if you have any particular one in mind, feel free to let me know the URL.
« Last Edit: March 17, 2013, 01:06:43 PM by Webhelpforums »

 



