To build an HTML sitemap for an intranet site

Started by laurents, January 24, 2014, 07:24:57 AM

laurents

Hello, I would like to do what the title says. First, is it possible to use a URL that is not externally visible? Second, I tried to follow the tutorial "Create HTML Sitemaps - HTML Sitemap Tutorial" (on a public internet website, since I did not have the answer to the first question, and with the free version of the software), but it failed :( After scanning the site, it was very far from having found all the pages (it found only the homepage). Is there another method?

Webhelpforums

Hi,

if I understand your question correctly, you tried to use the trial of A1 Sitemap Generator to scan a publicly accessible domain/website, but you feel it did not find all page URLs?

Is that correctly understood? If so, try checking this help page:
http://www.microsystools.com/products/sitemap-generator/help/sitemapper-crawl-sitemap-links/

If you still have problems, please give an example here of a missing page URL.

Or if you prefer, email support directly:
http://www.microsystools.com/home/contact.php
TechSEO360 | MicrosysTools.com  | A1 Sitemap Generator, A1 Website Analyzer etc.

laurents

Hello,

With the example I have chosen, and after reading the page http://www.microsystools.com/products/sitemap-generator/help/sitemapper-crawl-sitemap-links/ I think there might be two problems:
- the URL doesn't include "www" (http://vercorshandisport.org);
- in the "index.html" file, there is a redirection: <meta http-equiv="refresh" content="0; URL=http://vercorshandisport.org/essai2/topic/index.php; charset=utf-8">

Do you think the problem is there, and if so, can I do something about it?

Moreover, you answered the second part of my question but not the first one (intranet).

Webhelpforums

Hi,

Recent versions of A1 Sitemap Generator automatically alias "www" and "non-www" URLs. In the past, one had to add the aliases to the configuration, but now, with default settings, it happens automatically.
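
As a rough illustration of what treating "www" and "non-www" URLs as aliases means, here is a minimal Python sketch. It is not A1 Sitemap Generator's actual implementation, and the function name same_site is hypothetical:

```python
from urllib.parse import urlsplit


def same_site(url_a, url_b):
    """Return True if both URLs point at the same host, ignoring a leading "www."."""

    def bare_host(url):
        # hostname is None for URLs without a network location
        host = (urlsplit(url).hostname or "").lower()
        return host[4:] if host.startswith("www.") else host

    return bare_host(url_a) == bare_host(url_b)


print(same_site("http://vercorshandisport.org",
                "http://www.vercorshandisport.org/essai2/topic/index.php"))
```

A crawler that applies a check like this keeps following links to either form of the hostname instead of treating one of them as an external site.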

A1 Sitemap Generator also follows meta refreshes by default.

If you enter vercorshandisport.org into Firefox it redirects to
vercorshandisport.org/essai2/topic/index.php; charset=utf-8
That page returns "404 - not found"
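
To see why the browser ends up on a URL ending in "; charset=utf-8", here is a small Python sketch of how a meta refresh value can be split into delay and target. It is a simplified assumption of how a browser or crawler reads the value, not the exact algorithm used by Firefox or A1 Sitemap Generator, and parse_meta_refresh is a hypothetical helper name:

```python
import re


def parse_meta_refresh(content):
    """Split a meta refresh content value into (delay_seconds, url)."""
    # Everything before the first ";" is the delay, the rest is the target.
    delay_part, _, rest = content.partition(";")
    delay = float(delay_part.strip())
    match = re.match(r"\s*url\s*=\s*(.*)", rest, flags=re.IGNORECASE | re.DOTALL)
    url = match.group(1).strip().strip("'\"") if match else None
    return delay, url


content = "0; URL=http://vercorshandisport.org/essai2/topic/index.php; charset=utf-8"
print(parse_meta_refresh(content))
```

With the value from the index.html file, everything after "URL=" is taken as the target, so the "; charset=utf-8" part becomes part of the address. Moving the charset declaration into its own meta tag would leave a clean redirect URL.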

I tried scanning with A1 Sitemap Generator 5.0.4 using default settings, and it appears to me the scan was fully successful, with many URLs found, including: vercorshandisport.org/essai2/topic/index.php (!)

And yes, A1 Sitemap Generator, as a general rule, also works on websites hosted on an intranet. It works as long as it can connect to the intranet website using HTTP (or HTTPS, but that requires a little configuration), the same as when you browse the intranet website in e.g. Firefox.

laurents

I'm sorry; I was expecting a notification (I had forgotten to click) and had not been here until today, so I had not seen the answer... This 404 error is surprising: http://www.heberger-image.fr/images/27341_slide0001_image002.gif.html
If I scan the site (with A1 Sitemap Generator free version V5.0.4), the result is quite short: http://www.heberger-image.fr/images/74444_slide0002_image001.png.html

Webhelpforums

If you look at the A1 Sitemap Generator screenshot, you can see it is not a 404 error but a timeout error, which indicates the website may have been under heavy load, down, or similar when you tried to scan it. You can increase timeout values in "Scan website | Crawler engine", but I ran a scan without any problems using default settings.

laurents

According to your answer, it isn't necessary to increase this timeout. But you also mention having found many URLs, and that's not my case. I'm doing "scan website" and then I'm looking at "analyse website": isn't that the right way?

Webhelpforums


Quote: According to your answer, it isn't necessary to increase this timeout

In my case it was not necessary.


Quote: I'm doing "scan website" and then I'm looking at "analyse website"

Yes.

If you look at the screenshot you provided, you can clearly see that an A1 Sitemap Generator timeout happened. To understand both HTTP and A1 Sitemap Generator response codes, see: http://www.microsystools.com/products/sitemap-generator/help/server-http-response-codes/ (It is important to understand the difference between e.g. "404 - not found" and a timeout, since their reasons and solutions are entirely different.)
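
To make the difference concrete, here is a minimal Python sketch using only the standard library. It has nothing to do with how A1 Sitemap Generator works internally, and the names describe_failure and check are hypothetical. With a "404 - not found" the server answered; with a timeout no answer arrived at all:

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_TYPES = (TimeoutError, socket.timeout)


def describe_failure(exc):
    """Tell an HTTP error response apart from a timeout or connection failure."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"HTTP {exc.code}: the server answered with an error status"
    if isinstance(exc, TIMEOUT_TYPES):
        return "timeout: no answer arrived in time"
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, TIMEOUT_TYPES):
            return "timeout: no answer arrived in time"
        return f"connection failed: {exc.reason}"
    return f"other error: {exc}"


def check(url, timeout=10.0):
    """Fetch a URL once and describe what happened."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"HTTP {response.status}: OK"
    except (urllib.error.URLError, TimeoutError, socket.timeout) as exc:
        return describe_failure(exc)
```

Calling check() with a URL from the affected computer and again from another network shows whether the server is rejecting the request or the traffic never gets through.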

Do you still get a timeout when scanning the website mentioned in the screenshots you linked to? Do you have the problem with all websites you scan? If so: do you by any chance have some internet security software installed that inspects and blocks traffic?

laurents

As English isn't my mother tongue (French is), I used Google Translate to fully understand your answer... and you're right! Now I'm at home and the traffic is no longer blocked: 1202 links instead of only 4! To understand the response codes, I had a look at your link, and I have a question. I have plenty of "0 ; rcVirtualItem" results, and for this code it is written: "You can force check such URLs by checking option: Scan website | Crawler options | Always scan directories that contain linked URLs." I would like to do that, but in "Scan website" I only have "Quick presets" and no "Crawler options". Is it reserved for paid versions only?

Webhelpforums

A1 Sitemap Generator hides options by default, but you can show them by switching "easy mode" off:
http://www.microsystools.com/products/sitemap-generator/help/easy-sitemap-generator-mode/

If you have problems crawling from a specific computer, it may be a firewall / internet program that inspects, filters and blocks traffic. See: http://www.microsystools.com/products/sitemap-generator/help/site-map-creator-firewall/

laurents

I've tried to scan an intranet site, but only 3 links are found, and the first one returns "400 bad request". In http://www.microsystools.com/products/sitemap-generator/help/server-http-response-codes/ I read: "See rcTimeoutConnect: Timeout: Generic for possible cause and solution." So what can I do?

Webhelpforums

Some quick questions:

Can you browse the intranet website using Firefox + Internet Explorer?
If no: Site has problems

Does the website use https?
If yes: Need to configure A1SG for https

Is this from the same computer that has problems scanning other websites?
If yes: Most likely same problem - see this help page:
http://www.microsystools.com/products/sitemap-generator/help/site-map-creator-firewall/

Is the intranet site accessed through either http:// or https://?
If no: That is a problem. I will need more details then.




Other things to try.

a)
Try the two configurations mentioned here
http://www.microsystools.com/products/sitemap-generator/help/sitemap-generator-website-platform/
Section "General Solutions to Problematic Websites"

b)
Try scanning from another computer on your intranet
(This, however, will likely not make a difference. If this is a computer network, all computers and the entire intranet are probably configured the same way with regard to network traffic filtering and blocking.)

laurents

I'm not very happy because I had begun an answer and (I don't know why) it was lost and my computer was reset! So I am answering again...

...partially, because I am at my company's site and the firewall of the computer network is too strong and blocks the traffic. I can only run trials when I'm at home, connected remotely. But I can nevertheless give some answers:
- no https;
- indeed, it is the same computer as the one with which I had the blocking problems, and as that is abnormal, I have asked computer support about it with your link (http://www.microsystools.com/products/sitemap-generator/help/site-map-creator-firewall/) and I'm waiting for an answer;
- I will follow the advice of point a) when I am back at home;
- point b): impossible to use another computer than mine when I am at home...

I will give you news tonight (in France).

laurents

Hello; I ran a trial at home (for me, it's 23:45...). The results seem a little better, but still:
path "http://ocp.schneider-electric.com/Global/OCP/OCPHTML.nsf/pages/Offer+Creation+Processes+Home/$File/OCP-index.html" ; items "3" ; Response code "400" and response description "Bad Request"...
and only 2 links 200-OK
so, not very satisfying...

And to answer another question: the intranet website can be opened with IE or FF...

Webhelpforums

Assuming it is not password protected or similar, I think the only thing left to try is:

Try the two configurations mentioned here
http://www.microsystools.com/products/sitemap-generator/help/sitemap-generator-website-platform/
in section "General Solutions to Problematic Websites"

Some websites have modules installed that try to detect unknown crawlers and block them. The above configurations
can often solve that by making A1 Sitemap Generator identify itself as e.g. a web browser.
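
For readers who want to see what "identify itself as a web browser" means at the HTTP level, here is a minimal Python sketch. It only shows the general idea of sending a browser-like User-Agent header; it is not A1 Sitemap Generator's configuration, and the header value and the helper name build_request are illustrative assumptions:

```python
import urllib.request

# Illustrative browser-like value (an assumption, not taken from any tool's settings).
BROWSER_LIKE_AGENT = "Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0"


def build_request(url, user_agent=BROWSER_LIKE_AGENT):
    """Create a request that announces itself with the given User-Agent string."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("http://vercorshandisport.org/")
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

Passing such a request to urllib.request.urlopen() sends the header with the fetch. A server-side module that blocks unknown crawlers by their User-Agent would then treat the request like one from a browser.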

Sorry I cannot be of more help regarding your intranet website, but it is almost impossible when I cannot check and crawl it myself :(

laurents

I don't know which country you are in, but if our hours are compatible, it may be possible (if you also have time for it) for you to take control of my PC with TeamViewer...

Webhelpforums

I will give it some thought. Could you drop me an email here: http://www.microsystools.com/home/contact.php and I will try to think of a solution :)
