How to include linked pages, same site but other dir

*

spiweb

« on: January 28, 2012, 02:01:35 PM »
Hi there,

I am trying the A1 Website Download 3.4.8 trial on Windows Vista, and I would like to download (for offline browsing) some pages from a website where I contribute content.
Let's say the site is www.example.org and I am interested in www.example.org/users/spiweb/ , so I set this URL as the directory path in Scan website / Paths / Website ...

The scan works fine and correctly downloads a number of pages, let's say
www.example.org/users/spiweb/list_of_pages-A
www.example.org/users/spiweb/list_of_pages-B
www.example.org/users/spiweb/list_of_pages-C
and so on.

But then, I am also interested in the pages linked from those lists (listA, listB, listC, etc.).
Let's say listA links to
www.example.org/content/page3124
www.example.org/content/page6349
www.example.org/content/page7420
www.example.org/content/page9875

listB links to
www.example.org/content/page1130
www.example.org/content/page5434

and so on...

So I would like A1 Website Download to retrieve those HTML pages too, along with the JPG images they link to.

The problem is, they are on the same site BUT NOT in the /users/spiweb/ path.

Besides, I only want those specific pages, not the whole www.example.org/content/ directory, which is very large.


How do I do that?

Thank you!

« Last Edit: January 28, 2012, 02:07:13 PM by spiweb »

*

Webhelpforums

Re: How to include linked pages, same site but other dir
« Reply #1 on: January 29, 2012, 10:06:11 AM »
If you know what URL "areas" you want before starting the scan, you can do what you need by:

1) Disable "Easy mode"
http://www.microsystools.com/products/website-download/help/easy-website-download-mode/

2) Set root to www.example.org/

3) Set analysis filters (which URLs to analyze for links)
http://www.microsystools.com/products/website-download/help/website-crawler-scanner-filters/

4) Possibly set up more Start search paths:
http://www.microsystools.com/products/website-download/help/root-aliases-start-paths/

5) Set output filters (pages downloaded to disk)
http://www.microsystools.com/products/website-download/help/website-crawler-output-filters/

If your needs are more advanced, it helps to learn the basics of regular expressions, since the "limit-to" and "exclude" options of both the "analysis" and "output" filters support regex.
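As a rough illustration of what such "limit-to" patterns could look like for the example URLs in this thread, here is a minimal Python sketch. The two patterns and the `matches` helper are hypothetical and only test the regex logic; the exact syntax A1 Website Download expects in its filter fields may differ, so check the help pages linked above before copying them.

```python
import re

# Hypothetical "limit-to" pattern for the analysis filter:
# only pages in the personal area are scanned for links.
ANALYSIS_LIMIT_TO = re.compile(r"^https?://www\.example\.org/users/spiweb/")

# Hypothetical "limit-to" pattern for the output filter:
# save the personal area plus individual /content/pageNNNN pages,
# but not the /content/ index or anything else under it.
OUTPUT_LIMIT_TO = re.compile(
    r"^https?://www\.example\.org/(?:users/spiweb/|content/page\d+$)"
)

def matches(pattern, url):
    """Return True if the URL is accepted by the given limit-to pattern."""
    return pattern.search(url) is not None

if __name__ == "__main__":
    for url in (
        "http://www.example.org/users/spiweb/list_of_pages-A",
        "http://www.example.org/content/page3124",
        "http://www.example.org/content/",
    ):
        print(url, matches(ANALYSIS_LIMIT_TO, url), matches(OUTPUT_LIMIT_TO, url))
```

The JPG images are not covered here because the thread does not say where they are stored; they would need their own pattern.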
MicrosysTools.com | Website and SEO Software for webmasters | A1 Sitemap Generator, A1 Website Analyzer etc.

*

spiweb

Re: How to include linked pages, same site but other dir
« Reply #2 on: February 14, 2012, 03:40:57 AM »
Thanks a lot for your explanation (and sorry for my late reply), but I am a bit lost! I should probably study the help pages more than I did, but I haven't found a solution so far.

You say "Set root to www.example.org/", but in my case that would mean thousands of pages to scan. I only need to download the pages in my personal area, let's say www.example.org/users/spiweb/ (that part is easy), PLUS the pages on www.example.org that are directly linked from one of my pages (that part I don't understand).

Yes, I know regex, but I don't know how to apply it in my case. In Analysis Filters I can limit to or exclude URLs with a regex, but apart from my own section I don't have a specific section of the site to download, just any page on the site that is directly linked from my pages. How do I do that? :)
Thanks again!

*

Webhelpforums

Re: How to include linked pages, same site but other dir
« Reply #3 on: February 14, 2012, 11:36:28 AM »
Can you email support with some exact URL examples?
http://www.microsystools.com/home/contact.php

...

Can't you limit-include-to and/or exclude URLs using regular expressions?

Also remember that there are both "analysis" and "output" (a.k.a. download-to-disk) filters.
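
To show why the two filter types matter for the "directly linked pages only" case, here is a small Python model of a crawl. It is only a sketch of the idea, not how A1 Website Download is implemented: the link graph, the two patterns and the `crawl` function are all made up for illustration, and it assumes that a page excluded from analysis can still be saved if the output filter accepts it.

```python
import re
from collections import deque

# Toy link graph standing in for the site (hypothetical data).
LINKS = {
    "/users/spiweb/": ["/users/spiweb/list_of_pages-A", "/users/spiweb/list_of_pages-B"],
    "/users/spiweb/list_of_pages-A": ["/content/page3124", "/content/page6349"],
    "/users/spiweb/list_of_pages-B": ["/content/page1130"],
    "/content/page3124": ["/content/page0001", "/content/"],
    "/content/page1130": ["/content/page0002"],
}

# Analysis filter: only these pages are scanned for further links.
ANALYZE = re.compile(r"^/users/spiweb/")
# Output filter: only these pages are saved to disk.
OUTPUT = re.compile(r"^/(?:users/spiweb/|content/)")

def crawl(start):
    """Breadth-first crawl that applies the two filters independently."""
    seen = {start}
    queue = deque([start])
    saved = []
    while queue:
        url = queue.popleft()
        if OUTPUT.search(url):
            saved.append(url)
        if not ANALYZE.search(url):
            continue  # saved if allowed above, but its links are not followed
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return saved

if __name__ == "__main__":
    for url in crawl("/users/spiweb/"):
        print(url)
```

In this model the /content/ pages linked from the lists are saved, while pages that are only linked from other /content/ pages are never discovered, because /content/ pages are not analyzed for links.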





 



