Download PDFs on redirect

Started by iusplus, September 13, 2015, 03:47:03 AM

iusplus

Hi. Sorry for my English. It's not so good yet.
Anyway... I've got a problem with a registered version of A1 Website Download.
When I try to download a website that contains redirect links to PDF documents, the software downloads broken and unusable PDFs.

You can test this issue by trying to connect to www.sied.it:
- click on "Eventi" on the left column;
- click on "Corso Nazionale SIED 2015" in the middle of the page;
- choose one of the PDFs at the bottom of the page, for example "Locandina".

I've got more than 1000 PDFs to download and none of them work after using A1 Website Download.
Alternatively, is there a way to tell the software to save files downloaded after a redirect under their original filenames? A1 Website Download renames them to MS_*.pdf, so I cannot even recognize them...

Webhelpforums

Hi,

Renaming occurs because a URL cannot always be saved "as is" to disk, e.g. if the URL contains "?". A1WD renames such files in a way that guarantees no filename collision occurs. See this help page for the options regarding this:
http://www.microsystools.com/products/website-download/help/website-download-convert-links/
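
To illustrate why a URL cannot always be used directly as a filename, here is a minimal Python sketch. It is not A1WD's actual code. The reserved-character list is the standard Windows one; the counter-based `safe_name` helper and the example URL are assumptions made only for illustration, loosely modeled on the MS_*.pdf names mentioned in this thread.

```python
# Characters that Windows does not allow in file names.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def unsafe_chars(url_part: str) -> list:
    """Return the reserved characters found in url_part, in order of first appearance."""
    seen = []
    for ch in url_part:
        if ch in WINDOWS_RESERVED and ch not in seen:
            seen.append(ch)
    return seen


def safe_name(index: int, extension: str) -> str:
    """Hypothetical counter-based name: unique as long as each index is used once."""
    return f"MS_{index}.{extension}"


if __name__ == "__main__":
    # Hypothetical last URL segment of a redirecting download link.
    segment = "download.cfm?id=4708"
    print(unsafe_chars(segment))    # the "?" cannot appear in a Windows filename
    print(safe_name(4708, "pdf"))
```

A counter-based name can never collide with another file, but it also throws away the original name, which is why the URL and filename columns are needed to match them up again.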

However, you should still be able to see in A1WD which file corresponds to which URL, since there are columns for both the URL and the filename on disk:
http://www.microsystools.com/products/website-download/help/website-gallery-image-downloader/

NB. I will look into the report about broken PDF files and report back :)
TechSEO360 | MicrosysTools.com  | A1 Sitemap Generator, A1 Website Analyzer etc.

Webhelpforums

This issue appears a little strange. I can see the PDF files are downloaded and one can browse through pages. However, the actual content is missing.

However, since Adobe Reader can open the PDF files without error, it would seem the PDF files are complete.

This is the first time I have ever encountered something like this, so I am not sure what to make of it yet. I will report back when I have news.

iusplus

Hi, my friend.
First of all, thank you for your support.
I don't know if it will be useful, but here you can find two versions of the same file:

http://archive.areaqualita.com/test/Consenso_Informato2014_5b.pdf
(the original document)

and

http://archive.areaqualita.com/test/MS_4708.pdf
(the same document downloaded using A1WD)

It's a really strange issue because the first file is smaller (1.082.235 bytes) and the second one is bigger (1.410.612 bytes) but has no content. When I compare them in a hex editor, the file downloaded using A1WD does not seem to contain the expected NULL bytes and special symbols.

Maybe it is a charset problem that occurs when files are downloaded after a redirect? Is it possible that the charset header of the .cfm response page causes the data blocks that follow to be misinterpreted? That could explain the bigger file size. Or maybe it's just speculation.
Bye from Italy.
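
The charset hypothesis above can be checked with a small Python sketch. It assumes, purely for illustration, that the binary data is decoded as Latin-1 text and then written back out as UTF-8; the thread does not confirm that A1WD does this, and a different charset pair would give different numbers. Under that assumption every byte from 0x80 upward grows from one byte to two, so the file gets bigger while the ASCII parts (such as the PDF header) stay intact.

```python
def mangle(data: bytes) -> bytes:
    """Simulate the assumed bug: treat binary data as Latin-1 text, re-encode as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def predicted_size(data: bytes) -> int:
    """Size after mangling: each byte >= 0x80 becomes two bytes in UTF-8."""
    return len(data) + sum(1 for b in data if b >= 0x80)


if __name__ == "__main__":
    sample = bytes(range(256))          # every possible byte value once
    print(len(sample), len(mangle(sample)), predicted_size(sample))

    # To test the hypothesis on the real files, compare predicted_size() of the
    # original PDF with the actual size of the A1WD copy, e.g.:
    #   with open("Consenso_Informato2014_5b.pdf", "rb") as f:
    #       print(predicted_size(f.read()))
```

If the predicted size of the original file is close to 1.410.612 bytes, the hypothesis is plausible. The observed growth of about 30% would fit a file in which roughly 30% of the bytes are 0x80 or higher, but that is only a consistency check, not proof.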

iusplus

Hi! Is there any news about this issue?
Thanks in advance.
Have a good day!
