Friday, 30 October 2015

Does lynx -crawl -traversal still work?

Ten years ago, Lynx could download a web page, along with the pages on the same site that it linked to, using a command like this:

lynx -crawl -realm cscie12.dce.harvard.edu -traversal http://ift.tt/1is70CP

This doesn't work on either my 32-bit or my 64-bit Cygwin system, which report these versions, respectively:

$ lynx --version
Lynx Version 2.8.7rel.1 (05 Jul 2009)
libwww-FM 2.14, SSL-MM 1.4.1, OpenSSL 1.0.2d, ncurses 6.0.20151017(wide)
Built on cygwin May  8 2012 12:21:50

$ lynx --version
Lynx Version 2.8.7rel.1 (05 Jul 2009)
libwww-FM 2.14, SSL-MM 1.4.1, OpenSSL 1.0.2d, ncurses 6.0.20151017(wide)
Built on cygwin Apr 10 2013 12:32:36

I can't find any evidence that -crawl has worked in the past several years. When I try it now, I get errors like these:

$ lynx -traversal -crawl http://ift.tt/1is70CP
Unable to open traversal file.: No such file or directory

$ lynx -crawl -traversal http://ift.tt/1is71Xj
lynx: Start file could not be found or is not text/html or text/plain
      Exiting...

$ lynx -traversal -crawl -realm http://ift.tt/1is70CR -startfile_ok "http://ift.tt/1is71Xj"
lynx: Start file could not be found or is not text/html or text/plain
      Exiting...

I don't know whether lynx is supposed to download the .mp3 files that I want, but I can't get it to download any web page at all, not even the start page. Does it simply not work on Cygwin? Is -crawl no longer supported? Or is the 2009 version of lynx confused by changes in its libraries or in web protocols?

Is there some other simple tool for the common case of downloading a podcast's archives, where a single index page links to a separate .html page for each episode, and each of those pages links to one .mp3 file?
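
To make that layout concrete, here is a minimal sketch in Python (standard library only) of what such a tool would have to do. It is not a lynx replacement, and every name in it (LinkCollector, find_mp3_urls, fetch_text) is hypothetical. It assumes the index page and the episode pages are plain HTML on the same host, that the links are ordinary <a href> tags, and that the pages decode as UTF-8.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs (fragments removed) for all links in html."""
    parser = LinkCollector()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.hrefs]


def fetch_text(url):
    """Download a page as text. Assumption: the page is UTF-8."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def is_mp3(url):
    """True if the URL path ends in .mp3 (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".mp3")


def find_mp3_urls(index_url, fetch=fetch_text):
    """Follow same-host links one level below the index page and
    return the .mp3 URLs found, in order, without duplicates."""
    host = urlparse(index_url).netloc
    seen_pages = {index_url}
    found = []

    def remember(url):
        if url not in found:
            found.append(url)

    for link in extract_links(fetch(index_url), index_url):
        if is_mp3(link):
            remember(link)  # .mp3 linked directly from the index
        elif urlparse(link).netloc == host and link not in seen_pages:
            seen_pages.add(link)
            for sub in extract_links(fetch(link), link):
                if is_mp3(sub):
                    remember(sub)
    return found
```

Calling find_mp3_urls with the index page's URL returns a list of .mp3 URLs, each of which could then be saved with urllib.request.urlretrieve. The fetch parameter exists only so that the link-following logic can be tried against canned pages without touching the network.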



