I want to download PDFs from certain locations on a website. There is a main page that links to sub-pages, along with thousands of other links. All the PDF links I want to download are on the sub-pages.
The website is very large, with multilevel links and thousands of links at each level.
I want to optimise the download using wget so that:
- Only two levels are considered: the main page and the sub-pages.
- Only links of a specific type are picked up on the main page.
- Folders are named after the link name on the main page.
The URL patterns for the main page and the sub-pages are given below.
Main Page ->
- Page 1 (PDF Link 1 + PDF Link 2 + lots of other links)
- Page 2 (PDF Link 1 + PDF Link 2 + lots of other links)
- ... and so on
URL Patterns
- Main Page (https://foo.com/mainpage)
- Sub Page (https:// http://ift.tt/2lxY7xg)
- PDF (https:// http://ift.tt/2mbaHzl, https:// http://ift.tt/2lxYlEI)
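As far as I know, wget alone cannot name output folders after link text, so one possible approach is a small wrapper that reads the main page, keeps only the links matching the sub-page pattern, and builds one wget command per sub-page. The following is a minimal sketch, not a tested solution for this site: the regular expression `/mainpage/page\d+$`, the sample HTML, and the helper names (`LinkCollector`, `folder_name`, `wget_commands`) are all hypothetical, since the real sub-page pattern is not visible above. It also assumes the PDFs are linked directly from each sub-page, on the same host, with URLs ending in `.pdf`.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def folder_name(link_text):
    """Turn link text into a filesystem-safe folder name."""
    name = re.sub(r"[^\w.-]+", "_", link_text).strip("_")
    return name or "unnamed"


def wget_commands(main_url, main_html, subpage_pattern):
    """Build one wget command per main-page link that matches the pattern."""
    parser = LinkCollector()
    parser.feed(main_html)
    commands = []
    for href, text in parser.links:
        url = urljoin(main_url, href)
        if not re.search(subpage_pattern, url):
            continue  # skip the unrelated links on the main page
        commands.append([
            "wget",
            "-r", "-l", "1",          # recurse one level below the sub-page
            "-A", "pdf",              # keep only files ending in .pdf
            "-nd",                    # do not recreate the site's directory tree
            "-P", folder_name(text),  # save into a folder named after the link
            url,
        ])
    return commands


# Offline demo with made-up HTML; the pattern below is an assumption.
sample_html = """
<a href="/mainpage/page1">Page 1</a>
<a href="/about">About us</a>
<a href="/mainpage/page2">Page 2</a>
"""
for cmd in wget_commands("https://foo.com/mainpage", sample_html,
                         r"/mainpage/page\d+$"):
    print(" ".join(cmd))
```

To actually download, the main page's HTML would have to be fetched first (for example with `wget -q -O - <main page URL>`) and each generated command passed to something like `subprocess.run(cmd, check=True)`. Running wget once per sub-page is what keeps the crawl to two levels: the main page is only read for links, and `-l 1` stops wget from going deeper than the files linked on each sub-page.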
Thanks