Thursday, 23 June 2016

(Web crawler) How to get the text of news articles from a news website

I want to extract the article text from a news website; I need the content of around 1,000 pages.

The link is below: http://ift.tt/28Qtkl7

The website publishes each new article under a new URL, formed by adding 1 to the id:

readnews.php?id=16727

so the next URL will be

readnews.php?id=16728
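
Since the id simply increases by 1, the list of pages to visit can be generated up front. The following is a minimal sketch in Java; the base URL `http://example.com/` and the names `NewsUrls` and `build` are placeholders for illustration, because the real site address sits behind the shortened link above.

```java
import java.util.ArrayList;
import java.util.List;

final class NewsUrls {
    // Hypothetical base URL; replace with the real site address.
    static final String BASE = "http://example.com/readnews.php?id=";

    /** Returns one URL per id in the inclusive range [firstId, lastId]. */
    static List<String> build(String base, int firstId, int lastId) {
        List<String> urls = new ArrayList<>();
        for (int id = firstId; id <= lastId; id++) {
            urls.add(base + id);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Example range; any pair of ids with firstId <= lastId works.
        List<String> urls = build(BASE, 16000, 17000);
        System.out.println(urls.size() + " URLs, first: " + urls.get(0));
    }
}
```

Note that an inclusive range from 16000 to 17000 contains 1001 ids, and some ids may not correspond to an existing article.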

My question: I would like to scrape the text of the articles with ids 16000 to 17000.

How can I implement this in Java?

Should I use Jsoup, or another web crawler?

Thanks.
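
One possible approach, sketched here with only the JDK's built-in `java.net.http.HttpClient` (Java 11+), is to loop over the ids, download each page, and pause between requests. The base URL, the `User-Agent` string, the 500 ms delay, and the names `NewsFetcher`, `buildRequest`, and `fetchHtml` are all assumptions for illustration. The sketch only downloads raw HTML; extracting the article text would still need an HTML parser, for example Jsoup's `Jsoup.parse(html).text()`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class NewsFetcher {
    /** Builds a GET request with a timeout and an identifying User-Agent. */
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("User-Agent", "news-crawler-sketch") // assumed value
                .GET()
                .build();
    }

    /** Returns the page HTML, or null when the server does not answer 200. */
    static String fetchHtml(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 ? response.body() : null;
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Hypothetical base URL; replace with the real site address.
        String base = "http://example.com/readnews.php?id=";
        for (int id = 16000; id <= 17000; id++) {
            String html = fetchHtml(client, base + id);
            if (html != null) {
                System.out.println(id + ": " + html.length() + " chars");
            }
            Thread.sleep(500); // be polite: pause between requests
        }
    }
}
```

Ids that return a status other than 200 are skipped, which covers missing or deleted articles. It is also worth checking the site's terms and robots.txt before crawling it.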



