Sunday, 27 March 2016

Open source Java web crawler?

I'm looking for an open-source Java web crawler that can crawl a web page (and its child pages) looking for a specific HTML tag and CSS class (e.g. an h2 tag with class "test"). For each match, I want to obtain the text inside the h2 tag and the URL of the page it was found on.

Any ideas on existing tools that could be used for this purpose?
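
To make the requirement concrete, here is a minimal sketch of the behaviour described above, using only the JDK. Everything named in it is hypothetical and chosen for illustration: the class `HeadingCrawler`, the `Hit` holder, the `fetch` function (which maps a URL to its HTML, or null if the page is unavailable) and the `maxPages` limit. It assumes "child pages" means pages on the same host as the start URL, and the regular expressions are a simplification that only handles double-quoted attributes in reasonably well-formed markup.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeadingCrawler {

    /** Hypothetical result holder: the page URL and the heading text found on it. */
    public static final class Hit {
        public final String url;
        public final String text;
        Hit(String url, String text) { this.url = url; this.text = text; }
    }

    // Simplified patterns: group 1 is the class attribute or href, group 2 the h2 body.
    private static final Pattern H2_WITH_CLASS = Pattern.compile(
            "<h2\\b[^>]*\\bclass\\s*=\\s*\"([^\"]*)\"[^>]*>(.*?)</h2>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern LINK = Pattern.compile(
            "<a\\b[^>]*\\bhref\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Breadth-first crawl from startUrl, staying on the same host (an assumption). */
    public static List<Hit> crawl(String startUrl, Function<String, String> fetch, int maxPages) {
        List<Hit> hits = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        String host = URI.create(startUrl).getHost();
        queue.add(startUrl);
        seen.add(startUrl);
        int fetched = 0;
        while (!queue.isEmpty() && fetched < maxPages) {
            String url = queue.poll();
            String html = fetch.apply(url);
            fetched++;
            if (html == null) {
                continue; // page could not be retrieved
            }
            // Collect the text of every <h2> whose class list contains "test".
            Matcher h = H2_WITH_CLASS.matcher(html);
            while (h.find()) {
                List<String> classes = Arrays.asList(h.group(1).trim().split("\\s+"));
                if (classes.contains("test")) {
                    String text = h.group(2).replaceAll("<[^>]+>", "").trim();
                    hits.add(new Hit(url, text));
                }
            }
            // Queue links that resolve to the same host and have not been seen yet.
            Matcher a = LINK.matcher(html);
            while (a.find()) {
                try {
                    URI next = URI.create(url).resolve(a.group(1));
                    if (Objects.equals(host, next.getHost()) && seen.add(next.toString())) {
                        queue.add(next.toString());
                    }
                } catch (IllegalArgumentException e) {
                    // Skip hrefs that are not valid URIs.
                }
            }
        }
        return hits;
    }
}
```

In this sketch `fetch` can be backed by an in-memory map for testing or by an HTTP client for real pages. A proper HTML parser would be more robust than regular expressions, so this is only meant to show the output I'm after: a list of (URL, heading text) pairs.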



