I'm tasked with iterating over all links and sub-links of a given web portal. In most cases, when the web pages are not too complex or big, I don't have any problems. The problem starts when I check the links of a really complex site such as tutorialspoint, and my program crashes. I can't find the performance issue in the code I attached, so can someone experienced tell me where the possible threat is, i.e. what makes it crash?
uniqueLinks is a HashSet, chosen for fast contains() lookups.
private void recursiveLinkSearch(String webPage) {
    /* ignore pdf */
    try {
        logger.info(webPage);
        uniqueLinks.add(webPage);
        Document doc = Jsoup.connect(webPage).get();
        doc.select("a").forEach(record -> {
            String url = record.absUrl("href");
            if (!uniqueLinks.contains(url)) {
                /* stay on the portal's domain: do not recurse into links from other domains */
                if (url.contains(getWebPortalDomain())) {
                    recursiveLinkSearch(url);
                }
            }
        });
    } catch (IOException e) {
        e.printStackTrace();
    }
}
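For context, the usual way to avoid unbounded recursion depth on large sites is to replace the call stack with an explicit work queue (breadth-first traversal). Below is a minimal, self-contained sketch of that idea; the SITE map is a hypothetical in-memory stand-in for Jsoup.connect(page).get() plus doc.select("a"), and the start URL and domain string are made-up illustration values, not anything from the question's code.

```java
import java.util.*;

public class CrawlSketch {
    // Hypothetical in-memory "site": each page maps to the links it contains.
    // This stands in for fetching and parsing a real page with Jsoup.
    static final Map<String, List<String>> SITE = Map.of(
        "https://example.com/",  List.of("https://example.com/a", "https://example.com/b"),
        "https://example.com/a", List.of("https://example.com/b", "https://other.org/x"),
        "https://example.com/b", List.of("https://example.com/")   // cycle back to the root
    );

    // Iterative breadth-first crawl: an explicit queue replaces the call
    // stack, so depth stays bounded no matter how many links the portal has.
    static Set<String> crawl(String start, String domain) {
        Set<String> uniqueLinks = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        uniqueLinks.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String page = queue.poll();
            for (String url : SITE.getOrDefault(page, List.of())) {
                // HashSet.add() returns false when the URL was already seen,
                // so one call does the contains() check and the insert together
                if (url.contains(domain) && uniqueLinks.add(url)) {
                    queue.add(url);
                }
            }
        }
        return uniqueLinks;
    }

    public static void main(String[] args) {
        System.out.println(crawl("https://example.com/", "example.com"));
    }
}
```

Note the cycle in the sketch (b links back to the root): the set check is what stops the loop, exactly as in the recursive version, but here revisits cost a queue lookup instead of another stack frame.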