I am tasked with iterating over all pages of a web portal and indexing them hierarchically:
- top level: 1.1
- under top level: 1.1.1
- etc.: 1.1.2
- etc.: 1.2.1
My problem is that my code never generates index 1.2.1; it starts at 1.2.2 instead. Likewise, it never generates 1.3.1 or 1.3.2 and starts at 1.3.3, and so on. I know where the problem is, but I am out of ideas for how to solve it. My code is below. Thanks, guys.
private void recursiveLinkSearch(String webPage, int actualRecursionDepth, int numberlink, String previousnumberLink) {
    try {
        Document doc = Jsoup.connect(webPage).get();
        uniqueLinks.add(webPage);
        logger.info(webPage);
        // Index of this page: parent prefix + this page's number, e.g. "1." + 2 -> "1.2"
        pageIndexes.put(webPage, previousnumberLink.concat(String.valueOf(numberlink)));
        // Prefix handed down to the pages linked from this one, e.g. "1.2."
        String actualNumberLink = previousnumberLink.concat(String.valueOf(numberlink)).concat(".");
        if (getRecursionMode().equals(WebPortalMode.FULL) || actualRecursionDepth < getRecursionMode().getRecursionDepth()) {
            for (Element record : doc.select("a")) {
                String url = record.absUrl("href");
                // Check that the href does not point to an element on the same page
                url = avoidBookMarkedLinks(url);
                if (!uniqueLinks.contains(url)) {
                    // Do not recurse into links that belong to another domain
                    if (url.contains(getWebPortalDomain())) {
                        recursiveLinkSearch(url, actualRecursionDepth + 1, numberlink, actualNumberLink);
                        numberlink++;
                    }
                }
            }
        }
    } catch (Exception e) { } // exceptions are silently ignored here
}
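One likely cause of the skipped numbers is that `numberlink` does two jobs: it is this page's own number, and it is also the counter used to number the links found on this page. A page numbered 2 therefore numbers its first child 2, which gives 1.2.2 instead of 1.2.1. A minimal sketch of one possible fix is to start a fresh counter at 1 for every page. To stay runnable without network access, the sketch below replaces the Jsoup fetch with a hypothetical in-memory map of page to links; the class name `HierarchyIndexer`, the `links` map, and the choice to number the root page "1" are assumptions for illustration, not part of the original code.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch: the portal is modelled as a map of
// page -> links found on that page, in place of Jsoup.connect(...).get().
class HierarchyIndexer {
    private final Map<String, List<String>> links;
    private final Set<String> uniqueLinks = new HashSet<>();
    private final Map<String, String> pageIndexes = new LinkedHashMap<>();

    HierarchyIndexer(Map<String, List<String>> links) {
        this.links = links;
    }

    // Indexes everything reachable from rootPage; the root is assumed to be "1".
    Map<String, String> index(String rootPage) {
        visit(rootPage, 1, "");
        return pageIndexes;
    }

    private void visit(String page, int numberLink, String previousNumberLink) {
        uniqueLinks.add(page);
        String ownIndex = previousNumberLink + numberLink; // e.g. "1." + 2 -> "1.2"
        pageIndexes.put(page, ownIndex);
        String childPrefix = ownIndex + ".";               // e.g. "1.2."

        // Fresh counter for this page's children, independent of numberLink.
        int childNumber = 1;
        for (String url : links.getOrDefault(page, List.of())) {
            if (!uniqueLinks.contains(url)) {
                visit(url, childNumber, childPrefix);
                childNumber++; // advance only when a link was actually indexed
            }
        }
    }
}
```

Applied to the original method, the same idea would mean declaring a local `int childNumber = 1;` before the `for` loop and passing `childNumber` (not `numberlink`) to the recursive call, then incrementing `childNumber` afterwards. Because the traversal is depth-first, a page linked from several places still gets the index of the first path that reaches it.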