I am working on a small multi-threaded web crawler in Java (by no means a serious effort), and I noticed that some code inside a synchronized method is not working as expected:
synchronized void dumpURL(String url) throws IOException {
    active_urls.add(url);
    out.write(url + "\n");
    links_generated++;
}
The code runs without errors, but when I open the file that out writes to, it is empty. When I replaced the out.write() call with System.out.println(url), a series of URLs was printed correctly. In addition, despite all the calls to active_urls.add(), this while loop:
while (links_generated < max_links_generated && active_urls.size() > 0) {
    CrawlTask task = new CrawlTask(active_urls.poll(), this);
    taskManager.execute(task);
}
does not seem to be affected by the growth of active_urls. Instead, it behaves as if the size of active_urls were only ever decreased by poll(); a debugger breakpoint confirms this. All the variables involved are declared volatile.
Is there something in the Java threads implementation I am not considering?
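For reference, here is a minimal sketch of how dumpURL could look under two assumptions that the post does not confirm: that out is a buffered writer which is never flushed or closed (which would explain the empty file), and that active_urls is a plain, non-thread-safe collection. Note that volatile only makes the reference itself visible across threads; it does not make links_generated++ atomic or the collection safe for concurrent use. The class name DumpUrlSketch, the temp file, and the accessor methods are hypothetical and exist only for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the crawler class; not the original code.
public class DumpUrlSketch {
    // Thread-safe queue instead of a volatile reference to a plain collection.
    private final Queue<String> activeUrls = new ConcurrentLinkedQueue<>();
    // Atomic counter: safe to read from the loop without holding the lock.
    private final AtomicInteger linksGenerated = new AtomicInteger();
    // Assumption: the original "out" is some buffered Writer like this one.
    private final BufferedWriter out;

    public DumpUrlSketch(Path file) throws IOException {
        this.out = Files.newBufferedWriter(file);
    }

    public synchronized void dumpURL(String url) throws IOException {
        activeUrls.add(url);
        out.write(url);
        out.newLine();
        out.flush(); // without flush() or close(), text may stay in the buffer
        linksGenerated.incrementAndGet();
    }

    public String nextUrl() { return activeUrls.poll(); }
    public int pending() { return activeUrls.size(); }
    public int linksGenerated() { return linksGenerated.get(); }

    public synchronized void close() throws IOException { out.close(); }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("urls", ".txt");
        DumpUrlSketch sketch = new DumpUrlSketch(file);
        sketch.dumpURL("http://example.com/a");
        sketch.dumpURL("http://example.com/b");
        // The file is readable before close() because each write was flushed.
        System.out.println(Files.readAllLines(file).size()); // prints 2
        System.out.println(sketch.pending());                // prints 2
        sketch.close();
    }
}
```

Flushing on every call is simple but slow; flushing periodically or closing the writer once all tasks have finished would also work. Separately, if the while loop drains the queue before any CrawlTask has added new URLs, it will exit early regardless of how the collection is declared, so that may be worth checking as well.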