Friday, 28 April 2017

Analyze HTML Content in Java

I was tasked with analyzing the content of an HTML page using Java and exporting the web page's data into a well-formatted PDF. I have already set up my technology stack: Jsoup for parsing the HTML and Apache PDFBox for exporting the page to PDF.

Where I am stuck is deciding what data to export: how do I find which parts of the parsed HTML are interesting and which are not, how should I format them well, and so on? Are there any studies I can read about analyzing HTML content by priority, that is, which data is important and which is not? The only approach I know is to prioritize by HTML tag, where one tag is ranked above another, and so on.
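To make the tag-based idea concrete, here is a minimal sketch of what I mean. Everything in it is an assumption for illustration: the class name `TagPriority`, the `Block` holder, and the weights themselves are hypothetical, not taken from any study. It uses only the standard library; in my real setup the tag name and text of each block would come from the elements Jsoup parses (for example via `Element.tagName()` and `Element.text()`).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TagPriority {
    // Hypothetical weights: a higher number means the content is exported first.
    static final Map<String, Integer> WEIGHTS =
            Map.of("h1", 100, "h2", 80, "h3", 60, "p", 40, "li", 30);

    // Weight used for any tag that is not listed above (also an assumption).
    static final int DEFAULT_WEIGHT = 10;

    // One piece of page content: the tag it came from and its text.
    static final class Block {
        final String tag;
        final String text;

        Block(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }
    }

    // Look up the weight of a tag name, ignoring case.
    static int weightOf(String tag) {
        return WEIGHTS.getOrDefault(tag.toLowerCase(Locale.ROOT), DEFAULT_WEIGHT);
    }

    // Return a new list ordered from highest to lowest weight.
    // List.sort is stable, so blocks with equal weight keep their document order.
    static List<Block> rank(List<Block> blocks) {
        List<Block> sorted = new ArrayList<>(blocks);
        sorted.sort(Comparator.comparingInt((Block b) -> weightOf(b.tag)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
                new Block("p", "Body text"),
                new Block("h1", "Page title"),
                new Block("span", "Footer note"),
                new Block("h2", "Section heading"));

        // Print each block with its weight, most important first.
        for (Block b : rank(blocks)) {
            System.out.println(weightOf(b.tag) + " " + b.tag + ": " + b.text);
        }
    }
}
```

A fixed weight table like this is the simplest version of the idea; it says nothing about whether the text inside a tag is actually relevant, which is the part I am unsure how to judge.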

Sorry for my bad English, and thanks for your help.



