I was tasked with analyzing the content of an HTML page using Java and exporting the page's data to a well-formatted PDF. I have already set up my technology stack: Jsoup for parsing the HTML and Apache PDFBox for writing the PDF.
Where I am stuck is deciding what data to export: how do I find out which parts of the parsed HTML are interesting and which are not, how should I format them, and so on? Are there any studies I can read on analyzing HTML content by priority, that is, on which data are important and which are not? The only approach I know is to prioritize by HTML tag, where one tag is ranked higher than another, and so on.
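To make the tag-priority idea concrete, here is a minimal sketch in plain Java. Everything in it is an assumption for illustration: the class `TagPriority`, the `Block` holder, the weight values, and the list of skipped tags are hypothetical and not taken from any study. It uses no Jsoup calls; with Jsoup, the tag name and text of each block would presumably come from `Element.tagName()` and `Element.text()`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TagPriority {
    // Hypothetical weights: a higher value means "more important". Tune per site.
    static final Map<String, Integer> WEIGHTS = Map.of(
            "title", 100, "h1", 90, "h2", 80, "h3", 70, "p", 50, "li", 40);

    // Tags assumed to hold navigation or boilerplate rather than content.
    static final List<String> SKIPPED =
            List.of("nav", "footer", "aside", "script", "style");

    // Weight given to any tag that is neither listed nor skipped.
    static final int DEFAULT_WEIGHT = 10;

    // One piece of page content: the tag it came from and its text.
    static final class Block {
        final String tag;
        final String text;

        Block(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }
    }

    static int weight(String tag) {
        return WEIGHTS.getOrDefault(tag.toLowerCase(Locale.ROOT), DEFAULT_WEIGHT);
    }

    // Drops skipped tags and empty text, then orders the rest by weight,
    // highest first. The sort is stable, so blocks with equal weight keep
    // their original document order.
    static List<Block> rank(List<Block> blocks) {
        List<Block> kept = new ArrayList<>();
        for (Block b : blocks) {
            boolean skipped = SKIPPED.contains(b.tag.toLowerCase(Locale.ROOT));
            if (!skipped && !b.text.trim().isEmpty()) {
                kept.add(b);
            }
        }
        kept.sort(Comparator.comparingInt((Block b) -> weight(b.tag)).reversed());
        return kept;
    }

    public static void main(String[] args) {
        List<Block> page = List.of(
                new Block("nav", "Home | About"),
                new Block("p", "First paragraph."),
                new Block("h1", "Page heading"),
                new Block("p", "Second paragraph."));
        for (Block b : rank(page)) {
            System.out.println(weight(b.tag) + " " + b.tag + ": " + b.text);
        }
    }
}
```

For the PDF itself it may be better to keep the blocks in document order and use the weight only to decide what to drop and which font size to use, since reordering can separate a heading from the text it introduces.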
Sorry for my bad English, and thanks for your help.