I'm creating a super basic web scraper which will total up the revenue of all my customers in my company's web-based CRM.
I'm using VB.NET.
Currently I log in to the CRM in a WebBrowser control, navigate to a customer, and download the page source once the page has loaded.
The issue is getting the "revenue" value out of the source: it's just plain text on the page, not in a text field.
I have a friend who writes Java, and he created a script that reads the page source from a text file and writes a new file called "Revenue.txt" containing just the revenue number I'm after.
It works great but it's too many steps.
I'm logging in, navigating to the customer's page, downloading the webpage source to a text file, executing his script, waiting for the output and then finally reading the output from a text file.
My question is: can anyone take a look at his script and suggest how to do the exact same thing in VB.NET?
Thank you for your time, I really appreciate it.
    package main;

    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.Writer;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class main {
        // Input file: the saved page source, read from the working directory.
        // The backslash is escaped; unescaped, "\t" and "\r" become a tab and a carriage return.
        public static String FILE = System.getProperty("user.dir") + "\\test.txt";
        // Output file: receives only the extracted revenue text.
        public static String FILE_PATH = System.getProperty("user.dir") + "\\revenue.txt";
        public static String revenue;
        public static Writer writer;

        public static void main(String[] args) {
            String revenue = getWebContent();
            // Write the output file only when a value was found.
            if (revenue != null) {
                try {
                    FileOutputStream fos = new FileOutputStream(new File(FILE_PATH));
                    writer = new BufferedWriter(new OutputStreamWriter(fos, "utf-8"));
                    writer.write(revenue);
                    writer.close();
                } catch (IOException ex) {
                    // Report
                    ex.printStackTrace();
                }
            }
        }

        public static String getWebContent() {
            try {
                File input = new File(FILE);
                // Parse the saved HTML; the base URI is only used to resolve relative links.
                Document doc = Jsoup.parse(input, "UTF-8", "http://app.companynameremoved.com/");
                // Every element matching this CSS selector, in document order.
                Elements newsHeadlines = doc.select(".company-overview .col-md-6 .div-toggled .col-md-6");
                if (!newsHeadlines.isEmpty()) {
                    int counter = 0;
                    for (Element line : newsHeadlines) {
                        // The revenue figure is the ninth match (zero-based index 8).
                        if (counter == 8) {
                            revenue = line.text();
                        }
                        counter++;
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            } catch (IndexOutOfBoundsException e) {
                e.printStackTrace();
            } catch (NullPointerException e) {
                e.printStackTrace();
            }
            // Stays null when the file could not be read or fewer than nine elements matched.
            return revenue;
        }
    }
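As far as I can tell, the loop in getWebContent does nothing more than pick the ninth element that matches the selector (zero-based index 8) and leave the result empty when there are fewer matches. Here is a minimal sketch of just that step in plain Java, without Jsoup. The helper name pickNth and the sample values are made up for illustration, and it assumes the text of each matched element has already been collected into a list:

```java
import java.util.Arrays;
import java.util.List;

class PickNth {
    // Hypothetical helper: returns the item at the given zero-based index,
    // or null when the list does not have that many items.
    static String pickNth(List<String> items, int index) {
        if (index < 0 || index >= items.size()) {
            return null;
        }
        return items.get(index);
    }

    public static void main(String[] args) {
        // Made-up values standing in for the text of each matched element.
        List<String> texts = Arrays.asList(
                "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9");

        // Ninth item, as in the counter == 8 check of the script.
        System.out.println(pickNth(texts, 8));

        // Fewer than nine items: nothing to pick, so the result is null.
        System.out.println(pickNth(texts.subList(0, 3), 8));
    }
}
```

If that reading is right, the VB.NET version should only need an HTML parser that supports the same kind of selector, followed by an index lookup with a bounds check.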