Wednesday, 3 October 2018

Convert simple Java web scraper to VB.NET [on hold]

I'm creating a very basic web scraper that will total up the revenue of all my customers in my company's web-based CRM.

I'm using VB.NET.

Currently I log in to the CRM in a WebBrowser control, navigate to a customer, and download the page source once the page has loaded.

The problem is extracting the "revenue" value from the source: it is just plain text in the page, not the value of a text field.

I have a friend who writes in Java and he was able to create a script which will read the page source from a text file and output a new file called "Revenue.txt" which will contain just the revenue number I'm after.

It works great but it's too many steps.

I'm logging in, navigating to the customer's page, downloading the webpage source to a text file, executing his script, waiting for the output and then finally reading the output from a text file.

My question: can anyone take a look at his script and suggest how to do the exact same thing in VB.NET?

Thank you for your time, I really appreciate it.

' Java: package main (a package declaration has no direct VB.NET equivalent)
Imports java.io.BufferedWriter
Imports java.io.File
Imports java.io.FileOutputStream
Imports java.io.IOException
Imports java.io.OutputStreamWriter
Imports java.io.Writer
Imports org.jsoup.Jsoup
Imports org.jsoup.nodes.Document
Imports org.jsoup.nodes.Element
Imports org.jsoup.select.Elements
Public Class main

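    ' Input: the saved page source. The backslash is a literal path separator, not an escape.
    Public Shared FILE As String = (System.getProperty("user.dir") & "\test.txt")

    ' Output: the file that receives the extracted revenue figure.
    Public Shared FILE_PATH As String = (System.getProperty("user.dir") & "\revenue.txt")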

    Public Shared revenue As String

    Public Shared writer As Writer

    Public Shared Sub main(ByVal args() As String)
        ' Extract the revenue figure, then write it to FILE_PATH if one was found.
        Dim revenue As String = main.getWebContent
        If revenue IsNot Nothing Then
            Try
                Dim fos As FileOutputStream = New FileOutputStream(New File(FILE_PATH))
                writer = New BufferedWriter(New OutputStreamWriter(fos, "utf-8"))
                writer.write(revenue)
                writer.close
            Catch ex As IOException
                ' Write failures are not reported by the script.
            End Try
        End If

    End Sub

    Public Shared Function getWebContent() As String
        Try
            ' Parse the saved page source; the URL is only the base for resolving relative links.
            Dim input As File = New File(FILE)
            Dim doc As Document = Jsoup.parse(input, "UTF-8", "http://app.companynameremoved.com/")
            ' Select the candidate cells inside the company overview panel.
            Dim cells As Elements = doc.select(".company-overview .col-md-6 .div-toggled .col-md-6")
            If Not cells.isEmpty Then
                Dim counter As Integer = 0
                For Each line As Element In cells
                    ' The revenue figure is the ninth match (zero-based index 8).
                    If (counter = 8) Then
                        revenue = line.text
                    End If

                    counter = (counter + 1)
                Next
            End If

        Catch e As IOException
            ' The input file could not be read.
            e.printStackTrace
        Catch e As IndexOutOfBoundsException
            e.printStackTrace
        Catch e As NullPointerException
            e.printStackTrace
        End Try

        ' Nothing if the selector matched fewer than nine elements.
        Return revenue
    End Function
End Class
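
For what it's worth, the two text files do not seem essential to the extraction itself, since jsoup can also parse a string directly with `Jsoup.parse(String)`. Below is a minimal Java sketch of the same selection step, assuming the page source is already held in a string. The class name `RevenueFromString`, the `samplePage` helper and its HTML are hypothetical stand-ins for the real CRM page; only the selector and the index 8 come from the script above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class RevenueFromString {

    // Selector and index taken from the script above.
    static final String SELECTOR =
            ".company-overview .col-md-6 .div-toggled .col-md-6";
    static final int REVENUE_INDEX = 8;

    // Returns the text of the ninth match, or null if there are fewer matches.
    static String extractRevenue(String html) {
        Document doc = Jsoup.parse(html);
        Elements cells = doc.select(SELECTOR);
        return cells.size() > REVENUE_INDEX ? cells.get(REVENUE_INDEX).text() : null;
    }

    // Hypothetical page fragment: ten cells nested the way the selector expects.
    static String samplePage() {
        StringBuilder html = new StringBuilder(
                "<div class=\"company-overview\"><div class=\"col-md-6\">"
                + "<div class=\"div-toggled\">");
        for (int i = 0; i < 10; i++) {
            html.append("<div class=\"col-md-6\">value ").append(i).append("</div>");
        }
        html.append("</div></div></div>");
        return html.toString();
    }

    public static void main(String[] args) {
        // Prints the text of the cell at index 8 of the sample fragment.
        System.out.println(extractRevenue(samplePage()));
    }
}
```

If something like this holds for the real page, the VB.NET side would only need an HTML parser that accepts a string, and the steps of saving the source, running the script and reading the output file would fall away.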
