I have been trying to scrape data from the soccerstats website, specifically the results for the football team Arsenal (http://ift.tt/2r10bAd).
(There are several tables on the webpage; I am after the data in the biggest table.)
My current code scrapes the innerText of every td tag on the page, which comes out as a mess:
'requires references to "Microsoft Internet Controls" and "Microsoft HTML Object Library"
'start a new subroutine called soccer_stats
Sub soccer_stats()
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim aEle As HTMLLinkElement 'declared for link elements (not used below)
    Dim y As Integer 'integer variable we'll use as a row counter
    Dim result As String 'string variable that will hold the text of each cell
    Dim Variable1 As String 'the search term typed by the user
    Variable1 = InputBox("put in what you are searching")
    'initiating a new instance of Internet Explorer and assigning it to objIE
    Set objIE = New InternetExplorer
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
    'navigate IE to the search page
    objIE.navigate "http://ift.tt/WwXLSl"
    'wait here while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    Dim ele As Object
    'type the search term into the search box
    For Each ele In objIE.document.getElementsByTagName("input")
        If ele.Name = "searchstring" Then
            ele.Value = Variable1
        End If
    Next ele
    'click the submit button
    For Each ele In objIE.document.getElementsByTagName("input")
        If ele.className = "submit" Then
            ele.Click
        End If
    Next ele
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    'click the link whose text matches the search term
    For Each ele In objIE.document.getElementsByTagName("a")
        If ele.innerText = Variable1 Then
            ele.Click
            Exit For 'the page is about to change, so stop looping
        End If
    Next ele
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    'new bit
    y = 2
    For Each ele In objIE.document.getElementsByTagName("td")
        '...get the innertext and print it to the sheet in col A, row y
        result = ele.innerText
        Sheets("Sheet2").Range("A" & y).Value = result
        y = y + 1
    Next ele
End Sub
Is there a way to paste the inner text into columns A, B, C and D depending on whether it meets conditions i, ii, iii or iv?
The first column of the table has this HTML: <td height="18" align="right"> 14 Aug</td>
So could I change my code to something like: For Each ele In objIE.document.getElementsByTagName("td") AND height="18"?
For the next column in the table the HTML has no height attribute, so could I change it to: For Each ele In objIE.document.getElementsByTagName("td") AND height=null?
Or is there a better way to scrape the entire table? Thank you for your help.
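For Each cannot take an extra AND condition, but an If test inside the loop has the same effect. The following is only a minimal sketch of that idea, not a tested solution for this page: the name WriteCellsByHeight is made up for illustration, and it assumes that every td with height="18" is the date cell that starts a new results row.

```vba
'Hypothetical helper: starts a new sheet row at every <td height="18">
'and writes the following cells into the next columns of that row.
'Assumes doc is the document of a fully loaded page (e.g. objIE.document).
Sub WriteCellsByHeight(ByVal doc As Object, ByVal ws As Worksheet)
    Dim cell As Object
    Dim y As Long, col As Long
    y = 1 'row 1 is left free for headers
    For Each cell In doc.getElementsByTagName("td")
        'getAttribute may return Null when the attribute is missing;
        'appending "" turns that into an empty string
        If (cell.getAttribute("height") & "") = "18" Then
            y = y + 1 'date cell: move to the next sheet row
            col = 1
        Else
            col = col + 1 'any other cell: next column of the same row
        End If
        'skip everything that comes before the first date cell
        If y > 1 Then ws.Cells(y, col).Value = Trim$(cell.innerText)
    Next cell
End Sub
```

It could be called as WriteCellsByHeight objIE.document, Sheets("Sheet2") once the page has loaded. Cells from other tables that follow the last date cell would still be appended to the last row, so this filter alone is probably not enough.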
Edit:
The HTML for each column in the webpage is as follows.
Date column:
<td height="18" align="right"> 14 Aug</td>
Home team column:
<td align="right"><b>Arsenal</b></td>
Score column:
<td width="45" align="center">
<a class="tooltip2" href="#">
<font color="#0000aa">
<b>3 – 4</b>
Away team column:
<td align="left">
Liverpool
</td>
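Since each match is one table row with the date, home team, score and away team in consecutive cells, one option is to walk the table row by row rather than td by td. This is a rough sketch under assumptions: WriteLargestTable is a hypothetical name, and "the biggest table" is taken to mean the table with the most rows, which may not hold if the page nests tables inside layout tables.

```vba
'Hypothetical helper: copies the table with the most rows to a worksheet,
'one HTML row per sheet row and one cell per column.
'Assumes doc is the document of a fully loaded page (e.g. objIE.document).
Sub WriteLargestTable(ByVal doc As Object, ByVal ws As Worksheet)
    Dim tbl As Object, biggest As Object
    Dim r As Long, c As Long

    'find the table with the most rows (assumed to be the results table)
    For Each tbl In doc.getElementsByTagName("table")
        If biggest Is Nothing Then
            Set biggest = tbl
        ElseIf tbl.Rows.Length > biggest.Rows.Length Then
            Set biggest = tbl
        End If
    Next tbl
    If biggest Is Nothing Then Exit Sub 'no table on the page

    'HTML collections are zero-based; sheet output starts at row 2, column A
    For r = 0 To biggest.Rows.Length - 1
        For c = 0 To biggest.Rows(r).Cells.Length - 1
            ws.Cells(r + 2, c + 1).Value = Trim$(biggest.Rows(r).Cells(c).innerText)
        Next c
    Next r
End Sub
```

It could be called as WriteLargestTable objIE.document, Sheets("Sheet2") after the last wait loop. Excel may convert text such as "14 Aug" into a date, so formatting the target columns as text beforehand may be worthwhile.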