Announcement

Collapse
No announcement yet.

VBA Code To Grab Multiple HTML Tables From Web Page

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • VBA Code To Grab Multiple HTML Tables From Web Page



    Hi Guys

    I obtained some VBA code from a very nice guy on another forum (For some reason I cannot access it at work anymore, but here is the link http://www.excelforum.com/excel-prog...-new-post.html)

    The code found my list of hyperlinks on Sheet1 and grabbed the table information from the respective URL and put it on Sheet2, all in line and beautiful. I then wanted to alter this code to suit another website, the Sporting Life, namely the racecards. However, the problems I'm having, being a novice at this, is how to change the code to grab more than just one bit of information.

    Here is an example, which is one of yesterdays races

    http://www.sportinglife.com/racing/racecards/08-10-2012/pontefract/racecard/528423/british-stallion-studs-supporting-british-racing-e-b-f-maiden-stakes


    The code below captures the list from Horse No.1 to Horse No.15, which is good but it is missing the Time "14:40 Pontefract" and the table below this, which is the race Title and Going etc.

    Code:
    Sub Grab_SL_Cards()
    Dim c As range
    Dim g
    With ActiveSheet
        For Each c In .range("A1", .Cells(.Rows.Count, "A").End(xlUp))
               g = GetTableSportingLife(c.Hyperlinks(1).Address)
               If Len(Sheets(2).Cells(1, 1).Value) = 0 Then Sheets(2).Cells(1, 1).Value = "."
               With Sheets(2).Cells(1, 1).CurrentRegion
                    .Offset(.Rows.Count).Resize(UBound(g), UBound(g, 2)).Value = g
               End With
        Next c
    End With
    
    End Sub
    Function GetTableSportingLife(url As String) As Variant
    Dim htm As Object, table As Object
    Dim data() As String, x As Long, y As Long
    Set htm = CreateObject("HTMLfile")
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        .send
        htm.body.innerhtml = .responsetext
    End With
    With htm
        Set table = .getelementsbytagname("table")(0)
        ReDim data(1 To table.Rows.Length, 1 To 10)
        For x = 0 To table.Rows.Length - 1
             For y = 0 To table.Rows(x).Cells.Length - 1
                data(x + 1, y + 1) = table.Rows(x).Cells(y).InnerText
             Next y
        Next x
        
        GetTableSportingLife = data
        
    End With
    End Function
    So my question is simply, what needs changing in the code to collect all the tables/information from Race Time to the bottom of the racecard?

    Regards
    Paul

  • #2
    Re: VBA Code To Grab Multiple HTML Tables From Web Page

    Can you put a link on this to your thread in MrExcel please?

    Comment


    • #3
      Re: VBA Code To Grab Multiple HTML Tables From Web Page

      Originally posted by Kyle123 View Post
      Can you put a link on this to your thread in MrExcel please?
      Apologies for any confusion Kyle, I have posted my response on MrExcel, I hope that is OK? http://www.mrexcel.com/forum/newrepl...reply&t=663852

      Comment


      • #4
        Re: VBA Code To Grab Multiple HTML Tables From Web Page

        I'll answer here since MrE seems to be on deaths door.

        The data you are after is not in a table, so specifically which headings are you wanting to return?

        Comment


        • #5
          Re: VBA Code To Grab Multiple HTML Tables From Web Page

          Originally posted by Kyle123 View Post
          I'll answer here since MrE seems to be on deaths door.

          The data you are after is not in a table, so specifically which headings are you wanting to return?
          Ha Ha, I know what you mean, I am really struggling with access to MrE and ExForum, not sure why

          No wonder i was having problems trying to adjust your code, I did nto know..oops ! As per your previous question, I have Ie9 yes. As from your code which was greatly appreciated, I adjusted it to copy the race card, that is the number of horses in the race, from No.1 to whatever it is. However above this is the most important bit which is the race meeting, race time, going, value etc, it is the table (information) directly above the list of horses in the race, does that help?

          Regards
          Paul

          Comment


          • #6
            Re: VBA Code To Grab Multiple HTML Tables From Web Page

            no, not really Again, the data isn't in a table, so you need to be specific. I need a list of everything you want from that page, so far I've got:
            Race Meeting (location) - 14.10 Pontefract
            Race Going - Good to Soft

            Comment


            • #7
              Re: VBA Code To Grab Multiple HTML Tables From Web Page

              Yes, that's the kind of information Kyle. It's all well having the list of horses running but I need the time of race, title etc to go with them

              So, you will see the information as thus:

              14:10 Pontefract

              British Stallion Studs Supporting British Racing E.B.F.
              Maiden Stakes

              (2yo, 1m 2f 6y, Class 4, 6 runners)
              Winner 4,399 2nd 1,309 3rd 654 4th 327
              Going: Good to Soft
              Surface: Turf

              If it could import it all then that would be great, followed by the list of horses etc

              Regards
              Paul

              Comment


              • #8
                Re: VBA Code To Grab Multiple HTML Tables From Web Page

                Try this:
                Code:
                Function GetTableSportingLife(url As String) As Variant    
                    Dim htm As HTMLDocument, table As Object
                    Dim data() As String, x As Long, y As Long
                    Set htm = New HTMLDocument
                    With CreateObject("MSXML2.XMLHTTP")
                        .Open "GET", url, False
                        .send
                        htm.body.innerhtml = .responsetext
                    End With
                    With htm
                        Set table = .getelementsbytagname("table")(1)
                        ReDim data(1 To table.Rows.Length + 6, 1 To 10)
                        For x = 0 To table.Rows.Length - 1
                            For y = 0 To table.Rows(x).Cells.Length - 1
                                data(x + 7, y + 1) = table.Rows(x).Cells(y).innertext
                            Next y
                        Next x
                        
                        
                        data(1, 1) = .getElementsByClassName("header-nav")(0).NextSibling.innertext
                        data(2, 1) = .getElementsByClassName("content-header")(0).Children(0).innertext
                        Set table = .getElementsByClassName("list")(0)
                        For x = 1 To table.Children.Length
                            data(x + 2, 1) = table.Children(x - 1).innertext
                        Next x
                        GetTableSportingLife = data
                         
                    End With
                End Function
                You'll need to add a reference to Microsoft HTML Object Library, I can't get it to late bind properly

                Comment


                • #9
                  Re: VBA Code To Grab Multiple HTML Tables From Web Page

                  Added the reference Kyle, but it stuck here

                  Code:
                  ReDim data(1 To table.Rows.Length + 6, 1 To 10)

                  Comment


                  • #10
                    Re: VBA Code To Grab Multiple HTML Tables From Web Page

                    saying?

                    Comment


                    • #11
                      Re: VBA Code To Grab Multiple HTML Tables From Web Page

                      Originally posted by Kyle123 View Post
                      saying?
                      Oops...

                      Run-time error '91
                      Object Variable or With block variable not set

                      Comment


                      • #12
                        Re: VBA Code To Grab Multiple HTML Tables From Web Page

                        Using the url you provided?

                        Comment


                        • #13
                          Re: VBA Code To Grab Multiple HTML Tables From Web Page

                          My apologies, actually that worked with the link I gave you. I was using one of todays, so if you look a...

                          http://www.sportinglife.com/racing/r...handicap-div-1

                          The problem lies with the "Non-runner" at the end. I am only getting that in my list on Sheet2, not the horses above

                          Any clues?

                          Comment


                          • #14
                            Re: VBA Code To Grab Multiple HTML Tables From Web Page

                            Try changing
                            Code:
                            Set table = .getelementsbytagname("table")(0)
                            to:
                            Code:
                            Set table = .getElementById("racecard")
                            and remove all the #s from your urls

                            Comment


                            • #15


                              Re: VBA Code To Grab Multiple HTML Tables From Web Page

                              PARFAIT as the French say !!! Well, it worked this time, I'm sure I'll be knocking on your door again if it falls down somewhere <teehee>

                              OK..so this is the story

                              Part 1>>>>>

                              This code gets the Source Code for all Racecards of the day

                              Code:
                              Sub HTML_Source_Code_SL_Cards()
                              Dim FileName As String
                              Dim FileNum As Long
                              Dim Sh As Worksheet
                              FileName = "C:\Temp\Source.txt"
                              FileNum = FreeFile
                              Open FileName For Output As FileNum
                              Print #FileNum, GetSource("http://www.sportinglife.com/racing/racecards")
                              Close FileNum
                              Set Sh = Worksheets.add
                              With Sh.QueryTables.add(Connection:="TEXT;C:\TEMP\Source.txt", Destination:=range("A1"))
                              .Name = "Source"
                              .AdjustColumnWidth = True
                              .TextFileParseType = xlFixedWidth
                              .TextFileTextQualifier = xlTextQualifierDoubleQuote
                              .TextFileColumnDataTypes = Array(2)
                              .Refresh BackgroundQuery:=False
                              End With
                              End Sub
                              
                              Function GetSource(sURL As String) As String
                              Dim oXHTTP As Object
                              Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
                              oXHTTP.Open "GET", sURL, False
                              oXHTTP.send
                              GetSource = oXHTTP.responsetext
                              Set oXHTTP = Nothing
                              End Function
                              Part 2>>>>> I have simple macros that find the racecard information and create the hyperlinks

                              Part 3>>>>>

                              Kyle's code

                              Code:
                              Sub Grab_SL_Cards()
                              Dim c As range
                              Dim g
                              With ActiveSheet
                              For Each c In .range("A1", .Cells(.Rows.Count, "A").End(xlUp))
                              g = GetTableSportingLife(c.Hyperlinks(1).Address)
                              If Len(Sheets(2).Cells(1, 1).Value) = 0 Then Sheets(2).Cells(1, 1).Value = "."
                              With Sheets(2).Cells(1, 1).CurrentRegion
                              .Offset(.Rows.Count).Resize(UBound(g), UBound(g, 2)).Value = g
                              End With
                              Next c
                              End With
                              
                              End Sub
                              Function GetTableSportingLife(url As String) As Variant
                              Dim htm As HTMLDocument, table As Object
                              Dim data() As String, x As Long, y As Long
                              Set htm = New HTMLDocument
                              With CreateObject("MSXML2.XMLHTTP")
                              .Open "GET", url, False
                              .send
                              htm.body.innerhtml = .responsetext
                              End With
                              With htm
                              Set table = .getElementById("racecard")
                              ReDim data(1 To table.Rows.Length + 6, 1 To 10)
                              For x = 0 To table.Rows.Length - 1
                              For y = 0 To table.Rows(x).Cells.Length - 1
                              data(x + 7, y + 1) = table.Rows(x).Cells(y).innertext
                              Next y
                              Next x
                              
                              
                              data(1, 1) = .getElementsByClassName("header-nav")(0).NextSibling.innertext
                              data(2, 1) = .getElementsByClassName("content-header")(0).Children(0).innertext
                              Set table = .getElementsByClassName("list")(0)
                              For x = 1 To table.Children.Length
                              data(x + 2, 1) = table.Children(x - 1).innertext
                              Next x
                              GetTableSportingLife = data
                              
                              End With
                              End Function
                              Kyle...thank you

                              Comment

                              Working...
                              X