Announcement

Collapse
No announcement yet.

Scraping data from a website using VBA

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Scraping data from a website using VBA



    Hi Guys,

    I am a VBA newbie and hence request you to bear with me. I am trying to scrape data from different tables of a particular website: https://www.globalinnovationindex.or...=data-analysis. I did get a lot of useful codes on various forums (including this one), but am facing a problem which I am unable to solve by either web queries or VBA - when I select different parameters, the URL does not change and hence the code only fetches the data on the original page (i.e. Global Innovation Index). It is unable to fetch other parameters from the dropdown, like Government effectiveness, Press freedom etc.).

    I need to fetch the data for multiple parameters in this drop down (by keeping the linkage dynamic) and input it in different sheets of an Excel workbook; this needs to be done by a single click only.

    Can somebody help me out with this? The problem seems to be something similar to what teamkoso/Eric has posted in his January 2006 post here: http://www.ozgrid.com/forum/showthread.php?t=45105.

    Thanks in advance.

  • #2
    Re: Scraping data from a website using VBA

    Hi tazzy, welcome to Ozgrid

    try stepping through this code with F8 to see what's happening - this should help you:

    Code:
    Sub SO()
    
    Set IE = CreateObject("InternetExplorer.Application")
    With IE
        .Visible = True
        .Navigate "https://www.globalinnovationindex.org/content.aspx?page=data-analysis"
            WaitForLoad IE
            
        Set Opt = .Document.GetElementByID("ctl10_ctl02_lstIndex")
    
        Opt.Value = "Input"
        Opt.OnChange
            WaitForLoad IE
        Opt.Value = "GTI"
        Opt.OnChange
            WaitForLoad IE
        Opt.Value = "Efficiency"
        Opt.OnChange
            WaitForLoad IE
    
        Set Opt = Nothing
        
        .Quit
    End With
    Set IE = Nothing
    
    End Sub
    
    Function WaitForLoad(IE As Variant)
        While IE.ReadyState <> 4
            DoEvents
        Wend
    End Function

    Comment


    • #3
      Re: Scraping data from a website using VBA

      Please follow the forum rules and post links to your cross-post(s) on other forums.

      Comment


      • #4
        Re: Scraping data from a website using VBA

        Hi S O,Thanks for the warm welcome and a prompt reply.I had reached to a point where once I click on a "Download" rectangle shape on my excel, I am able to fire a macro which opens up the global innovation index website's data analysis page (URL mentioned in my original page). It further selects the sub-parameter of my interest from the dropdown menu (i.e. Press Freedom), and I can see that table open up. However, I stumble upon a roadblock right there - I am not able to DOWNLOAD this table onto my excel file without another "click". I want to avoid the second click. I tried the code you invested your time developing (thanks a ton for that), but it takes me to the website and then returns an error - maybe I am not doing something right. As I said, I am just a beginner with VBA. Let me know if it would help if I posted the other code here (the one I mentioned about in the last paragraph).I did also post my question at http://stackoverflow.com/questions/2...xcel-worksheet after posting it here, hoping for some response.Thanks

        Comment


        • #5
          Re: Scraping data from a website using VBA

          Hi tazzy - the data you are after is displayed as a table with rows, you can't neccessarily download the table - but you can pull the information from it, here's a short example that you can incorporate into your code:

          Code:
          With IE.Document
          
          '// There are 143 rows in the table - each row has
          '// an ID of ctl10_ctl02_gridData_DXDataRow followed by a the row number (from 0 to 142).
          '// Within each row is a set of cells that make up the data table - these cells don't
          '// have an ID, so we need to create a collection of all the cells using their html tag
          '// tag name and loop through them.
                              
          For i = 0 To 142
          
          lastrow = Range("A" & Rows.Count).End(xlUp).Row + 1
              Range("A" & lastrow).Value = .GetElementByID("ctl10_ctl02_gridData_DXDataRow" & i).GetElementsByTagName("td")(0).innerHTML
              Range("B" & lastrow).Value = .GetElementByID("ctl10_ctl02_gridData_DXDataRow" & i).GetElementsByTagName("td")(1).innerHTML
              Range("C" & lastrow).Value = .GetElementByID("ctl10_ctl02_gridData_DXDataRow" & i).GetElementsByTagName("td")(2).innerHTML
              Range("D" & lastrow).Value = .GetElementByID("ctl10_ctl02_gridData_DXDataRow" & i).GetElementsByTagName("td")(3).innerHTML
              '// Repeat for remaining columns in table.
          Next i
          
          End With
          As far as the errors are concerned - the code ran on my machine using IE11 on Win7 and it was fine - however another colleague ran this and had various errors. The only difference I could think of between the two machines was that one was 32 bit and one was 64 bit - not sure if this has some effect on the properties that can be accessed in IE, as there is usually a 32bit and 64bit version. Someone with a greater mind than mine would have to solve that one I'm afraid.

          Comment


          • #6
            Re: Scraping data from a website using VBA

            Hi S O,

            Thanks for your reply. I tried the code that you developed. While it does take me to the website (Global Innovation Index) successfully and further to the correct section ("Press Freedom"), but the table that it downloads in excel is the incorrect one. It still somehow downloads the "Global Innovation Index" data only. I am sorry - maybe I am doing some rookie mistake, and can use some help here.

            I am attaching two .txt files as a reference:
            1. The code that you developed - in the manner I have used
            2. The code that I had put together, referencing different fora and making some tweaks to suit my purpose


            Regards
            Attached Files

            Comment


            • #7


              Re: Scraping data from a website using VBA

              try adding another
              Code:
              Do   
              DoEvents
                  Loop Until MyBrowser.readyState = READYSTATE_COMPLETE
              section after the "onchange" event is being fired, otherwise the code will pull the data from the table before the page has loaded - meaning you will get the previous page's results.

              Comment

              Working...
              X