Parse specific part of HTML to column in Excel 2013

  • Re: Parse specific part of HTML to column in Excel 2013


    I am just starting to teach myself to write a code for Excel2013. Two days of reading has me more overwhelmed than anything, currently. Would someone just be willing to help me with a starting point? I've looked at a lot of codes similar to what I want, but most are using a tables vs. using a DIV tag. The following is a snippet of what I want to parse from a web site.


    <div class="row"><div class="column grid_1"><div class="pod coupon " data-podid="18055391" data-gridx="0" data-gridy="2">
    <div class="right"> <div class="pod-info">
    <h4 class="summary">50¢ OFF</h4>
    <h5 class="brand">DAWN®</h5>
    <p class="details">ONE Dawn® Hand Renewal, Power Clean, Bleach Alternative or Oxi (excludes trial/travel size)</p> </div>


    I want the data-podid=, <h4>, <h5> and <p> data(highlighted in red) each in their own column in Excel.


    I would be very grateful with a start that I can work with through trial and error for a better grasp

  • Re: Parse specific part of HTML to column in Excel 2013


    Your 'spec' has changed. For your OP, which involves JSON data, call a XMLhttp GET request to retrieve the data then extract the required data using VBA string functions such as Split and Mid.


    For your second post, depending on the web site, XMLhttp GET or Internet Explorer automation, with the Microsoft HTML object library to parse and extract the data.

  • Re: Parse specific part of HTML to column in Excel 2013


    Thank you. I decided the info I wanted was all available from the web page without using the JSON data.


    This is what I have started with, but I can't get pass a "compile error: user defined type not defined". and it highlights in yellow "Sub GetCoupons()" and in blue "Dim XMLHTTP As MSXML2.XMLHTTP"

  • Re: Parse specific part of HTML to column in Excel 2013


    That error means you need to set a reference to MS XML v6.0 in the VBA Tools - References menu.


    You also need a reference to MS HTML Object Library and:

    Code
    1. Dim doc As HTMLdocument
    2. Set doc = New HTMLdocument


    But the last 5 lines are wrong and you don't need XMLhttp anyway because you are using Winhttp, so you could get rid of XMLhttp altogether and change the last lines to this and similar:

    Code
    1. MsgBox doc.getElementsByClass("brand")(0).Value

    And where did you get getElementsByDIV from? I'm not aware of it being a valid method in the HTML object library.

  • Re: Parse specific part of HTML to column in Excel 2013


    What is the reason for using the MsgBox and do I need to make an argument with IF and Else, if I am just trying to get it to load the information into a column?
    The getElementsByDIV is my error. I get an error using next, if I change it to end I get another error. Thanks for your time.

  • Re: Parse specific part of HTML to column in Excel 2013


    Code
    1. Sub GetCoupons()
    2. With CreateObject("WINHTTP.WinHTTPRequest.5.1")
    3. .Open "GET", "[URL]http://www.coupons.com/coupons/?zip=77477[/URL]", False
    4. .send
    5. c00 = .responseText
    6. End With
    7. Cells(1).Resize(, 4) = Array(Val(Mid(c00, InStr(c00, "data-podid=") + 12)), Split(Mid(c00, InStr(c00, "summary"">") + 9), "<")(0), Split(Mid(c00, InStr(c00, "<h5 class=""brand"">") + 18), "<")(0), Split(Mid(c00, InStr(c00, "<p class=""details"">") + 18), "<")(0))
    8. End Sub
  • Re: Parse specific part of HTML to column in Excel 2013


    I have the following code working to pick up the first page of the website. At the bottom of the page is a "show more coupons" button. All total this site will do that approx. 30 times until all of the coupons are loaded. However the Macros runs and gets the first page only, I need to be able to tell it to load all of the coupons before it runs. I have the ready state coding in, but that doesn't help me load the rest of the pages as the button needs to be clicked on. Honestly I have no idea what to even call this function to even try and do a web search. Any guidance would be truly appreciated at this point.

  • Re: Parse specific part of HTML to column in Excel 2013


    Quote from snb;705898

    Did you see post #7 ?


    Yes I did thank you. I've looked up every function referenced in it and have no clue what to do with it. It doesn't seem to coincide with what I want to do and I feel like I need to make some ranges to tell it where to put the data in the spreadsheet. I have done a large amount of reading over the past couple of days, a little bit is starting to sink in, but quite a bit more is going over my head.

  • Re: Parse specific part of HTML to column in Excel 2013


    Hope you will humor me once again, I do understand now that is the actual code. Thank you.
    I had to remove the URL /URL tags as it was giving me an unrecognized protocol error; once they were remove the page does open. It only loads the first coupon though, doesn't even pick up the whole first page.
    I added the following code and with or without it made no difference. This is the Json for that page, if it makes a difference or not http://www.coupons.com/ajax/init?zip=77477. I believe I do have all the necessary libraries checked off in the tools reference I have. Thanks again for your help.
    VB for applications
    MS Excel 15 Object Library
    MS Office 15 Object Library
    OLE Automation
    MS HTML Objects
    MS Internet Controls
    MS Scripting Controls


  • Re: Parse specific part of HTML to column in Excel 2013


    I know I need to do something to get this code to list all of the data instead of a single line. The code starts the data in the first cell for the active sheet, so I don't think I need to tell it what sheet to write the data to. Can you give me an idea of my next sequence, to get the full list? Thanks
    Here is my current code, it continues to give me the first line only.

  • Re: Parse specific part of HTML to column in Excel 2013


    Cross-posted: http://www.mrexcel.com/forum/e…etrieve-data-website.html


    Please read http://www.excelguru.ca/conten…ge-to-forum-cross-posters and post links to your cross-posts.


    FWIW, I would go with XMLhttp and parsing the JSON string because, once you can follow its structure, it will be easier and much faster than automating IE. There is a VBA JSON parser at http://stackoverflow.com/quest…parsing-json-in-excel-vba.