Copy PDF metadata (title, year, author, etc.) into Excel cells

  • Hello


    I have some PDF files (articles) in a folder and want to get the metadata of each file (author, title, year, ...) into an Excel sheet.


    I am a beginner, but I read about the Acrobat library, installed it, and added it to the references.


    If someone could guide me through the main steps I have to follow, I would be grateful; that would help me speed up the learning process.


    Thanks

  • Welcome to the forum!


    You installed Adobe Pro? If so, you should have the AcroExch object listed in VBE > Tools > References. I would post code that I found, but I don't have the Pro version installed on this computer, and I like to test code before I post it.


    Are these metadata values the same as the file property values seen in File Explorer? If so, you can use the GetDetailsOf() method.
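
    A minimal sketch of the GetDetailsOf() approach, assuming late binding to Shell.Application. The folder path, the file name and the range of index values are placeholders, and which index holds which property depends on the Windows version and the file type:

    ```vba
    Sub ShowFileDetails()
        ' Hypothetical inputs: change to a folder and file that exist on your PC.
        Dim folderPath As Variant   ' a Variant is needed by the late-bound Namespace call
        Dim fileName As String
        Dim objShell As Object, objFolder As Object, objFolderItem As Object
        Dim i As Long

        folderPath = "C:\Articles"
        fileName = "sample.pdf"

        Set objShell = CreateObject("Shell.Application")
        Set objFolder = objShell.Namespace(folderPath)
        If objFolder Is Nothing Then
            Debug.Print "Folder not found: " & folderPath
            Exit Sub
        End If

        Set objFolderItem = objFolder.ParseName(fileName)
        If objFolderItem Is Nothing Then
            Debug.Print "File not found: " & fileName
            Exit Sub
        End If

        ' Print the first few detail columns with their index numbers.
        For i = 0 To 6
            Debug.Print i, objFolder.GetDetailsOf(objFolderItem, i)
        Next i
    End Sub
    ```

    The results go to the Immediate window of the VBE rather than to the worksheet.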

  • Thank you, sir, for your time.


    - Metadata I need


    The desired metadata is shown in Adobe Acrobat when I go to File > Properties > Description. I don't know if this can be accessed by code.

    It differs from the metadata that appears when I right-click a PDF file > Properties in Windows.



    - The Acrobat library I installed is "Acrobat DC SDK", and once added it appears in References as "Adobe Acrobat 8.0 type library".
    Downloaded from:

    https://opensource.adobe.com/d…obatsdk/releasenotes.html


    No AcroExch entry is found in the list.


    I am Syrian, and Syrian people are restricted and don't have access to any Adobe products or sites. I used a VPN to install it; could this affect the way it works?

  • Thanks again for your help.


    - Registry

    I found keys for AcroExch in the registry.

    I can attach pictures of them if necessary.


    - Debugging

    I copied the code and ran it with the VBE debugger; no message popped up.


    Will try the code tomorrow.

    The link you posted is exactly what I aim to do.


    - Code steps


    If this works, can we say the steps are like this:

    1- Make a loop to read each file's metadata in that folder
       (by setting the folder path with a counter).

    2- Write the data to cells.


    I really appreciate your help.

  • When you run that code, Debug.Print puts the results in the Immediate window of the VBE. Use the View menu in the VBE to show that window if it is not visible.


    Post back if you need more help. Attaching a short example file with the column headings would make it easier for us to help you. Click the Attachments link at the end of the reply box.


    It takes a bit more work but I like to put all results into an array and then write to the worksheet once. Writing to sheet one cell at a time is very slow.

  • Hello again, sir, and thanks.


    I installed Adobe Acrobat Pro DC 2021 after a long search, found the library in References, and checked it.


    I thought I would try the code on one PDF first and do what you told me about the Immediate window.


    I copied the code and set the sFile target to "ee:\aa.pdf", where I have the PDF I want to test.


    I made some changes, like moving some of the code into a macro I created, "Testing", because I didn't know how to launch the code otherwise. I also tried a CommandButton and got the same results. Did I do something wrong with this move?


    ** The code returns the PDF's strFileName, strNumPages, strCreator and strProducer, but nothing for Title and the other fields (it returns "").



  • Oh, sorry for that, my bad.


    The code and everything else is working, but the PDF I used had no metadata.

  • Thanks for your help.<3<3


    Will continue my code and I would appreciate if you could help me with any upcoming hurdles. :)

  • Sure, just attach the sample workbook with the header column names arranged the way you want for metadata.


    I was a little sick so I won't get to my work computer until tomorrow to check this. For the GetInfo() metadata, I will be curious to see if GetDetailsOf() would produce the same result.


    Adobe's other methods, like .GetNumPages, are nice to have.
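
    Putting the pieces discussed so far together, here is a rough sketch, not tested code: it loops over a folder, reads document info with GetInfo() and the page count with GetNumPages, collects everything in an array, and writes to the sheet once. It assumes full Acrobat is installed (AcroExch.PDDoc is late bound, so no reference is required). The folder path, the choice of info keys and the A2 output position are placeholders:

    ```vba
    Sub PdfInfoToSheet()
        ' Hypothetical folder; it must end with a backslash.
        Const FOLDER_PATH As String = "C:\Articles\"
        Dim infoKeys As Variant
        Dim pdDoc As Object          ' late-bound AcroExch.PDDoc (full Acrobat, not Reader)
        Dim results() As Variant
        Dim f As String, key As String
        Dim nFiles As Long, nKeys As Long, r As Long, k As Long

        ' Standard PDF document-info keys; add or remove as needed.
        infoKeys = Array("Title", "Author", "Subject", "Keywords", "CreationDate")
        nKeys = UBound(infoKeys) - LBound(infoKeys) + 1

        ' First pass: count the PDFs so the array can be sized once.
        f = Dir(FOLDER_PATH & "*.pdf")
        Do While Len(f) > 0
            nFiles = nFiles + 1
            f = Dir()
        Loop
        If nFiles = 0 Then Exit Sub

        ' Columns: file name, page count, then one column per info key.
        ReDim results(1 To nFiles, 1 To nKeys + 2)
        Set pdDoc = CreateObject("AcroExch.PDDoc")

        ' Second pass: read each file into the array.
        f = Dir(FOLDER_PATH & "*.pdf")
        Do While Len(f) > 0 And r < nFiles
            r = r + 1
            results(r, 1) = f
            If pdDoc.Open(FOLDER_PATH & f) Then
                results(r, 2) = pdDoc.GetNumPages
                For k = 1 To nKeys
                    key = CStr(infoKeys(LBound(infoKeys) + k - 1))
                    results(r, k + 2) = pdDoc.GetInfo(key)
                Next k
                pdDoc.Close
            End If
            f = Dir()
        Loop

        ' One write to the worksheet instead of one write per cell.
        ActiveSheet.Range("A2").Resize(nFiles, nKeys + 2).Value = results
        Set pdDoc = Nothing
    End Sub
    ```

    The info dictionary has no separate "year" key, so the year would have to be taken from the CreationDate string. A file with no metadata simply yields empty strings.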

  • Sorry to hear that; I hope you get well soon.


    All metadata values worked and I am so happy with the results.


    - Maybe GetDetailsOf() will work with songs, because I used to see a lot of details in the Explorer file properties window, while for PDFs only the size and the creation and modification dates were shown.


    I will try the GetDetailsOf() code tomorrow; I hope I succeed :)

  • Hi again


    I tried the GetDetailsOf() code


    and changed the "i" value myself, trying values from -1 to 20:

    Code
    FileProperty = objFolder.GetDetailsOf(objFolderItem, i)


    I got details for -1 to 6, as shown in the attached picture.


    Then 7, 8 and 9 were empty.

    10 gives Owner, in my case "Gresco HP pc".




    What do you think: could details such as title and author be stored at index values above 20?


    Actually, I changed i by hand because I don't know how to show all the results that GetDetailsOf() returns.

  • You can iterate i = 0 to 100 or so. I say 100 because the Immediate window only keeps a limited number of lines. Then do 101 to 200. The number of properties varies by file type; 350 should get nearly all of them. If coded right, one can put "" as the file base name and it returns the property name. I have put all of that into a worksheet so the loop count is not limited. I would use Debug.Print i, s. The "s" is the property value found. The "i" reminds you which index number the value came from.
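
    A hedged sketch of that loop, with a placeholder folder and file name. It uses a commonly seen variation for the property name, passing the folder's Items collection rather than "" as the first argument; treat that as an assumption to verify on your own machine:

    ```vba
    Sub ListAllDetails()
        Dim folderPath As Variant   ' a Variant is needed by the late-bound Namespace call
        Dim fileName As String
        Dim objFolder As Object, objFolderItem As Object
        Dim i As Long, propName As String, s As String

        folderPath = "C:\Articles"  ' hypothetical folder
        fileName = "sample.pdf"     ' hypothetical file

        Set objFolder = CreateObject("Shell.Application").Namespace(folderPath)
        If objFolder Is Nothing Then Exit Sub
        Set objFolderItem = objFolder.ParseName(fileName)
        If objFolderItem Is Nothing Then Exit Sub

        For i = 0 To 100
            ' Property (column) name for this index.
            propName = objFolder.GetDetailsOf(objFolder.Items, i)
            ' Property value of the file for this index.
            s = objFolder.GetDetailsOf(objFolderItem, i)
            ' Skip empty values to save space in the Immediate window.
            If Len(s) > 0 Then Debug.Print i, propName, s
        Next i
    End Sub
    ```

    Change the loop bounds to 101 To 200 and so on to see the higher index values.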


    The GetDetailsOf method is nice but, it sounds like, of limited use to you.


    File property information can also be found using a Shell() to an application like exiftool. It does not get all file types but for those that it does, it can read and sometimes write those properties back. PDF is one that it handles well. Most use it for media file types. Its command line switches can be short but powerful. It is a good tool to consider depending on the project.
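
    As a rough illustration only: the sketch below assumes exiftool.exe is on the PATH and uses a placeholder PDF path. It uses WScript.Shell's Exec rather than VBA's own Shell() function, because Exec gives access to the program's output:

    ```vba
    Sub ExifToolTitleAuthor()
        ' Assumptions: exiftool.exe is on the PATH; the PDF path is hypothetical.
        Const PDF_PATH As String = "C:\Articles\sample.pdf"
        Dim wsh As Object, proc As Object
        Dim output As String, fields() As String

        Set wsh = CreateObject("WScript.Shell")
        ' -T prints the requested tags on one line, separated by tabs.
        Set proc = wsh.Exec("exiftool -T -Title -Author """ & PDF_PATH & """")
        output = proc.StdOut.ReadAll

        ' Drop the trailing line break, then split on the tab.
        output = Replace(Replace(output, vbCr, ""), vbLf, "")
        fields = Split(output, vbTab)
        If UBound(fields) >= 1 Then
            Debug.Print "Title:  " & fields(0)
            Debug.Print "Author: " & fields(1)
        Else
            Debug.Print "Unexpected output: " & output
        End If
    End Sub
    ```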

  • Debug.Print i, s.

    I will try that.

    Many thanks, sir.

    For a beginner, I am so glad with the results.

    Even if it is copy and paste :S I have reached the data I want.


    Really appreciate your help.

    Wish you all the best

  • For a beginner you seem to grasp more than many.


    Unfortunately, my computer at work no longer has Acrobat. I will get an old Acrobat version 6 and see if it would work for metadata. I wish Adobe had not gone the way of leasing software.


    For the interim, I did some preparatory work for a GetDetailsOf approach. It does not apply to your needs but many concepts could be used in your project.

    Normally, when I use GetDetailsOf method, I only need a few properties/metadata. With the needs of your project in mind, I fleshed most of this out tonight. Put it into a Module, change pPDF value line in first sub and run it with a blank worksheet active. Cell B2 and to the right will have 322 columns filled. Some are blank because it has no property name for that index position.


    If you examine the code, you will see comments with some good tips. Debug.Print lines were left in as they are good for testing but also show how some things can be shown. I almost always code like this.


    The ArrayList method offers some nice features that we can use later in the file iterative loop. For example, we may use just 5 property field values, say. We would get the names, and with IndexOf we can get just the value from another ArrayList that holds the values in the same index order. Most people just hard-code the index. 99.9% of the time, that would be sufficient. However, every once in a while, Microsoft does an upgrade that could impact the code.
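
    To make the IndexOf idea concrete, here is a small sketch under several assumptions: the .NET ArrayList is creatable from VBA on the machine (it typically needs .NET Framework 3.5 enabled), the folder and file names are placeholders, and the wanted property names are the English display names, which may differ on other Windows languages or versions:

    ```vba
    Sub LookupByPropertyName()
        Dim folderPath As Variant   ' a Variant is needed by the late-bound Namespace call
        Dim objFolder As Object, objFolderItem As Object
        Dim propNames As Object, propValues As Object
        Dim wanted As Variant, w As Variant
        Dim i As Long, pos As Long

        folderPath = "C:\Articles"  ' hypothetical folder
        Set objFolder = CreateObject("Shell.Application").Namespace(folderPath)
        If objFolder Is Nothing Then Exit Sub
        Set objFolderItem = objFolder.ParseName("sample.pdf")   ' hypothetical file
        If objFolderItem Is Nothing Then Exit Sub

        ' Two lists filled in the same order: names and matching values.
        Set propNames = CreateObject("System.Collections.ArrayList")
        Set propValues = CreateObject("System.Collections.ArrayList")
        For i = 0 To 350
            propNames.Add objFolder.GetDetailsOf(objFolder.Items, i)
            propValues.Add objFolder.GetDetailsOf(objFolderItem, i)
        Next i

        ' Look up a few properties by name instead of by hard-coded index.
        wanted = Array("Title", "Authors", "Size")
        For Each w In wanted
            pos = propNames.IndexOf(CStr(w), 0)
            If pos >= 0 Then
                Debug.Print w, pos, propValues.Item(pos)
            Else
                Debug.Print w, "not found"
            End If
        Next w
    End Sub
    ```

    Because the index is found at run time, the lookup keeps working if a Windows update shifts the property positions, as long as the names stay the same.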


    This conceptual approach also offers a more dynamic potential. e.g. Data Validation list with all property names in B2. When one is picked for the file in A2, then C2 will be updated with that property value. That way, 322 columns are not needed but readily found.


    Anyway, if you want to give it a run, change the pPDF value and run...


  • Thank you sir <3


    It seems so perfect, especially now that it writes to the sheet.

    I am so excited to see the results; I will check it with files that I am sure have the metadata I need.

  • Here is the GetDetailsOf version with ALL names and property values. Just modify the INPUTS commented block in the first sub and run it. The inputs do not have to be just PDF files. As before, run it with an empty/blank active sheet.


  • Everything is perfect and it shows all property names and values.


    I am so surprised by the number of properties.


    You really did all the code, and with notes all around.


    Your way of explaining code and the time you give are absolutely enough to help even blind people write code.


    All that I can say is "So lucky are people on this site or where ever you are "

    BTW, I searched my history for this link; it is the Adobe Acrobat Pro DC 2021 install.

    Check the link. Actually, it is the only way we can get new software.


    Once the software is installed, all the Acrobat libraries appear in References automatically.


    Link removed by admin