Macro to find duplicates based on more than one column

  • I have an Excel sheet (File 1) with huge employee data running into tens of thousand of rows. Sometimes I get sent another spreadsheet (File 2) with a few hundred employee data and I am asked to find if these employees exist in my database based on their name, mobile and email. Now I cannot run a vlookup because it will only pull out the same name from File 1, but it won't necessarily be the same employee. I need a macro to compare the name, phone, email for an employee in File 2 and tell me if there is an employee with the same details in File 1. These criteria keep changing too, sometimes I am asked to compare based on Surname, Maiden Name and so on...

  • Hello,


    With such a massive and important database ... don't you have a Unique Identifier ... :)

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • You can imagine the frustration ?(. The only unique identifier I have is the serial number :D


    Question for you :


    Is your serial number TRULY UNIQUE ... ???

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Well.... even if your serial number is not logical ... BUT UNIQUE ... your problem is already solved ...;)

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Fair enough ...


    What about building your own 'smart' unique identifier ... to compensate for the stupidity of your IT department ... ???;(

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Can you post a sample of File 1 and File 2? You can de-sensitize the data if necessary and you can include a few dozen rows of data. Also, what do you want to do if there is an employee with the same details in File 1?

    You can say "THANK YOU" for help received by clicking the :thumbup: icon in the bottom right corner of the helper's post.
    Practice makes perfect. I am very far from perfect so I'm still practising.

  • Can you post a sample of File 1 and File 2? You can de-sensitize the data if necessary and you can include a few dozen rows of data. Also, what do you want to do if there is an employee with the same details in File 1?

    Please see attached sample file. I need to see if sheet1 employees exist in my database (sheet2) based on their name, father's name, mobile and email. For example, (Sl.1) from sheet one exists in sheet two as (Sl. 29.)

    If there is a match found, I just need them to be moved to Sheet 3 or a new book.

  • Hello,


    In your 'New Sheet' , the Employee Contact No 8565875655 appears twice (row 5 and row 9) with, for example, two different first names


    Could the combination of the the Employee Contact No with, say, the Family Name ... be your Unique Identifier ...?;)

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Hello,


    In your 'New Sheet' , the Employee Contact No 8565875655 appears twice (row 5 and row 9) with, for example, two different first names


    Could the combination of the the Employee Contact No with, say, the Family Name ... be your Unique Identifier ...?;)

    Currently, I just go for exact matches and reject anything that is even slightly different. Contact numbers don't repeat in my files, but since this is a dummy db I created just to demonstrate what I need, I've not noticed that. My bad. :saint:

  • Currently, I just go for exact matches and reject anything that is even slightly different. Contact numbers don't repeat in my files, but since this is a dummy db I created just to demonstrate what I need, I've not noticed that. My bad. :saint:


    Sadly in your Database, even by concatenating your 6 Columns B & C & D & E & F & G ... you cannot generate a Unique Identifier ...!!!=O

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Hello again,


    Understand you cannot share your real life data ...

    But

    If you were to rebuild your dummy database ... you should stick to both your actual coherence AND incoherence ...in order to replicate the underlying logic of the problems you are facing with your file ...

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Currently, I just go for exact matches and reject anything that is even slightly different. Contact numbers don't repeat in my files, but since this is a dummy db I created just to demonstrate what I need, I've not noticed that. My bad. :saint:


    Sorry ... but read your message # 14 way too quickly ...


    Did you say : " Contact numbers don't repeat in my files "


    If you really mean it ... look no further for your Unique Identifier ... !!!:)

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Hello again,


    Understand you cannot share your real life data ...

    But

    If you were to rebuild your dummy database ... you should stick to both your actual coherence AND incoherence ...in order to replicate the underlying logic of the problems you are facing with your file ...

    Agreed and noted Sir.

  • Make sure that both workbooks are open. Place this macro in File1 and run it from there. Change the sheet names and workbook name to suit your needs.

    You can say "THANK YOU" for help received by clicking the :thumbup: icon in the bottom right corner of the helper's post.
    Practice makes perfect. I am very far from perfect so I'm still practising.

  • Possibly...

    If I've been helpful, let me know. If I haven't, let me know that too.

  • Make sure that both workbooks are open. Place this macro in File1 and run it from there. Change the sheet names and workbook name to suit your needs.

    Thank you! I'm getting a runtime error "This key is already associated with an element of this collection".

    The debug points to - (dic.Add arr1(i, 1), Nothing)