Remove duplicates based on date

  • Hello, first time posting here,

    hopefully I'm posting this in the correct forum,

    I'm dealing with excel files containing material, marked with unique material number/id, for multiple plants. This creates overlap as multiple plants use the same material, however in different time.

    I need to remove duplicates, however in a way that duplicates get removed based on the date, starting with the oldest, until there are no duplicates. This would create a list on material, each stated once with unique number/id, regardless of plant, with the latest date.

    Thank you for your input!

  • Hello and Welcome to the Forum :)


    If you could attach a sample file ( with no confidential data ) ... it would make it easier to come up with a proposal ;)

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Attaching the file, very simplified version, will have to spend a bit more time to get rid of confidential data on the original.

    Files

    • example.xlsx

      (8.7 kB, downloaded 30 times, last: )
  • Thanks a lot for your sample file :)


    Why don't you take advantage of this test file ... to show what your expected result should like ...;)


    see attached ... a possible interpretation ...

    Files

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • Sorry if I'm making this in too many steps than necessary. Attached is the result of the first example, highlighted values are those that are to be kept. Each unique material is present only once on the list - and when there are duplicates of the material, only the most recent one is kept (regardless of plant). Hopefully this makes sense. Thank you!

  • Let's try to clarify ...


    What you call Duplicates ...


    1. Is the combination of Plant & Material ?


    2. Once a list of Unique Combined items is determined ... you need to retrieve the most recent date ?


    Is this interpretation correct ... or not ...? :)

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)

  • The list in the example was creating by putting two plant's lists together.

    If there was just one plant, there would be just unique materials (each of the material just once), no duplicates. Each material would have one line, stating the plant (always the same, because it would be just from one plant) and the PO date. Note - this "to-be" original plant would contain just one PO for one material.

    But, given there are two plants which order independently and have overlap, some of the material is duplicated (is there twice). I need to remove one of the duplicates - and it has to be the older one. It doesn't matter for which plant it is, each material just has to be there "just once" with the most recent PO. Hopefully I'm making sense :) Thank you very much for your time already! I appreciate it.

  • Sorry ... but you are familiar with your own topic ... and I am not ...?(


    Take advantage of your test file ... to visually show ... with say 20 to 30 records ...


    1. The starting point


    2. The final expected outcome

    If you feel like saying "Thank You" for the help received, do not hesitate to click the "Thumbs Up" icon, below, in the bottom right corner:)