Announcement

Collapse
No announcement yet.

Unconfigured Ad Widget

Collapse

Merge, Sort & Eliminate Close Duplicates

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merge, Sort & Eliminate Close Duplicates

    Hi,
    I am attaching a file with an example of a spreadsheet that I am trying to sort out. In this example I have 3 samples (I could have many more). Each sample has 8 columns associated with it (N, M, I, F, S, MS, KM and KD). The length of the dataset is different for each sample. The MS column is the same as M but contains a few zeroes. What I am trying to do is:

    1) generate one column (MSA) containing only unique values (no zeroes) from columns MS1, MS2 and MS3. The unique values should be selected within a specified tolerance (for example, 0.001, which makes 52.00706 from MS1 and 52.00701 from MS2 duplicate values although they are not exactly the same)

    2) generate 3 columns (named SS1, SS2 and SS3) with sorted columns S1, S2, and S3 so that for each value of MS in column MSA each of the three columns will list the corresponding value of S1, S2 and S3 (zero if there is no corresponding value)

    I hope my explanation is not very vague.
    I would greatly appereciate any input.

    Julia
    Attached Files

  • #2
    Re: Merge & Sort, Eliminating Duplicates With Tolerance

    Hi,

    Code:
    Sub Julia()
    Dim a, i As Long, j As Long, c  As Long, k  As Long
    Dim l  As Long, m   As Long, dic As Object, w()
    
    Set dic = CreateObject("scripting.dictionary")
    With Sheets("Sheet1") '<-adjust the sheet name and range
        a = .Range("a4", Sheets("Sheet1").Range("a3").SpecialCells(xlCellTypeLastCell))
    End With
    l = (UBound(a, 2) / 8) * 2: k = 1: m = 2
    ReDim w(1 To UBound(a, 1), 1 To l)
    For c = 6 To UBound(a, 2) Step 8
        For i = 1 To UBound(a, 1)
            If Not IsEmpty(a(i, c)) And a(i, c) <> 0 Then
                If Not dic.exists(a(i, c)) Then
                    j = j + 1: w(j, k) = a(i, c): w(j, m) = a(i, c - 1)
                    dic.Add a(i, c), w
                End If
            End If
        Next
        k = k + 2: m = m + 2: j = 0
    Next
    Set dic = Nothing
    On Error Resume Next
    Application.DisplayAlerts = 0
    Sheets("Results").Delete
    Application.DisplayAlerts = 1
    With Sheets.Add
        .Name = "Results"
        For i = 1 To l Step 2
            j = j + 1: .Cells(1, i) = "MSA" & j: .Cells(1, i + 1) = "SS" & j
        Next
        .Cells(2, 1).Resize(UBound(w), l).Value = w
    End With
    End Sub
    HTH
    Kris

    ExcelFox

    Comment


    • #3
      Re: Merge &amp; Sort, Eliminating Duplicates With Tolerance

      Thanks lot for the feedback!

      The sub that you wrote does part of the job that I am looking for. What I need to do in addition is to merge columns MSA1, MSA2 and MSA3 into one column and eliminate the duplicates within the tolerance of 0.001 (it may vary). I sorted the columns that I got on the results spreadsheet (and copied them to sheet 1 columns Z-AC to reduce the size of the file) to show what is the net result that I am looking for. I had to delete some of the MSA values that were duplicates. (For example, 90.09185 and 90.09198 are the same 96.0446 and 96.04467 are the same etc. So I left only one value in the final MSA column and copied the corresponding SS values from the rows that I was deleting.)

      Is this something that can be done with Excel?
      Attached Files

      Comment


      • #4
        Re: Merge &amp; Sort, Eliminating Duplicates With Tolerance

        OK. Try

        Code:
        Sub Julia()
        Dim a, i As Long, j As Long, c  As Long, k  As Long, l  As Long
        Dim m  As Long, dic As Object, w(), t(), Flg As Boolean, v
        
        v = 0.001 'set the tolerance here
        
        Set dic = CreateObject("scripting.dictionary")
        With Sheets("Sheet1") '<-adjust the sheet name and range
            a = .Range("a4", Sheets("Sheet1").Range("a3").SpecialCells(xlCellTypeLastCell))
        End With
        l = (UBound(a, 2) / 8) + 1: k = 2
        ReDim w(1 To UBound(a, 1) * l, 1 To l)
        For c = 6 To UBound(a, 2) Step 8
            For i = 1 To UBound(a, 1)
                If Not IsEmpty(a(i, c)) And a(i, c) <> 0 Then
                    If Not dic.exists(a(i, c)) Then
                        j = j + 1: w(j, 1) = a(i, c): w(j, k) = a(i, c - 1)
                        dic.Add a(i, c), w
                    End If
                End If
            Next
            k = k + 1
        Next
        Set dic = Nothing: Erase a
        On Error Resume Next
        Application.DisplayAlerts = 0
        Sheets("Results").Delete
        Application.DisplayAlerts = 1
        With Sheets.Add
            .Name = "Results"
            .Cells(1, 1) = "MSA"
            For i = 1 To l - 1
                .Cells(1, i + 1) = "SS" & i
            Next
            .Cells(2, 1).Resize(j, l).Value = w
            With .Cells(2, 1).Resize(j + 1, l + 1)
                .Sort .Range("a2"), xlAscending
                .Cells(1, l + 1).Formula = "=A2"
                .Cells(2, l + 1).Resize(UBound(w) - 1, 1).Formula = "=IF((RC[" & -l & "]-R[-1]C)>" & v & ",RC[" & -l & "],0)"
                a = .Value
            End With
            ReDim t(1 To UBound(a, 1), 1 To l): j = 0: k = 0
            For i = 1 To UBound(a, 1)
                If a(i, l + 1) <> 0 Then
                    j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                Else
                    For m = 2 To l
                        If Not IsEmpty(a(i, m)) And Not IsEmpty(a(i - 1, m)) Then
                            Flg = True: Exit For
                        End If
                    Next
                    If Flg Then
                        j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                    Else
                        For c = 1 To l
                            If IsEmpty(t(j, c)) Then t(j, c) = a(i, c)
                        Next
                    End If
                    Flg = False
                End If
            Next
            .Range(.Cells(2, 1), .Cells(2, 1).SpecialCells(xlCellTypeLastCell)).ClearContents
            .Cells(2, 1).Resize(j, l) = t
        End With
        End Sub
        Let me know how it works for you?
        Kris

        ExcelFox

        Comment


        • #5
          Re: Merge &amp; Sort, Eliminating Duplicates With Tolerance

          I am always amazed what you guys can do with Excel!
          Thanks a lot for your help. It seems that it was quite a challenge. There is still a few minor problems with removing the duplicates. I am attaching a file where the results obtained from the VBA subroutine are on the left and the results sorted manually are on the right. I highlighted the duplicate that were not removed by the subroutine. They all seem to reside in the 3rd (SS3) column. Because I do not really understand how it works I cannot troubleshoot this problem.

          Thanks again!
          Attached Files

          Comment


          • #6
            Re: Merge &amp; Sort, Eliminating Duplicates With Tolerance

            Hi,

            Run Main

            Code:
            Const v   As Single = 0.001
            Private Sub Julia1()
                Dim a, i As Long, j As Long, c  As Long, k  As Long, l  As Long
                Dim m  As Long, dic As Object, w(), t(), Flg As Boolean, n  As Long
                
                Set dic = CreateObject("scripting.dictionary")
                With Sheets("Sheet1") '<-adjust the sheet name and range
                    n = .Cells(4, Columns.Count).End(xlToLeft).Column
                    a = .Range(.Cells(4, 1), .Cells(.Cells(Rows.Count, 1).End(xlUp).Row, n))
                End With
                l = (UBound(a, 2) / 8) + 1: k = 2
                ReDim w(1 To UBound(a, 1) * l, 1 To l)
                For c = 6 To UBound(a, 2) Step 8
                    For i = 1 To UBound(a, 1)
                        If Not IsEmpty(a(i, c)) And a(i, c) <> 0 Then
                            If Not dic.exists(a(i, c)) Then
                                j = j + 1: w(j, 1) = a(i, c): w(j, k) = a(i, c - 1)
                                dic.Add a(i, c), w
                            End If
                        End If
                    Next
                    k = k + 1
                Next
                Set dic = Nothing: Erase a
                On Error Resume Next
                Application.DisplayAlerts = 0
                Sheets("Results").Delete
                Application.DisplayAlerts = 1
                With Sheets.Add
                    .Name = "Results"
                    .Cells(1, 1) = "MSA"
                    For i = 1 To l - 1
                        .Cells(1, i + 1) = "SS" & i
                    Next
                    .Cells(2, 1).Resize(j, l).Value = w
                    With .Cells(2, 1).Resize(j + 1, l + 1)
                        .Sort .Range("a2"), xlAscending
                        .Cells(1, l + 1).Formula = "=A2"
                        .Cells(2, l + 1).Resize(j, 1).Formula = "=IF((RC[" & -l & "]-R[-1]C)>" & v & ",RC[" & -l & "],0)"
                        a = .Value
                    End With
                    ReDim t(1 To UBound(a, 1), 1 To l): j = 0: k = 0
                    For i = 1 To UBound(a, 1)
                        If a(i, l + 1) <> 0 Then
                            j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                        Else
                            For m = 2 To l
                                If Not IsEmpty(a(i, m)) And Not IsEmpty(a(i - 1, m)) Then
                                    Flg = True: Exit For
                                End If
                            Next
                            If Flg Then
                                j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                            Else
                                For c = 1 To l
                                    If IsEmpty(t(j, c)) Then t(j, c) = a(i, c)
                                Next
                            End If
                            Flg = False
                        End If
                    Next
                    .Range(.Cells(2, 1), .Cells(2, 1).SpecialCells(xlCellTypeLastCell)).ClearContents
                    .Cells(2, 1).Resize(j, l) = t
                End With
            End Sub
            Private Sub Julia2()
            Dim ws  As Worksheet, Rng   As Range, l As Long, r  As Long
            Dim a, t(), j As Long, i As Long, c As Long, m  As Long
            Set ws = Sheets("Results")
            l = ws.Cells(1, Columns.Count).End(xlToLeft).Column
            r = ws.Cells(Rows.Count, 1).End(xlUp).Row
            Set Rng = ws.Range(ws.Cells(1, 1), ws.Cells(r, l + 1))
            With Rng
                .Cells(2, l + 1).Formula = "=A2"
                .Cells(3, l + 1).Resize(r, 1).Formula = "=IF(RC[" & -l & "]-R[-1]C>" & v & ",RC[" & -l & "],0)"
                a = .Value
                ReDim t(1 To UBound(a, 1), 1 To l)
                For i = 2 To UBound(a, 1)
                    If a(i, l + 1) <> 0 Then
                        j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                    Else
                        For m = 2 To l
                            If Not IsEmpty(a(i, m)) And Not IsEmpty(a(i - 1, m)) Then
                                Flg = True: Exit For
                            End If
                        Next
                        If Flg Then
                            j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                        Else
                            For c = 1 To l
                                If IsEmpty(t(j, c)) Then t(j, c) = a(i, c)
                            Next
                        End If
                        Flg = False
                    End If
                Next
                .Range(.Cells(2, 1), .Cells(2, 1).SpecialCells(xlCellTypeLastCell)).ClearContents
                .Cells(2, 1).Resize(j, l) = t
            End With
            End Sub
            Sub Main()
            Call Julia1
            Call Julia2
            End Sub
            HTH
            Kris

            ExcelFox

            Comment


            • #7
              Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

              I am struggling with the last code. The VBA gives me some errors and I cannot test it. I am using it as a module. Could this be a problem?

              Comment


              • #8
                Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

                Kris,
                I am trying to troubleshoot the previous version of the subroutine (Sub Julia), which is awfully close to the subroutine that I need.

                I found the most likely place of the failure:
                Code:
                .Cells(2, l + 1).Resize(UBound(w) - 1, 1).Formula = "=IF((RC[" & -l & "]-R[-1]C)>" & v & ",RC[" & -l & "],0)"
                This is the only place where the MSA values are compared within the tolerance and as far as I understand this is the weak point of the subroutine. Unfortunately, I do not understand how this part of the program works.

                What I found from Debugging is that if I have more than 2 non-unique MSA values, this function puts zero for the second one and then compares this new value (zero) with the next MSA value. This is the step that messes up the removal of the duplicates. To me the soution is very simple - just put a new temporary column for the MSA without duplicates so that you do not replace with zero the MSA value with which the comparison is performed.

                Could you, please, help me with this?
                Thanks a lot!
                Julia
                Last edited by ByTheCringe2; July 24th, 2007, 03:08.

                Comment


                • #9
                  Julia Laskin, Please use code tags for all VBA code.
                  .

                  Comment


                  • #10
                    Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

                    OK. Remove all the subs and try this.

                    Sub Julia_Final()
                    Dim a, i As Long, j As Long, c As Long, k As Long, l As Long
                    Dim m As Long, dic As Object, w(), t(), Flg As Boolean
                    Dim sFormula As String, v
                    v = 0.001 'set tolerance here
                    Set dic = CreateObject("scripting.dictionary")
                    With Sheets("Sheet1") '<-adjust the sheet name and range
                    n = .Cells(4, Columns.Count).End(xlToLeft).Column
                    a = .Range(.Cells(4, 1), .Cells(.Cells(Rows.Count, 1).End(xlUp).Row, n))
                    End With
                    l = (UBound(a, 2) / 8) + 1: k = 2
                    sFormula = "=IF(RC[" & -l & "]-R[-1]C[" & -l & "]> " & v & ",RC[" & -l & "],0)"
                    ReDim w(1 To UBound(a, 1) * l, 1 To l)
                    For c = 6 To UBound(a, 2) Step 8
                    For i = 1 To UBound(a, 1)
                    If Not IsEmpty(a(i, c)) And a(i, c) <> 0 Then
                    If Not dic.exists(a(i, c)) Then
                    j = j + 1: w(j, 1) = a(i, c): w(j, k) = a(i, c - 1)
                    dic.Add a(i, c), w
                    End If
                    End If
                    Next
                    k = k + 1
                    Next
                    Set dic = Nothing: Erase a
                    On Error Resume Next
                    Application.DisplayAlerts = 0
                    Sheets("Results").Delete
                    Application.DisplayAlerts = 1
                    With Sheets.Add
                    .Name = "Results"
                    .Cells(1, 1) = "MSA"
                    For i = 1 To l - 1
                    .Cells(1, i + 1) = "SS" & i
                    Next
                    .Cells(2, 1).Resize(j, l).Value = w
                    With .Cells(2, 1).Resize(j + 1, l + 1)
                    .Sort .Range("a2"), xlAscending
                    .Cells(1, l + 1).Formula = "=A2"
                    .Cells(2, l + 1).Resize(j, 1).Formula = sFormula
                    a = .Value
                    End With
                    ReDim t(1 To UBound(a, 1), 1 To l): j = 0: k = 0
                    For i = 1 To UBound(a, 1)
                    If a(i, l + 1) <> 0 Then
                    j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                    Else
                    For m = 2 To l
                    If Not IsEmpty(a(i, m)) And Not IsEmpty(a(i - 1, m)) Then
                    Flg = True: Exit For
                    End If
                    Next
                    If Flg Then
                    j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                    Else
                    For m = 2 To l
                    If Not IsEmpty(a(i, m)) And Not IsEmpty(t(j, m)) Then
                    Flg = True: Exit For
                    End If
                    Next
                    If Flg Then
                    j = j + 1: For c = 1 To l: t(j, c) = a(i, c): Next
                    Else
                    For c = 1 To l
                    If IsEmpty(t(j, c)) Then t(j, c) = a(i, c)
                    Next
                    End If
                    Flg = False
                    End If
                    Flg = False
                    End If
                    Next
                    .Range(.Cells(2, 1), .Cells(2, 1).SpecialCells(xlCellTypeLastCell)).ClearContents
                    .Cells(2, 1).Resize(j, l) = t
                    End With
                    End Sub


                    HTH
                    Kris

                    ExcelFox

                    Comment


                    • #11
                      Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

                      Kris,
                      I am sorry. This one does not work the way it should. It produces two columns: one with the MSA value and another one with all SS values. I thought the second Sub that you posted was just a small step away from target.
                      Thanks,
                      Julia

                      Comment


                      • #12
                        Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

                        Hi,

                        It works fine for me. Have a look at the attachment.
                        Attached Files
                        Kris

                        ExcelFox

                        Comment


                        • #13
                          Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

                          Oops. Sorry. I forgot that I switched to one of my real data files, in which the dataset starts at the H7 cell (and not at a4). In the previous subroutine it was obvious how to change the starting cell but now I am a little lost.

                          I also have one final question. How difficult would it be to sort the KD values by putting a table just next to the table of the SS values?

                          I am sorry for being such a bother and I appreciate your help very much.

                          Comment


                          • #14
                            Re: Merge &amp; Sort, Eliminate Duplicates With Tolerance

                            Hi,

                            ...it was obvious how to change the starting cell but now I am a little lost.
                            Code:
                            n = .Cells(4, Columns.Count).End(xlToLeft).Column 'replace 4 with starting row
                            ..to sort the KD values by putting a table just next to the table of the SS values?
                            Code:
                            Sub KD_Values()
                            Dim a, i As Long, j As Long, dic As Object, w(), n  As Long, z
                            Set dic = CreateObject("scripting.dictionary")
                            With Sheets("Sheet1") '<-adjust the sheet name and range
                                n = .Cells(4, Columns.Count).End(xlToLeft).Column 'replace 4 with starting row
                                a = .Range(.Cells(4, 1), .Cells(.Cells(Rows.Count, 1).End(xlUp).Row, n))
                            End With
                            l = (UBound(a, 2) / 8)
                            For c = 8 To UBound(a, 2) Step 8
                                For i = 1 To UBound(a, 1)
                                    If Not IsEmpty(a(i, c)) Then
                                        If Not dic.exists(a(i, c)) Then
                                            dic.Add a(i, c), Nothing
                                        End If
                                    End If
                                Next
                            Next
                            z = dic.keys: Set dic = Nothing: Erase a
                            If Not Sheets("Results") Is Nothing Then
                                With Sheets("Results")
                                    n = .Cells(1, Columns.Count).End(xlToLeft).Column
                                    .Cells(2, n + 2).Resize(UBound(z)) = Application.Transpose(z)
                                    .Cells(2, n + 2).Resize(UBound(z)).Sort .Cells(3, n + 2), xlAscending
                                End With
                            Else
                                MsgBox "Sheets 'Results' not found!"
                                Exit Sub
                            End If
                            End Sub
                            HTH
                            Kris

                            ExcelFox

                            Comment


                            • #15
                              Re: Merge, Sort &amp; Eliminate Close Duplicates

                              Kris,
                              I think I did not explain well what I meant. Right now the sub generates a table with MSA, SS1 ... SS6. I was wondering how difficult would it be to put sorted values of KD1 ... KD6 on the right hand side of the table.
                              Thanks again,
                              Julia

                              Comment

                              Trending

                              Collapse

                              There are no results that meet this criteria.

                              Working...
                              X