4
votes

I have an excel spreadsheet with two columns. The first column is a label the second column is a numeric value. I would like to remove all the duplicate labels in column "A" and remain with the maximun numeric value in column "B". I've tried to illustrate below (the filter would result in "Consolidated Sheet" given "Original Sheet":

NOTE 1: I'm trying to parse mail server log files to get a listing of hosts and their maximum SMTP connections on any particular day. If you can suggest a way of doing this on windows (even if it means using Strawberry Perl) I'd appreciate the solution.

NOTE 2: You might have to use your imagination with the "spreadsheet" I've diagrammed below, I tried putting each row in an individual line but all rows are rendering in a single line. [EDIT] Fixed it by putting everything in an unordered list.

"Original Sheet"

  • colA, colB
  • name1, 32.0,
  • name1, 12.5,
  • name1, 11.0,
  • name1, 10.1,
  • name1, 6.9,
  • name2, 2.3,
  • name2, 1.1,
  • name2, 0.9,
  • name3, 54,
  • name3, 60,
  • name3, 101,
  • name3, 12,

"Consolidated Sheet"

  • colA, colB
  • name1, 32.0,
  • name2, 2.3,
  • name3, 101,

Regards, emk

2

2 Answers

5
votes

Could you not just use Excel's Pivot Table functionality?

0
votes

You could use Advanced Filter with the "unique records only" checkbox selected and copy unique values to a new location.

Once you have the unique values, you can use an array function to find the maximum value.

Let's say your original data is in a2:a13 and your unique values are in d2:d4 (ignoring column headings). Your formula in e2 would be =SUMPRODUCT(MAX(($A2:$A13=D2)*($B2:$B13)))

Array functions have to be entered into the cell with Ctrl+Shift+Enter, instead of just Enter.

However, as JDunkerley says, a pivot table would be your easiest and fastest method though.