0
votes

I have event tracking data in Excel that is a bit like this.

Event Name Email Attend 1 Attend 2 Attend 3
Event 1 Joe Joe@email Yes
Event 2 Joe Joe@email No
Event 1 Bob Bob@email No
Event 1 Sara Sara@email Yes
Event 2 Sara Sara@email Yes
Event 3 Sara Sara@email Yes
Event 2 Ray Ray@email Yes
Event 3 Ray Ray@email No

I am trying to combine / collapse the data so that each Name & Email is a unique line (using email as the unique identifier) and merge the Attend row data. The blank cells are also important for later work (person did not register for the event). The end data should look like this:

Name Email Attend 1 Attend 2 Attend 3
Joe Joe@email Yes No
Bob Bob@email No
Sara Sara@email Yes Yes Yes
Ray Ray@email Yes No

I've tried things like adding helper columns to ID duplicate rows, filtering, and various lookup versions, but I keep getting stuck. Most solutions I find online showed concatenating data, using groups, or pivot tables that don't get the end result I need or used commercial add-ons like AbleBits. I saw a similar sounding post on SO (link) but it didn't quite get at my situation and the solution didn't seem to work for me (unless I was not doing it right). Is there a way to use Excel formulas or regular menu options to obtain my result? If there are more complex solutions, e.g. VBA or Power Query, please provide detailed steps as I'm not familiar with those tools. I could likely export the data and resolve it in R, but I'm looking for a native Excel solution.

Thanks!

1
Create a list of unique values for the names and e-mails (Data -> Data Tools - Remove Duplicates), then use VLOOKUP or INDEX/MATCH to populate the other columns. Let us know if you run into any problems.cybernetic.nomad
Thanks! I would work. I took a wrong approach trying to filter duplicates rather than list unique.Shawn Janzen

1 Answers

1
votes

You can do =UNIQUE(C1:C100) in a separate column and copy and paste those values over themselves (assuming the range of emails is in C2:C100 and that the headers are in the first row).

Then in an adjacent column do something like =CONCAT(FILTER(D$2:D$100, $C$2:$C$100=$H2)) and drag down and to the right (where column H contains the column of unique emails). Then simply copy and paste values over themselves again and remove the old columns.

enter image description here