I have a data.frame with 90k rows named "sourceToDestination".

Many of the rows in this data.frame are duplicated. Using unique(), I created another data.frame that contains only the unique rows of the original and named it "sourceToDestinationUnique".

Now I want to add one more column at the very end of this unique data.frame that holds a count. The count column should specify how many times each unique row appears in the original data.frame.

I tried using the command below to check how many times row 1 in unique data.frame is present in original data.frame:

> sourceToDestinationUnique[1,] %in% sourceToDestination

But it gives me this strange answer:

[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Kindly let me know which command to use? Thanks.

1 Answer

I'd suggest another way that can achieve your purpose:

 sourceToDestinationUnique <- aggregate(list(dupCount=rep(1,nrow(sourceToDestination))), sourceToDestination, length)

Let's print out the df sourceToDestinationUnique to see the result.
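
To make this concrete, here is a minimal sketch on a toy data.frame. The column names (source, destination) and the values are made up for illustration, since the real columns are not shown in the question. As far as I understand, %in% treats a data.frame as a list of columns, so it compares columns rather than rows, which would explain one FALSE per column in the attempt above.

```r
# Toy stand-in for sourceToDestination; column names and values are hypothetical.
sourceToDestination <- data.frame(
  source      = c("A", "A", "B", "A", "B", "C"),
  destination = c("X", "X", "Y", "X", "Z", "X"),
  stringsAsFactors = FALSE
)

# Group by every column of the original data.frame and count the rows
# in each group. The vector of 1s is just a placeholder to take length() of.
sourceToDestinationUnique <- aggregate(
  list(dupCount = rep(1, nrow(sourceToDestination))),
  sourceToDestination,
  length
)

# One row per unique (source, destination) pair, with dupCount as the last column.
print(sourceToDestinationUnique)
```

Note that aggregate drops rows that have NA in any grouping column, so if the real data contains missing values the counts may not add up to nrow(sourceToDestination).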