Remove extra rows from R data frame where values in a column are repeated

Question

Sorry, I really don't know how to come up with a better title for this question...

Part 1:

Anyway, I have a data frame in R that looks like so:

   species       date
1        a 2015-11-10
2        a 2015-11-10
3        a 2015-11-10
4        b 2015-11-10
5        a 2015-11-11
6        b 2015-11-11
7        a 2015-11-12
8        a 2015-11-12
9        c 2015-11-12
10       c 2015-11-12
11       a 2015-11-13
12       a 2015-11-13
13       b 2015-11-13
14       b 2015-11-13
15       c 2015-11-13

This is basically a record of the animal species that I encountered on each day. Within a date, a species sometimes appears more than once because I saw it more than once on that day.

Now, I would like to remove the extra sightings of the same animal within the same date so I end up with a data frame that looks like this:

  species       date
1       a 2015-11-10
2       b 2015-11-10
3       a 2015-11-11
4       b 2015-11-11
5       a 2015-11-12
6       c 2015-11-12
7       a 2015-11-13
8       b 2015-11-13
9       c 2015-11-13

How do I achieve this? As a very new R user I can't figure this out at all... :(

BTW, I actually have more columns in the real data frame that are not related to the question, but I would like to keep those columns. Also, I want to ensure that R treats the values in the date column as Date objects rather than strings.

Part 2:

With the data frame from the end of Part 1, I'd like to convert that into a data frame like this:

           a b c
2015-11-10 1 1 0
2015-11-11 1 1 0
2015-11-12 1 0 1
2015-11-13 1 1 1

The 1s and 0s represent essentially yes and no (but I'd like to keep them as integers). So this new data frame simply records whether I have seen a particular animal species on a given date. And for this I'd also like to have the dates (in the first column from the left) treated as date data types in R. How do I do this? Please note that I have many more species than just a, b, c. So the solution will have to dynamically adjust to the number of species that I actually saw.

Thank you for your help!

Do a search in SO. Likely solutions involve the duplicated and ave functions. I'd be surprised if you couldn't find more than one prior answer to this question. — IRTFM
The general idea is to aggregate your data, and then pivot it out. dplyr is my go-to package of choice for aggregation. The pivoting might be trickier depending on what the rest of the data looks like, but that's the overall approach I think you'll need to take (based on what I see). — Sevyns

akrun · Accepted Answer · 2015-11-25T21:42:36

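We could use unique to drop the repeated rows and then use table to count each species per date. Comparing the counts with 0 and adding 0L turns them into integer 0/1 flags.

# Keep one row per distinct species/date combination
Un1 <- unique(df1)
# Date-by-species table; (count > 0) + 0L gives integer 0/1
(table(Un1[2:1]) > 0L) + 0L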

EDIT: Based on @thelatemail's comment.
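
Since the question mentions extra columns and wants real dates, here is a minimal sketch of one way to handle both. It is not part of the answer above. It assumes the key columns are named species and date, and the site column is a made-up stand-in for the unrelated columns. Note that unique(df1) compares whole rows, so it only removes repeats when the extra columns match too; duplicated on just the two key columns avoids that.

```r
# Example data from the question; `site` is a hypothetical extra column.
df1 <- data.frame(
  species = c("a", "a", "a", "b", "a", "b", "a", "a", "c", "c",
              "a", "a", "b", "b", "c"),
  date = rep(c("2015-11-10", "2015-11-11", "2015-11-12", "2015-11-13"),
             times = c(4, 2, 4, 5)),
  site = seq_len(15),
  stringsAsFactors = FALSE
)

# Store the dates as Date objects rather than character strings.
df1$date <- as.Date(df1$date)

# Part 1: keep the first row for each species/date pair, with all columns.
Un1 <- df1[!duplicated(df1[c("species", "date")]), ]
rownames(Un1) <- NULL

# Part 2: date-by-species presence matrix with integer 0/1 entries.
# The columns follow whatever species occur in the data.
m <- (table(Un1$date, Un1$species) > 0L) + 0L

# Row names cannot hold Date objects, so the dates go into their own column.
seen <- data.frame(date = as.Date(rownames(m)),
                   as.data.frame.matrix(m),
                   check.names = FALSE)
rownames(seen) <- NULL
```

Here Un1 keeps the first sighting of each species per day along with the extra column, and seen has one row per date and one integer column per species.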
