I have a dataframe that I would like to reduce in size by keeping only the unique observations. However, I want uniqueness to be judged on one column only, while the rest of the dataframe is preserved. Because other columns also contain repeated values, I cannot simply pass the entire dataframe to the unique function. How can I do this and still get the entire dataframe back?
For example, with the following dataframe, I would like to only reduce the dataframe by unique observations of variable a (column 1):
a b c d e
1 2 3 4 5
1 2 3 4 6
3 4 5 6 8
4 5 2 3 6
Therefore, only row 2 is removed, because the value "1" in column a is repeated. Other columns contain repeated values too, but those rows are kept, because I only assess the uniqueness of column 1 (a).
Desired outcome:
a b c d e
1 2 3 4 5
3 4 5 6 8
4 5 2 3 6
How can I process this and then retrieve the entire dataframe? Is there an argument to the unique function that does this, or do I need an alternative?
df[!duplicated(df$a), ]
- Adam Quek
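To see why that one-liner works, here is a minimal sketch that rebuilds the example dataframe from the question and applies it. The variable name `result` is only an illustration and is not part of the original answer. `duplicated()` returns a logical vector that is TRUE for every repeat of a value after its first occurrence, so negating it keeps the first row for each value of a along with all of its other columns.

```r
# Example data from the question
df <- data.frame(
  a = c(1, 1, 3, 4),
  b = c(2, 2, 4, 5),
  c = c(3, 3, 5, 2),
  d = c(4, 4, 6, 3),
  e = c(5, 6, 8, 6)
)

# TRUE marks a value of a that has already appeared in an earlier row
is_repeat <- duplicated(df$a)

# Keep every column, but only the rows where a appears for the first time
result <- df[!is_repeat, ]
print(result)
```

The kept rows retain their original row names (1, 3, 4). If the last occurrence of each value should be kept instead of the first, `duplicated(df$a, fromLast = TRUE)` can be used in the same way.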