Subsetting a dataframe

Question

I have a dataframe with 23,000 rows and 8 columns.

I want to subset it using only the unique identifiers in column 1. I do this with:

total_res2 <- unique(total_res['Entrez.ID']);

This produces 17,000 rows with only the information from column 1.

How can I extract the unique rows based on this column while also keeping the information from the other 7 columns for those rows?

G. Grothendieck · Accepted Answer · 2014-04-05T13:29:29

This returns the rows of total_res containing the first occurrences of each Entrez.ID value:

subset(total_res, !duplicated(Entrez.ID))

Or, if you meant that you only want rows whose Entrez.ID occurs exactly once in the data:

subset(total_res, ave(seq_along(Entrez.ID), Entrez.ID, FUN = length) == 1)

Next time please provide test data and expected output.
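
To make the difference between the two calls concrete, here is a minimal sketch on made-up data. Only the names total_res and Entrez.ID come from the question; the symbol column and all values are hypothetical, and the toy frame has 2 columns rather than 8.

```r
# Hypothetical toy data: only the column name Entrez.ID is from the question.
total_res <- data.frame(
  Entrez.ID = c(10, 20, 10, 30, 20, 40),
  symbol    = c("a", "b", "c", "d", "e", "f"),
  stringsAsFactors = FALSE
)

# Keep the first occurrence of each Entrez.ID, with every other column intact.
first_rows <- subset(total_res, !duplicated(Entrez.ID))

# Keep only rows whose Entrez.ID appears exactly once in the whole data frame.
# ave() replaces each position with the size of its Entrez.ID group.
single_rows <- subset(
  total_res,
  ave(seq_along(Entrez.ID), Entrez.ID, FUN = length) == 1
)

print(first_rows)   # IDs 10, 20, 30, 40 (symbols a, b, d, f)
print(single_rows)  # IDs 30, 40 (symbols d, f)
```

The first form should also be expressible with plain indexing, total_res[!duplicated(total_res$Entrez.ID), ], which avoids the non-standard evaluation inside subset().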
