I have the below dataset:
    Var1  Var2  Var3 Var4
1 Rank 1 Sub 1     0   10
2 Rank 1 Sub 1     0   20
3 Rank 2 Sub 2     0   30
4 Rank 1     0 Sub 1   40
5 Rank 2 Sub 2     0   50
6 Rank 2     0 Sub 2   10
Within each Var1 group, I want to compare how many rows have a value in Var2 with how many have a value in Var3, and remove the rows that belong to the column with fewer entries. For example, Rank 1 (in Var1) has 2 values in Var2 and 1 value in Var3, so I want to remove all Rank 1 rows that have a value in Var3 and keep all Rank 1 rows that have a value in Var2. The same applies to every other Var1 value.
So the final result will be:
    Var1  Var2  Var3 Var4
1 Rank 1 Sub 1     0   10
2 Rank 1 Sub 1     0   20
3 Rank 2 Sub 2     0   30
4 Rank 2 Sub 2     0   50
Is there a way to do that? The code to build the above table is below:
Var1 = c("Rank 1", "Rank 1", "Rank 2", "Rank 1", "Rank 2", "Rank 2")
Var2 = c("Sub 1", "Sub 1", "Sub 2", "0", "Sub 2", "0")
Var3 = c("0", "0", "0", "Sub 1", "0", "Sub 2")
Var4 = c(10, 20, 30, 40, 50, 10)
df <- data.frame(Var1, Var2, Var3, Var4)
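To make the rule concrete, here is a rough base R sketch of what I am after. It rebuilds the six-row table so it runs on its own. The names has2, has3, n2, n3 and keep are just placeholders I made up, and two points are my own assumptions rather than part of the requirement: "0" is treated as the marker for "no value", and when a group has the same number of entries in Var2 and Var3, both sets of rows are kept.

```r
# Rebuild the six-row example table (stand-alone copy of the data above).
df <- data.frame(
  Var1 = c("Rank 1", "Rank 1", "Rank 2", "Rank 1", "Rank 2", "Rank 2"),
  Var2 = c("Sub 1", "Sub 1", "Sub 2", "0", "Sub 2", "0"),
  Var3 = c("0", "0", "0", "Sub 1", "0", "Sub 2"),
  Var4 = c(10, 20, 30, 40, 50, 10),
  stringsAsFactors = FALSE
)

# TRUE where the column holds a real entry (assumption: "0" means "no value").
has2 <- df$Var2 != "0"
has3 <- df$Var3 != "0"

# For every row, the number of entries its Var1 group has in Var2 and in Var3.
n2 <- ave(as.integer(has2), df$Var1, FUN = sum)
n3 <- ave(as.integer(has3), df$Var1, FUN = sum)

# Keep a row when its entry sits in the column that has at least as many
# entries within its group. Ties keep both columns' rows (assumption).
keep <- (has2 & n2 >= n3) | (has3 & n3 >= n2)

result <- df[keep, ]
rownames(result) <- NULL
result
```

On the example this leaves the four rows shown in the expected result. Rows with "0" in both Var2 and Var3 would be dropped by this sketch, which may or may not be what is wanted on the real data.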
PS: This will be a very large dataset with multiple entries in both Var2 and Var3.
Thanks
Var1 you want to keep those values which has more non zero values in Var2 or Var3? - Ronak Shah
data.frame are not same. One got 6 rows and another got 5 rows. - MKR