3
votes

This question follows from a previous question. Instead of having two columns, what if we have three or more columns? Consider the following data.

x <- c(600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800,
       600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800,
       600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800)

y <- c(1,  1,  1,  1,  1,  1,  1, 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
       80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
       3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3)

z <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
       1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
       1, 2, 3, 1, 2, 3)

xyz <- data.frame(cbind(x, y, z))

Suppose we treat every column as a factor with a finite number of levels. What I want is the number of observations in each unique combination of x, y and z. For this data the answer is 18 unique combinations with 3 observations in each. How can I do this in R? Thank you!

2
You can try unique(xyz). - HubertL
@HubertL Sure, that gives the unique combinations. But I also want to know how many observations there are in each unique combination. Is there an easy way to do so? - LaTeXFan

2 Answers

4
votes

Use table or tabulate with interaction:

tabulate(with(xyz, interaction(x,y,z)))

table(with(xyz, interaction(x,y,z)))

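or split the row indices by the interaction and use lengths (splitting the data frame itself would make lengths return the number of columns of each piece, which only happens to be 3 here),

lengths(split(seq_len(nrow(xyz)), with(xyz, interaction(x,y,z))))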

or

aggregate(seq_along(x)~ x+y+z, data=xyz, FUN=length)
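
If you want the counts next to the combinations they belong to, rather than as a bare vector, one more base R possibility is to flatten the table into a data frame. This is only a sketch: it rebuilds the question's data with rep() (so z is stored as integer rather than double), and the name counts is just for illustration.

```r
# Rebuild the example data from the question (same values, written with rep()).
xyz <- data.frame(
  x = rep(rep(c(600, 800), each = 9), times = 3),
  y = rep(c(1, 80, 3), each = 18),
  z = rep(1:3, times = 18)
)

# table() on a data frame cross-tabulates all of its columns;
# as.data.frame() flattens the 3-way table into one row per combination,
# with the count in a column called Freq (x, y and z come back as factors).
counts <- as.data.frame(table(xyz))

# Keep only combinations that actually occur (a no-op here, since all 18 do).
counts <- counts[counts$Freq > 0, ]

nrow(counts)         # number of distinct combinations
unique(counts$Freq)  # observations per combination
```

Unlike aggregate, table also lists combinations with zero observations, which is why the filter on Freq is there.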
1
votes

An option using data.table: we convert the 'data.frame' to a 'data.table' (setDT(xyz)), group by the columns of 'xyz', and get the number of elements in each group (.N).

library(data.table)
setDT(xyz)[, .N, names(xyz)]$N
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

Or with dplyr: we group by all the columns and get the number of elements in each group (n()) using summarise.

library(dplyr)
xyz %>%
    # group_by_() is deprecated; across(everything()) groups by all columns
    group_by(across(everything())) %>%
    summarise(n=n()) %>%
    .$n
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
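
As a shorter variant of the dplyr approach, count() does the grouping and counting in one call and keeps the combinations alongside the counts. A minimal sketch, assuming dplyr is installed; the data is rebuilt with rep() and the name res is only illustrative.

```r
library(dplyr)

# Same values as the question's data, written with rep().
xyz <- data.frame(
  x = rep(rep(c(600, 800), each = 9), times = 3),
  y = rep(c(1, 80, 3), each = 18),
  z = rep(1:3, times = 18)
)

# count() groups by the listed columns and adds a column n with the group sizes.
res <- count(xyz, x, y, z)

nrow(res)      # number of distinct combinations
unique(res$n)  # observations per combination
```

Only combinations that occur in the data appear in the result, so there are no zero-count rows to filter out.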