I have the following R data.table:
library(data.table)
dt =
unique_point biased data_points team groupID
1: up1 FALSE 3 1 xy28352
2: up1 TRUE 4 22 xy28352
3: up2 FALSE 1 4 xy28352
4: up2 TRUE 0 3 xy28352
5: up3 FALSE 12 5 xy28352
6: up3 TRUE 35 7 xy28352
....
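For reproducibility, the six rows shown above can be rebuilt as follows. This is only a sketch of the visible rows: the remaining rows are elided in the printout and are not reconstructed here, and the integer column types are an assumption.

```r
library(data.table)

# Only the six rows printed above; the rest of the table is elided.
# Integer types for data_points and team are an assumption.
dt <- data.table(
  unique_point = rep(c("up1", "up2", "up3"), each = 2),
  biased       = rep(c(FALSE, TRUE), times = 3),
  data_points  = c(3L, 4L, 1L, 0L, 12L, 35L),
  team         = c(1L, 22L, 4L, 3L, 5L, 7L),
  groupID      = "xy28352"
)
```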
I've formatted the data.table so that each unique_point has two rows, one with biased FALSE and one with biased TRUE, each recording the number of data points measured under that condition. If there are no measurements, this is recorded as 0.
As an example, for up1, there are 3 data points for the unbiased experiment, and 4 data points for the biased experiment.
Each groupID has 25 teams, each potentially with a measurement for biased and unbiased. I would like to reformat the data.table so that it also gives the number of data points per team for each unique_point (given the data, many of the resulting rows will have data_points of 0).
unique_point biased data_points team groupID
1: up1 FALSE 3 1 xy28352
2: up1 TRUE 0 1 xy28352
3: up1 FALSE 0 2 xy28352
4: up1 TRUE 0 2 xy28352
5: up1 FALSE 0 3 xy28352
6: up1 TRUE 0 3 xy28352
....
44: up1 TRUE 4 22 xy28352
....
49: up1 FALSE 0 25 xy28352
50: up1 TRUE 0 25 xy28352
This task is very close to "unfolding" the data.table: for each unique_point, I would create 50 rows, one for each of the 25 teams with biased TRUE and FALSE. The added complication is that the existing data_points values need to be carried over into the matching rows, with 0 everywhere else.
Is there possibly a way to use unique() to count how many times each row occurs?
If I try
setkey(dt, unique_point, team)[CJ(unique(unique_point), unique(team)), .N, by = .EACHI]
I am counting the number of rows that occur for each combination of unique_point and team, but this doesn't keep the data_points.
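One direction that might keep data_points is to build the full grid with CJ() and join dt onto it, instead of counting with .N. The following is a minimal sketch, not a confirmed solution. It assumes that teams are numbered 1 to 25, that every unique_point should be crossed with every groupID (there is only one groupID in the sample), and that the six sample rows stand in for the full table. The names grid and res are made up for illustration.

```r
library(data.table)

# Six sample rows from the printout above (the rest is elided there)
dt <- data.table(
  unique_point = rep(c("up1", "up2", "up3"), each = 2),
  biased       = rep(c(FALSE, TRUE), times = 3),
  data_points  = c(3L, 4L, 1L, 0L, 12L, 35L),
  team         = c(1L, 22L, 4L, 3L, 5L, 7L),
  groupID      = "xy28352"
)

# Every groupID x unique_point x team x biased combination.
# Assumption: teams are labelled 1..25.
grid <- CJ(
  groupID      = unique(dt$groupID),
  unique_point = unique(dt$unique_point),
  team         = 1:25,
  biased       = c(FALSE, TRUE)
)

# Join dt onto the grid: every grid row is kept, and combinations
# that do not occur in dt get NA in data_points.
res <- dt[grid, on = .(groupID, unique_point, team, biased)]

# Combinations without a measurement are recorded as 0
res[is.na(data_points), data_points := 0L]
```

With the sample rows this gives 3 unique points x 25 teams x 2 = 150 rows, 50 per unique_point, ordered by unique_point, team, and biased, with the original counts kept in the rows that already existed.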