I have created a feature vector (a data.frame) with the columns id, feat1, feat2, feat3, and boolean. The ids are intentionally duplicated across rows. What I want to do is build a new data frame per id as I iterate over this data frame.
For simplicity, let's assume I have the following three columns.
          X1         X2 X3
1  000000001 -1.4061361  1
2  000000001 -0.1973846  1
3  000000002 -0.4385071  1
4  000000001 -0.6593677  0
5  000000001 -1.2592415  0
6  000000001 -0.5463655  1
7  000000002  0.4231117  0
8  000000002 -0.1640883  1
9  000000002  0.7157506  0
10 000000002  2.3234110  1
I want to build a separate data frame for each value of X1; that is, all rows sharing the same X1 should end up in their own data frame. I wrote this using multiple for loops, but it takes a very long time because the data set is large. What is the best way to do this?
by(). - Ferdinand.kraft
by() accepts any function that works on a dataframe chunk and returns summarized data. - Ferdinand.kraft
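As a rough illustration of the suggestion above, here is a minimal sketch in base R. It assumes the sample rows from the question, with X1 stored as a character column so the leading zeros survive. The names df, per_id, and mean_x2 are made up for this example. It uses split() to get one data frame per id, which is what the question literally asks for, and by() to apply a function to each per-id chunk without explicit loops.

```r
# Sample data copied from the question; X1 is kept as character
# (an assumption) so the leading zeros in the ids are preserved.
df <- data.frame(
  X1 = c("000000001", "000000001", "000000002", "000000001", "000000001",
         "000000001", "000000002", "000000002", "000000002", "000000002"),
  X2 = c(-1.4061361, -0.1973846, -0.4385071, -0.6593677, -1.2592415,
         -0.5463655, 0.4231117, -0.1640883, 0.7157506, 2.3234110),
  X3 = c(1, 1, 1, 0, 0, 1, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

# split() returns a named list holding one data frame per distinct X1 value.
per_id <- split(df, df$X1)

# Each element is an ordinary data frame and can be accessed by id.
first_id <- per_id[["000000001"]]

# by() applies a function to each per-id chunk; here, the mean of X2.
mean_x2 <- by(df, df$X1, function(chunk) mean(chunk$X2))
```

Keeping the pieces in a list (rather than creating one variable per id) usually makes it easier to process them afterwards with lapply(per_id, ...). If only a per-id summary is needed, by() alone may be enough and the intermediate list can be skipped.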