I have a large dataset, and I would like to sum one column grouped by several others using lapply. The problem is that the other columns disappear from the result, and I would like to keep them.
Here is an example.
I have this dataset:
X Y Z date columnSum
1: A a1 z1 2018.01 4
2: A a1 z1 2018.01 4
3: B a2 z3 2018.02 10
4: B a2 z5 2018.02 30
5: B a2 z5 2018.02 10
6: C a2 z3 2018.02 10
7: D a3 z4 2018.03 0
8: D a3 z6 2018.03 0
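To make the example reproducible, the table above can be rebuilt roughly as follows. This is only a sketch: I am assuming here that date is stored as character and columnSum as numeric, which may not match the real data exactly.

```r
library(data.table)

# Rebuild the example table shown above.
# Assumption: date is character and columnSum is numeric.
DT <- data.table(
  X = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Y = c("a1", "a1", "a2", "a2", "a2", "a2", "a3", "a3"),
  Z = c("z1", "z1", "z3", "z5", "z5", "z3", "z4", "z6"),
  date = c("2018.01", "2018.01", "2018.02", "2018.02",
           "2018.02", "2018.02", "2018.03", "2018.03"),
  columnSum = c(4, 4, 10, 30, 10, 10, 0, 0)
)
```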
I want to sum "columnSum" by "X", "Y" and "date", while keeping the column "Z" in the result.
I tried this:
DT[, lapply(.SD,sum,na.rm=TRUE), .SDcols="columnSum", by=list(X,Y,date)]
Currently I get this result, where the column "Z" is missing:
X Y date columnSum
1: A a1 2018.01 8
2: B a2 2018.02 50
3: C a2 2018.02 10
4: D a3 2018.03 0
This is the result I want:
X Y Z date columnSum
1: A a1 z1 2018.01 8
2: B a2 z3 2018.02 50
3: B a2 z5 2018.02 50
4: C a2 z3 2018.02 10
5: D a3 z4 2018.03 0
6: D a3 z6 2018.03 0