
I have the following data.table.

    library(data.table)

    # dput() output with the non-parseable .internal.selfref pointer removed;
    # setDT() converts the data.frame to a data.table by reference
    dat <- setDT(structure(list(kmers = c("TTTTTTTTTTTT", "TCCATTCCATTC", "TTCCATTCCATT",
    "CCATTCCATTCC", "ATTCCATTCCAT", "CATTCCATTCCA", "TTTTATTATTTT",
    "AAAATTATAAAA", "AAGACAATTTCT", "AAAGACAATTTC"), counts = c(16361L,
    10090L, 9599L, 9021L, 8516L, 8325L, 5739L, 5642L, 5378L, 5326L
    )), .Names = c("kmers", "counts"), class = "data.frame",
    row.names = c(NA, -10L)))

This is the table:

           kmers counts
 1: TTTTTTTTTTTT  16361
 2: TCCATTCCATTC  10090
 3: TTCCATTCCATT   9599
 4: CCATTCCATTCC   9021
 5: ATTCCATTCCAT   8516
 6: CATTCCATTCCA   8325
 7: TTTTATTATTTT   5739
 8: AAAATTATAAAA   5642
 9: AAGACAATTTCT   5378
10: AAAGACAATTTC   5326

I would like to divide the counts column by the sum of all counts. For a data.frame I would do:

    total <- sum(dat$counts)
    freq <- dat$counts / total

How can I do that for a data.table? Each k-mer is unique, so I do not expect duplicated values in the kmers column.

For example, for the first row it would be 16361/sum(dat$counts).

dat[, freq := counts / sum(counts)] - David Arenburg
Thanks for the update!!! - david

1 Answer


Alternatively, plain base R syntax still works on a data.table:

    dat$countProportion <- dat$counts / sum(dat$counts)
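
For comparison, here is a minimal sketch that puts the `:=` form from the comment and the base-style assignment side by side. It assumes the data.table package is installed and uses only the first three rows of the question's table as a stand-in, so the proportions differ from those of the full ten-row data.

```r
library(data.table)

# Stand-in for the question's table: first three rows only (an assumption
# made to keep the example short; the real table has ten rows)
dat <- data.table(
  kmers  = c("TTTTTTTTTTTT", "TCCATTCCATTC", "TTCCATTCCATT"),
  counts = c(16361L, 10090L, 9599L)
)

# data.table idiom: := adds the column by reference, without copying dat
dat[, freq := counts / sum(counts)]

# Base-style assignment computes the same values
dat$countProportion <- dat$counts / sum(dat$counts)

print(dat)
```

Both columns should hold the same numbers. The `:=` form is usually preferred with data.table because it modifies the table in place, whereas `$<-` may copy the table first.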