I am writing a function that rewrites column names so that the output data.table follows a standard format. The inputs are user-provided data.tables whose column names may differ in several ways.
Every input data.table should come out with these column names:
length width height weight
An input data.table may look like this, for example:
library(data.table)

input_dt = data.table(
  length = 194,
  wide = 36,
  tall = 340,
  kilogram = 231.2
)
My function would take this data.table (or data.frame) as input and rename the columns, returning this data.table:
length width height weight
194 36 340 231.2
I've created a key for the function, which maps each standard name to the alternative names it should check for:
key = list(
  length = c('long'),
  width = c('girth', 'WIDTH', 'wide'),
  height = c('tall', 'high'),
  weight = c('Weight', 'WEIGHT', 'kilogram', 'pound', 'kilograms', 'pounds')
)
Within the function, I can find which column names of input_dt need to be changed by intersecting them with the values in the key:
> intersect(names(input_dt), unlist(key))
[1] "wide" "tall" "kilogram"
I could then rename these columns accordingly. My question is:
Written this way, the function would be full of for-loops and quite inefficient. Is there a more data.table-friendly solution, given a custom "key" of values?
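One loop-free approach might be sketched as follows. It flattens the key into a named lookup vector (alias as the name, standard name as the value) and passes the matching columns to data.table's `setnames()` in one call. The names `lookup` and `standardize_names` are hypothetical, introduced only for this illustration, and the sketch assumes that no input contains two aliases of the same standard column (that would produce duplicate column names).

```r
library(data.table)

key <- list(
  length = c("long"),
  width  = c("girth", "WIDTH", "wide"),
  height = c("tall", "high"),
  weight = c("Weight", "WEIGHT", "kilogram", "pound", "kilograms", "pounds")
)

# Flatten the key: names are the aliases, values are the standard names.
# (`lookup` is a hypothetical helper name.)
lookup <- setNames(
  rep(names(key), lengths(key)),
  unlist(key, use.names = FALSE)
)

# Hypothetical helper: returns a renamed copy, leaving the input untouched.
standardize_names <- function(dt, lookup) {
  out <- copy(as.data.table(dt))             # accepts data.frame or data.table
  old <- intersect(names(out), names(lookup)) # columns that appear as aliases
  setnames(out, old, unname(lookup[old]))     # rename all of them in one call
  out[]
}

input_dt <- data.table(length = 194, wide = 36, tall = 340, kilogram = 231.2)
result <- standardize_names(input_dt, lookup)
names(result)
```

Columns that already carry a standard name (such as `length` here) are not in `lookup`, so they are left alone. The `copy()` is there because `setnames()` modifies by reference; it could be dropped if renaming the caller's table in place is acceptable.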