3
votes

This should be something easy but I cannot seem to get it right.

I have a data.table with N columns (say N = 40K) and two character vectors of that same length, labelvector and unitvector. I would like to set the attributes "label" and "units" on each column of the data.table to the values given by the corresponding elements of those vectors.

Both vectors are also named, using the data.table column names.

My attempts revolved around calling setattr on every column, or using lapply with the .SD notation, which is my main workhorse for large tables, but without any real success.

The latter failed because I could not access the name of the column being passed to the function call from within lapply, in order to set the attributes by reference.

I can either write a function that sets the attributes by reference (with a data.table := call in the function body) or use an *apply/for loop that sets them, but both take a lot of time.

Do you think that this can be done faster or more elegantly?

Edit:

Example:

The table has 4 columns: Age, Hgt, Wgt and S.

labelvector has 4 values: "Age", "Height", "Weight" and "Sex".

unitvector also has 4 values: "Years", "cm", "kg", NA.

Both labelvector and unitvector are named with the table's column names.

So the goal is to set, on the data.table:

Column Age, label: "Age", units "Years".

Column Hgt, label: "Height", units "cm".

Column Wgt, label: "Weight", units "kg".

Column S, label: "Sex", units NA.

This has to be generalized to a data.table of tens of thousands of columns.
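For reference, the toy example above could be set up roughly as follows. This is only a sketch: the row values are made up for illustration, and DT is an assumed name for the table.

```r
library(data.table)

# Toy table matching the example; the row values are invented for illustration.
DT <- data.table(
  Age = c(34, 51),
  Hgt = c(172, 165),
  Wgt = c(70.5, 58.2),
  S   = c("M", "F")
)

# Named character vectors, keyed by the column names of DT.
# The NA is coerced to NA_character_ when the vector is built.
labelvector <- c(Age = "Age", Hgt = "Height", Wgt = "Weight", S = "Sex")
unitvector  <- c(Age = "Years", Hgt = "cm", Wgt = "kg", S = NA)

# Sanity check: both vectors cover exactly the columns of DT, in order.
stopifnot(
  identical(names(labelvector), names(DT)),
  identical(names(unitvector), names(DT))
)
```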

2
Hi, can you please explain how you achieved this? I have the same issue where I would like to assign column attributes across multiple columns in an efficient way. - alaj

2 Answers

2
votes

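This should fix your issue: loop over the column names and set each attribute by reference with setattr:

  for (col in names(temp_data)) setattr(temp_data[[col]], "label", labelvector[[col]])
  for (col in names(temp_data)) setattr(temp_data[[col]], "units", unitvector[[col]])

where temp_data is your data.table.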
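For per-column "label" and "units" attributes, a minimal sketch on the question's toy data might look like the following. It assumes temp_data is a data.table and that labelvector and unitvector are named by column; the row values are made up. Looping over the column names sidesteps the problem of not knowing the column name inside lapply, and setattr from data.table changes each column in place.

```r
library(data.table)

# Toy stand-in for temp_data; the row values are illustrative only.
temp_data <- data.table(Age = c(34, 51), Hgt = c(172, 165),
                        Wgt = c(70.5, 58.2), S = c("M", "F"))
labelvector <- c(Age = "Age", Hgt = "Height", Wgt = "Weight", S = "Sex")
unitvector  <- c(Age = "Years", Hgt = "cm", Wgt = "kg", S = NA)

# Iterate over names so each column can be matched to its label and unit.
# temp_data[[col]] hands setattr() the column itself, so no := is needed.
for (col in names(temp_data)) {
  setattr(temp_data[[col]], "label", labelvector[[col]])
  setattr(temp_data[[col]], "units", unitvector[[col]])
}

# Inspect the result for one column.
attributes(temp_data$Hgt)
```

The column names of temp_data are left untouched; only the attributes of the individual columns change.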

0
votes

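I believe this is what you are looking for:

invisible(mapply(setattr, x = temp_data, name = "label", value = labelvector[names(temp_data)], SIMPLIFY = FALSE))
invisible(mapply(setattr, x = temp_data, name = "units", value = unitvector[names(temp_data)], SIMPLIFY = FALSE))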
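To check that a mapply-based call actually sets "label" and "units" on every column, something like the sketch below could be run on the question's toy data. The table contents are made up, and cols is just a helper name introduced here; indexing the vectors by cols keeps labels and units aligned with the column order.

```r
library(data.table)

# Toy stand-in for temp_data; the row values are illustrative only.
temp_data <- data.table(Age = c(34, 51), Hgt = c(172, 165),
                        Wgt = c(70.5, 58.2), S = c("M", "F"))
labelvector <- c(Age = "Age", Hgt = "Height", Wgt = "Weight", S = "Sex")
unitvector  <- c(Age = "Years", Hgt = "cm", Wgt = "kg", S = NA)

cols <- names(temp_data)

# mapply walks the columns and the matching label/unit in parallel;
# the single attribute name is recycled across all columns.
# invisible() keeps the returned list of columns from being printed.
invisible(mapply(setattr, x = temp_data, name = "label",
                 value = labelvector[cols], SIMPLIFY = FALSE))
invisible(mapply(setattr, x = temp_data, name = "units",
                 value = unitvector[cols], SIMPLIFY = FALSE))

# Collect the attributes back into named vectors for a quick comparison.
labels_set <- vapply(temp_data, attr, character(1), which = "label")
units_set  <- vapply(temp_data, attr, character(1), which = "units")
stopifnot(identical(labels_set, labelvector[cols]),
          identical(units_set, unitvector[cols]))
```

With tens of thousands of columns it may be worth timing this against a plain for loop over the column names, since both do the same per-column setattr work.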