
I have 2 questions about labelling throughout an entire dataframe:

I have a cross-sectional dataset of patients: each row is a patient and each column is a variable. The first row holds the variable name and the second row holds the label, for example BMI in row 1 and Body Mass Index in row 2.

Question 1: How do I get R to recognize that the second row is a label, without typing each label individually (age = Age and so on)? There are hundreds of variables that need to be labelled. Could this be done during import, or by separating the labels into a different data frame? The only solutions I can find are typing each label individually, or putting the variable names and labels into a separate dataset and using match, as in "R: Assign variable labels of data frame columns":

library(Hmisc)

var.labels = dat2

label(data) = as.list(var.labels[match(names(data), names(var.labels))])

label(data)
                     age                      sex 
          "Age in Years" "Sex of the participant"   

Question 2: If every 0 in my data means "no" and every 1 means "yes", how can I label all 0 values as "no" and all 1 values as "yes" across the whole data frame? I haven't found any code for this other than labelling each variable individually.

Many thanks in advance!!!

Here is a mini version of what it looks like (dput output):

structure(list(
  patient = c("Patient", "T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8", "T9", "T10"),
  variablename1 = c("Variable Label 1", "2", "1", "4", "2", "2", "1", "1", "1", "1", "1"),
  variablename2 = c("Variable Label 2", "3", "1", "2", "2", "2", "2", "1", "2", "1", "1")),
  row.names = c(NA, -11L),
  class = c("tbl_df", "tbl", "data.frame"))

Please show us what your data looks like. Use dput() and paste the contents of that file into your question so that we can try some things to help you out. Also, paste in any code you may have tried. Read more about providing a minimal reproducible example. – Ben Norris
Looking at your sample data, it seems that you have read the data incorrectly: your headers have become the first row. It would also be helpful if you could show the expected output for the example shared. – Ronak Shah

1 Answer

library(tidyverse)

string <-
"Body mass index, Age, Answer1, Answer2
BMI, Age, Answer1, Answer2
20, 27, 1, 0
29, 42, 1, 1"

# read the two header rows (long labels in row 1, short names in row 2)
df_names <- read_csv(file = string, n_max = 2, col_names = FALSE)

# read the values, skipping the two header rows
df_values <- read_csv(file = string, skip = 2, col_names = FALSE) %>%
  mutate(across(-(X1:X2), ~ if_else(.x == 1, "yes", "no"))) # answer columns only: 1 becomes "yes", any other value "no"

# use one of the two header rows as column names; if both lines run, the second overwrites the first
names(df_values) <- as.character(df_names[1, ]) # long names
names(df_values) <- as.character(df_names[2, ]) # short names
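
For the layout shown in the question's dput (variable names as headers, labels in the first data row), a minimal base-R sketch could look like the following. It is only an illustration under some assumptions: `raw` is a shortened copy of the posted data, `answer1` is a hypothetical 0/1 column added for the demo, and the labels are stored in a plain "label" attribute, which is, as far as I know, the attribute that Hmisc's label() reads.

```r
# `raw` mimics the asker's dput: the first data row holds the labels
raw <- data.frame(
  patient       = c("Patient", "T1", "T2", "T3"),
  variablename1 = c("Variable Label 1", "2", "1", "4"),
  variablename2 = c("Variable Label 2", "3", "1", "2"),
  stringsAsFactors = FALSE
)

# Keep row 1 as a named character vector: names are the variables, values the labels
var_labels <- unlist(raw[1, ])

# Drop the label row and let R re-guess the column types (numbers were read as text)
dat <- raw[-1, , drop = FALSE]
rownames(dat) <- NULL
dat[] <- lapply(dat, type.convert, as.is = TRUE)

# Attach each label as a "label" attribute, matched by column name
for (v in names(dat)) {
  attr(dat[[v]], "label") <- var_labels[[v]]
}

# Hypothetical 0/1 column (not in the posted data), recoded to a no/yes factor
dat$answer1 <- factor(c(1, 0, 1), levels = c(0, 1), labels = c("no", "yes"))

# Inspect the labels that were attached
sapply(dat[names(var_labels)], attr, "label")
```

Because the labels are matched by name, this scales to hundreds of columns without typing any label by hand. One caveat: plain attributes can be dropped by some subsetting operations, so it may be worth keeping `var_labels` around to reapply them.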