I'm aware there are many posts on this already. I promise that I have looked at them. Nevertheless I'm struggling.

Below is a dput list which is the output of a call to lapply.

I would like a clean, easy-to-read data frame with two columns, one for TRUE and one for FALSE, and one row for each of the 25 list items.

Tried:

falsies <- lapply(my_list, function(x) table(tolower(x) %in% c("", "unknown", "\\?"))) %>% 
  data.frame(do.call(rbind, .))

Error in data.frame(., do.call(rbind, .)) : arguments imply differing number of rows: 2, 25

falsies <- lapply(my_list, function(x) table(tolower(x) %in% c("", "unknown", "\\?"))) %>% 
  as.data.frame.matrix()

Error in seq_len(ncols) : argument must be coercible to non-negative integer
In addition: Warning message:
In seq_len(ncols) : first element used of 'length.out' argument

falsies <- lapply(my_list, function(x) table(tolower(x) %in% c("", "unknown", "\\?"))) %>% as.vector(t(.)) %>% 
  as.data.frame(Field = names(.), Value = unlist(.))

Error in as.vector(x, mode) : invalid 'mode' argument

How can I convert my list into a two-column data frame?

my_list <- structure(list(ID = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), Fiscal_Week_Date = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), FISCAL_WEEK = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), SU_CURRENT_RECORD_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), PROFIT_CENTRE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), ACTIVE_ON_BASE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), SU_STATUS_ID = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), SU_BIRTH_DATE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), SU_GENDER = structure(c(17193L, 
13899L), .Dim = 2L, .Dimnames = structure(list(c("FALSE", "TRUE"
)), .Names = ""), class = "table"), AVERAGE_SPEND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), CU_PAPERLESS_BILL_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), SU_FIXED_MOBILE_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), MMS_INDICATOR = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), INSURANCE_INDICATOR = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), INSURANCE_AMOUNT = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), PREFERRED_TOPUP_METHOD_DESC = structure(c(7672L, 
23420L), .Dim = 2L, .Dimnames = structure(list(c("FALSE", "TRUE"
)), .Names = ""), class = "table"), BROADBAND_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), ICT_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), TENURE_IN_MONTHS = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), CONTRACT_TYPE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), HA_DEVICE_CAPABILITY = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), Year = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), Week = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), Age = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table"), Target_New_Card = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
    "FALSE"), .Names = ""), class = "table")), .Names = c("ID", 
"Fiscal_Week_Date", "FISCAL_WEEK", "SU_CURRENT_RECORD_IND", "PROFIT_CENTRE", 
"ACTIVE_ON_BASE", "SU_STATUS_ID", "SU_BIRTH_DATE", "SU_GENDER", 
"AVERAGE_SPEND", "CU_PAPERLESS_BILL_IND", "SU_FIXED_MOBILE_IND", 
"MMS_INDICATOR", "INSURANCE_INDICATOR", "INSURANCE_AMOUNT", "PREFERRED_TOPUP_METHOD_DESC", 
"BROADBAND_IND", "ICT_IND", "TENURE_IN_MONTHS", "CONTRACT_TYPE", 
"HA_DEVICE_CAPABILITY", "Year", "Week", "Age", "Target_New_Card"
))
So the final output you want is a 25 x 2 data frame with two columns, TRUE and FALSE, and for rows where a value is not present you want to keep it blank, right? - Ronak Shah
Yes, that's correct. - Doug Fir

1 Answer

There are a variety of ways to do this, but note that the output you requested will not be tidy, so it is not a typical or best-practice data frame. The primary challenge is that your list is made up of tables of different lengths: two of the elements (SU_GENDER and PREFERRED_TOPUP_METHOD_DESC) are tables of FALSE and TRUE, and all of the others are tables of FALSE only. Once the total is known, the FALSE counts alone carry all the information, but you can have your data in whatever form works for you :)

Here we don't assume ID.FALSE holds the total; instead we use an element of my_list with both TRUE and FALSE values to compute it. Then we reduce every element to its FALSE count so that all elements have the same form, convert to a data.frame, add in the TRUE values, and voila!

total <- sum(my_list$PREFERRED_TOPUP_METHOD_DESC)
# keep only the FALSE count of every table so each element yields exactly one row
my_list <- lapply(my_list, function(tab) tab["FALSE"])
DF <- as.data.frame(unlist(my_list))
# the TRUE count is whatever remains of the total
DF[2] <- total - DF[1]
names(DF) <- c("FALSE", "TRUE")
head(DF)
#                             FALSE TRUE
# ID.FALSE                    31092    0
# Fiscal_Week_Date.FALSE      31092    0
# FISCAL_WEEK.FALSE           31092    0
# SU_CURRENT_RECORD_IND.FALSE 31092    0
# PROFIT_CENTRE.FALSE         31092    0
# ACTIVE_ON_BASE.FALSE        31092    0

# a helpful row to convince yourself this worked
DF["SU_GENDER.FALSE", ]
#                 FALSE  TRUE
# SU_GENDER.FALSE 17193 13899
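
If the rows with no TRUE count should stay blank, as agreed in the comments under the question, another option is to look up each label in each table and fall back to NA when it is absent. The following is only a minimal sketch under stated assumptions: the three-element my_list is a hand-built stand-in for the full 25-element list (counts copied from the dput output), and count_for() is a hypothetical helper name, not part of base R.

```r
# Stand-in for my_list: three of the 25 tables from the question,
# rebuilt by hand so that this sketch runs on its own (assumption).
my_list <- list(
  ID = table(rep(FALSE, 31092)),
  SU_GENDER = table(rep(c(FALSE, TRUE), times = c(17193, 13899))),
  PREFERRED_TOPUP_METHOD_DESC = table(rep(c(FALSE, TRUE), times = c(7672, 23420)))
)

# count_for() is a hypothetical helper: it returns the count stored under
# `label`, or NA when the table has no entry with that name.
count_for <- function(tab, label) {
  idx <- match(label, names(tab))
  if (is.na(idx)) NA_integer_ else as.integer(tab[[idx]])
}

# One row per list element, one column per label.
# check.names = FALSE keeps the column names as "FALSE" and "TRUE".
DF <- data.frame(
  "FALSE" = sapply(my_list, count_for, label = "FALSE"),
  "TRUE"  = sapply(my_list, count_for, label = "TRUE"),
  check.names = FALSE
)
DF
```

With this approach the ID row has NA in the TRUE column, and the row names are the plain list names rather than names with a ".FALSE" suffix. If zeros are preferred to blanks, DF[is.na(DF)] <- 0L fills them in afterwards.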