0
votes

I would like to convert a dataframe into a matrix in R. The dataframe has more than 30 variables of different types: some are numeric, some are factors and some are character. When converting it into a matrix, I would like to keep every column's type exactly as it is in the dataframe.

I tried converting it with as.matrix(), see code below (this is just a simple example dataframe with only two variables).

test_df <- data.frame(a = 1:10, b = letters[1:10])
test_mat <- as.matrix(test_df)  # coerces all columns to one common type
typeof(test_mat[, 1])           # "character"
typeof(test_mat[, 2])           # "character"

In the example, column 'a' is an integer and column 'b' is a factor (or a character vector in R 4.0.0 and later, where stringsAsFactors defaults to FALSE). I expected each column to keep its type in the matrix. However, after the conversion, every column has type character.

1
I'm curious to know why you are pursuing this. A matrix can only hold one type of data. There may be another way to address your needs? - Ben
Because I have to loop through the dataframe, which takes a lot of time because it has more than 4 million rows. I noticed that looping through matrices is much quicker. I was hoping that I could convert the dataframe into a matrix, loop through it, and then convert it back into a dataframe. But from your answer I understand that this is not possible. - Hugovanp
If you show the exact problem with the code you are using, then somebody may be able to help you with a more efficient way. - akrun

1 Answer

3
votes

No, you can't do that. In R, a matrix has to be all one type: it is stored as a vector of that type together with an attribute saying how many rows and columns it has.
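
To make that concrete, here is a small sketch (the object names are made up for illustration) showing that a matrix is just a vector with a `dim` attribute, and that mixing types forces everything to the most general one:

```r
# A matrix is an atomic vector plus a "dim" attribute.
m <- matrix(1:6, nrow = 2)
attributes(m)   # a list with a single element, dim, equal to c(2, 3)
typeof(m)       # "integer"

# Giving a plain vector a dim attribute turns it into the same matrix.
v <- 1:6
dim(v) <- c(2, 3)
identical(v, m) # TRUE

# Combining integer and character columns coerces everything to character.
mixed <- cbind(1:3, c("a", "b", "c"))
typeof(mixed)   # "character"
```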

On efficiency, you're right that matrices are a lot faster than dataframes. Maybe you can split your dataframe into two matrices, one numeric and one character. Most other types can be coerced to one of those without much loss: factors, for example, convert cleanly to character.
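
A rough sketch of that split might look like the following. The data and column names here are hypothetical, and whether this actually helps depends on what your loop does:

```r
# Hypothetical example data; the column names are illustrative only.
df <- data.frame(
  id    = 1:5,
  score = c(2.5, 3.1, 4.8, 1.2, 3.9),
  group = factor(c("x", "y", "x", "z", "y")),
  label = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Flag the numeric columns (integer and double both count; factors do not).
is_num <- vapply(df, is.numeric, logical(1))

# Build one matrix per type family.
num_mat <- as.matrix(df[, is_num, drop = FALSE])
chr_mat <- as.matrix(df[, !is_num, drop = FALSE])

typeof(num_mat)  # "double": the integer column is promoted alongside the doubles
typeof(chr_mat)  # "character": the factor is converted to its labels

# ... loop over num_mat and chr_mat here ...

# Reassemble, then restore the types that the matrices could not keep.
out <- data.frame(num_mat, chr_mat, stringsAsFactors = FALSE)
out$id    <- as.integer(out$id)
out$group <- factor(out$group)
```

Note that the round trip is not free: integer columns come back as double and factors as character, so they have to be converted back by hand as in the last two lines.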