Overview
To avoid memory issues, I converted my document-term matrix to a sparse matrix with the “Matrix” package, using the code below:
library(Matrix)
documentTermMatrixFrame <- Matrix(documentTermMatrixFrame, sparse = TRUE)
But when I try to pass this matrix as input to the ranger() function from the “ranger” package, using the code below:
library(ranger)
trainSet <- documentTermMatrixFrame[1:750,]
testSet <- documentTermMatrixFrame[751:999,]
fit <- ranger(trainingColumnNames, data = trainSet, write.forest = TRUE)
I get this error:
Error in as.data.frame.default(data) :
cannot coerce class "structure("dgCMatrix", package = "Matrix")" to a data.frame
Dataset
This is a sample of the dataset I am using:
<html>
<table style="width:100%">
<tr>
<th>nitemid</th>
<th>sUnSpsc</th>
<th>productDescription</th>
</tr>
<tr>
<td>7460893</td>
<td>26121609Network cable </td>
<td>Category 6A, Advanced MaTriX, 4-pair, 23 AWG, U/UTP copper cable, Plenum (CMP) Rated, White, 1000ft/305m ""</td>
</tr>
<tr>
<td>7460456</td>
<td>26121709Network cable </td>
<td>Shielded marine MUD-resistant armored copper cable, category 7 S/FTP, low smoke zero halogen (LSZH), 4-pair, conductors are 22 AWG construction with foamed PE insulation, twisted in pairs</td>
</tr>
<tr>
<td>7460856</td>
<td>26121890Inter connect cable </td>
<td>1 PC. = 100 M 2 X 1.5 QMM, 100M SPECIAL DESIGN TO UL CLASS 2 YELLOW TPE OIL-RESISTANT AS-INTERFACE SHAPED CABLE</td>
</tr>
</table>
</html>
After preprocessing the descriptions in the dataset (stop-word removal, punctuation removal, stemming, etc.), a document-term matrix is created, which is in turn converted to a sparse matrix.
Sample of the document-term matrix for the dataset:
     terms
doc  advance  category  .....  twist
1    1        1         .....  0
2    0        1         .....  1
3    0        0         .....  0
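For reference, the preprocessing and conversion described above can be sketched roughly as follows. This is a simplified illustration, not my exact pipeline: the `descriptions` vector and the `stopWords` list are made-up stand-ins, stemming is skipped, and the sparse matrix is built directly with sparseMatrix() from the Matrix package instead of converting a dense document-term matrix.

```r
library(Matrix)

# Hypothetical toy descriptions, shortened from the sample rows above
descriptions <- c(
  "Category 6A, Advanced MaTriX, 4-pair copper cable",
  "Shielded copper cable, category 7, twisted in pairs",
  "SPECIAL DESIGN OIL-RESISTANT SHAPED CABLE"
)

# Illustrative stop-word list (an assumption, not a standard list)
stopWords <- c("in", "to", "the", "a", "of")

# Lower-case, replace punctuation with spaces, then split on whitespace
cleaned <- gsub("[[:punct:]]", " ", tolower(descriptions))
tokens <- strsplit(trimws(cleaned), "\\s+")
tokens <- lapply(tokens, function(tk) tk[!(tk %in% stopWords)])

# Sorted list of distinct terms; these become the matrix columns
vocabulary <- sort(unique(unlist(tokens)))

# One (document, term) index pair per token occurrence
docIndex <- rep(seq_along(tokens), lengths(tokens))
termIndex <- match(unlist(tokens), vocabulary)

# sparseMatrix() adds up duplicated (i, j) pairs, which yields term counts
sparseDtm <- sparseMatrix(
  i = docIndex, j = termIndex, x = 1,
  dims = c(length(tokens), length(vocabulary)),
  dimnames = list(NULL, vocabulary)
)

class(sparseDtm)  # a dgCMatrix, the same class named in the error message
```

The resulting object has one row per document and one column per term, like the sample shown above.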
Question
How can I use a sparse matrix as input to the ranger() function?
Could anyone please help?
Thanks in advance.