0
votes

I want to train a SVM classifier in MATLAB for threat detection. The training data is in Excel file and contains both numeric and text fields/columns. When I export this data to MATLAB, it is either in table or cell format. How do I convert it in matrix format?

P.S: Using xlsread function does not import text data.

2
You could calculate numeric features from that data (both text and numeric). Feed those features into SVM. You have to determine what would be the best way to do this. I have never done this, so its just a suggestion. - Autonomous

2 Answers

0
votes

There are 4 type of attributes in data. Numerical ,discrete , nominal and ordinal. Here you can read more about them . First run an statistical analysis for each feature in your dataset to know the basic statistics such as mean, median, max , min , variable type and if it like nominal or ordinal distinct words and all. So you then have a pretty good idea what you are dealing with.Then according to the variable type you can decide which vectorization we are using.if it is an numerical variable you can divide it into different classes and feature scaling . if it an ordinal variable you can give logical order . if it is nominal variable you can give a identical numerical names. Here , you are just checking how much each feature bring the impact to final prediction

My advice , use Weka GUI too to visualize the data. Then you can pre process the data with column by column

0
votes

You need to transform your text fields into numeric using dummy variables or another technique, or drop them entirely if they actually are id's (e.g. patient name for medical data, record number, respondent uuid for a survey, etc.)

This would actually be easier in R or Python+Pandas, but in Matlab, you will need to perform encoding by yourself, working from the cell array towards a matrix. Or you can try this toolbox.