I have a dataset with 8 mixed features (6 numeric and 2 categorical). Since the numeric features have different ranges, I need to normalize the dataset as a whole before taking further steps such as applying machine learning algorithms or dimensionality reduction (feature extraction).
My original dataset:
time v1 v2 v3 ... v7 v8
00:00:01 15435 0.7 13 ... High True
00:00:06 24356 3.6 23 ... High True
00:00:11 25567 8.3 82 ... LOW False
00:00:16 12345 5.4 110 ... LOW True
00:00:21 43246 1.7 93 ... High False
................................................
23:23:59 23456 3.8 45 ... LOW False
where v1 to v6 are numerical variables whose values lie in different ranges, as can be seen above. v7 and v8 are categorical variables that take only two values each ({High, LOW} for v7 and {True, False} for v8).
I label-encoded the categorical variables (v7 and v8): High and True were encoded as 1, and LOW and False were encoded as 0.
The dataset looks like this after label encoding:
time v1 v2 v3 ... v7 v8
00:00:01 15435 0.7 13 ... 1 1
00:00:06 24356 3.6 23 ... 1 1
00:00:11 25567 8.3 82 ... 0 0
00:00:16 12345 5.4 110 ... 0 1
00:00:21 43246 1.7 93 ... 1 0
................................................
23:23:59 23456 3.8 45 ... 0 0
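For reference, the label encoding described above can be sketched roughly as follows. This is a minimal sketch, assuming pandas, assuming v8 is stored as booleans, and using only the first five rows and the columns visible above (v4 to v6 are elided, so they are left out). The names `df` and `encoded` are just for illustration.

```python
import pandas as pd

# Hypothetical sample built from the first five rows shown above;
# the elided columns (v4 to v6) are omitted.
df = pd.DataFrame({
    "time": ["00:00:01", "00:00:06", "00:00:11", "00:00:16", "00:00:21"],
    "v1": [15435, 24356, 25567, 12345, 43246],
    "v2": [0.7, 3.6, 8.3, 5.4, 1.7],
    "v3": [13, 23, 82, 110, 93],
    "v7": ["High", "High", "LOW", "LOW", "High"],
    "v8": [True, True, False, True, False],  # assumption: booleans, not strings
})

encoded = df.copy()
# High -> 1, LOW -> 0
encoded["v7"] = encoded["v7"].map({"High": 1, "LOW": 0})
# True -> 1, False -> 0
encoded["v8"] = encoded["v8"].astype(int)

print(encoded)
```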
My question is as follows: standardizing the numerical features v1 to v6 is straightforward. However, I am not sure whether the encoded categorical features (v7 and v8) should be standardized as well and, if so, what the best way to do that would be.
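To make the "easy" part concrete, here is a minimal sketch of z-score standardization applied only to the numeric columns. It assumes pandas, uses only the first five rows with v1 to v3 (v4 to v6 are elided above), and uses the population standard deviation (`ddof=0`). It leaves v7 and v8 untouched only because that is the open question, not because I know that to be the right choice.

```python
import pandas as pd

# Hypothetical sample: the first five rows after label encoding,
# with the elided columns (v4 to v6) omitted.
df = pd.DataFrame({
    "v1": [15435, 24356, 25567, 12345, 43246],
    "v2": [0.7, 3.6, 8.3, 5.4, 1.7],
    "v3": [13, 23, 82, 110, 93],
    "v7": [1, 1, 0, 0, 1],
    "v8": [1, 1, 0, 1, 0],
})

numeric_cols = ["v1", "v2", "v3"]  # v4 to v6 would be listed here as well

scaled = df.copy()
numeric = df[numeric_cols].astype(float)
# z-score: subtract each column's mean and divide by its standard deviation
scaled[numeric_cols] = (numeric - numeric.mean()) / numeric.std(ddof=0)

# v7 and v8 are passed through unchanged in this sketch
print(scaled)
```

After this step each numeric column has mean 0 and standard deviation 1, while v7 and v8 remain 0/1.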