
I have read that it is recommended to run feature selection after the feature extraction process.

But there is something missing in all the posts I read:

  • Suppose we have 50 features.

  • Suppose we use feature extraction and get 3 new extracted features.

2 questions:

  1. Do we need to run the feature selection on:

    • The 3 extracted features?

    or

    • The total of 53 features (50 base features + 3 extracted features)?
  2. Suppose we run feature selection on the output of feature extraction and the selection algorithm does not choose all of the new extracted features (e.g. 2 out of 3). Does that mean the output of feature extraction was not good enough, since one of the 'new' dimensions can be dropped?


1 Answer

  1. Feature selection is run on the full feature set you intend to give the model, in order to pick the most useful features.

    If you used dimensionality reduction to produce 3 additional features and kept the 50 originals, then your feature set consists of all 53 features, so you run feature selection on those 53.

  2. This reads more like a statement than a question. If you mean that, because one of the extracted features was not selected, you can re-run feature extraction (dimensionality reduction) with one fewer output feature: no, you cannot assume that re-running the extraction with 2 features guarantees that both of them will be selected. It depends on the dimensionality reduction technique, the feature selection scheme, the dataset, and so on.
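
As a rough illustration of both points, here is a minimal sketch using scikit-learn. The specific choices are assumptions made for the example and do not come from the question: synthetic data from `make_classification` stands in for the 50 base features, `PCA` is the extraction step, and `SelectKBest` with `f_classif` and `k=10` is the selection step. In real use you would fit these steps on training data only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in for a dataset with 50 base features (assumption).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=8, random_state=0
)

# Feature extraction: derive 3 new features from the 50 originals.
X_extracted = PCA(n_components=3, random_state=0).fit_transform(X)

# Combined feature set: 50 base + 3 extracted = 53 columns.
X_combined = np.hstack([X, X_extracted])

# Feature selection over all 53 columns (k=10 is an arbitrary choice).
selector = SelectKBest(score_func=f_classif, k=10).fit(X_combined, y)
selected = selector.get_support(indices=True)

# Columns 50, 51 and 52 are the extracted features; count how many survived.
n_extracted_kept = int(np.sum(selected >= 50))

print("combined shape:", X_combined.shape)
print("selected column indices:", selected)
print("extracted features kept:", n_extracted_kept)
```

How many of the 3 extracted columns end up selected will vary with the data, the extraction method, and the selection criterion, which is the point made in item 2: an extracted feature being dropped here says nothing about what would happen if you re-ran the extraction with 2 components.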