In the MLlib version of Random Forest, it was possible to specify which columns hold nominal features (numerical values that are still categorical) with the categoricalFeaturesInfo parameter.
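For reference, categoricalFeaturesInfo is a map from feature index to the number of categories in that feature. Below is a minimal plain-Python sketch of building such a map. The helper name and the toy rows are hypothetical, and it assumes each categorical column already holds integer codes 0..k-1.

```python
def build_categorical_features_info(rows, categorical_indices):
    """Return {feature index: number of categories} for the given columns.

    Assumes every listed column already holds integer codes 0..k-1,
    so the arity is the largest code plus one.
    """
    info = {}
    for idx in categorical_indices:
        codes = {int(row[idx]) for row in rows}
        info[idx] = max(codes) + 1
    return info


# Toy data (made up): columns 0 and 2 are nominal, column 1 is continuous.
toy_rows = [
    [0.0, 12.5, 2.0],
    [1.0, 7.1, 0.0],
    [2.0, 3.3, 1.0],
    [1.0, 9.9, 2.0],
]

categorical_features_info = build_categorical_features_info(toy_rows, [0, 2])
print(categorical_features_info)  # {0: 3, 2: 3}
```

A dictionary of this shape is what gets passed as categoricalFeaturesInfo to RandomForest.trainClassifier in pyspark.mllib.tree.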
What about the ML Random Forest? The user guide has an example that uses VectorIndexer, which also converts the categorical features in a vector, but the comment there reads "Automatically identify categorical features, and index them", so the columns seem to be detected rather than specified explicitly.
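As documented, VectorIndexer decides which features are categorical from the number of distinct values: features with at most maxCategories distinct values are declared categorical, the rest are left as continuous. A rough plain-Python sketch of that rule follows. It is not Spark code, and the function name and data are made up for illustration.

```python
def identify_categorical(rows, max_categories):
    """Return {column index: sorted distinct values} for the columns that
    would be declared categorical under a "distinct values <= max_categories"
    rule. All other columns are treated as continuous.
    """
    n_cols = len(rows[0])
    categorical = {}
    for j in range(n_cols):
        distinct = sorted({row[j] for row in rows})
        if len(distinct) <= max_categories:
            categorical[j] = distinct
    return categorical


# Toy feature vectors (made up).
sample_rows = [
    [0.0, 10.5, 3.0],
    [1.0, 11.2, 7.0],
    [0.0, 12.9, 3.0],
    [1.0, 13.4, 9.0],
    [0.0, 14.8, 7.0],
]

detected = identify_categorical(sample_rows, max_categories=3)
print(detected)  # {0: [0.0, 1.0], 2: [3.0, 7.0, 9.0]}
```

Column 1 has five distinct values, which exceeds the threshold of 3, so it is left as continuous. Note that the rule looks only at the count of distinct values, not at what the column actually means.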
In another discussion of the same problem, I read that random forest treats numerical indexes as continuous features anyway, and that one-hot encoding is recommended to avoid this. That does not seem to make sense for this algorithm, especially given the official example mentioned above!
I also noticed that when a categorical column has a lot of categories (>1000) and is indexed with StringIndexer, the random forest algorithm asks me to set the maxBins parameter, which is supposed to be used with continuous features. Does this mean that features with more categories than the number of bins will be treated as continuous, as stated in the official example, so that StringIndexer is fine for my categorical column? Or does it mean that the whole column of numerical but still nominal values will be bucketized under the assumption that the variable is continuous?
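For context, the Spark parameter documentation describes maxBins as the maximum number of bins for discretizing continuous features and adds that it must be at least the number of categories in any categorical feature (the default is 32). A minimal plain-Python sketch of that constraint follows. The helper name and the category counts are hypothetical, and this is not Spark's implementation.

```python
def features_exceeding_max_bins(category_counts, max_bins=32):
    """Return the indices of categorical features whose number of
    categories is larger than max_bins, in ascending index order.

    category_counts maps feature index -> number of categories.
    """
    return [idx for idx, k in sorted(category_counts.items()) if k > max_bins]


# Hypothetical metadata: feature 0 has 3 categories, feature 5 has 1200.
counts = {0: 3, 5: 1200}

too_large_default = features_exceeding_max_bins(counts)            # default 32
too_large_raised = features_exceeding_max_bins(counts, max_bins=1200)

print(too_large_default)  # [5]
print(too_large_raised)   # []
```

Under this reading, a StringIndexer column with more than 1000 categories violates the constraint at the default setting and satisfies it once maxBins is raised to the number of categories or above.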