How do I get the mapping out of a trained Spark MLlib StringIndexerModel?
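import org.apache.spark.ml.feature.StringIndexer

// data is my input DataFrame, which has a string column "myCol"
val stringIndexer = new StringIndexer()
  .setInputCol("myCol")
  .setOutputCol("myColIdx")
val stringIndexerModel = stringIndexer.fit(data)
val res = stringIndexerModel.transform(data)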
The code above adds a myColIdx column to my DataFrame, mapping each value in myCol to an index based on that value's frequency, i.e. most frequent value -> 0, second most frequent -> 1, and so on.
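To make the ordering I mean concrete, here is a plain-Scala sketch with no Spark involved. The sample values are made up, and the alphabetical tie-break is only an assumption of this sketch, not something I know the model guarantees:

```scala
// Hypothetical sample values standing in for the contents of myCol.
val values = Seq("a", "b", "a", "c", "a", "b")

// Count occurrences of each value, then order the distinct values
// by descending frequency.
val labelsByFrequency: Seq[String] = values
  .groupBy(identity)
  .map { case (label, occurrences) => (label, occurrences.size) }
  .toSeq
  // Ties are broken alphabetically here; that is an assumption of this sketch.
  .sortBy { case (label, count) => (-count, label) }
  .map(_._1)

// The position of a value in that ordering is its index
// (stored as a Double, like the values in myColIdx).
val mapping: Map[String, Double] = labelsByFrequency.zipWithIndex
  .map { case (label, idx) => label -> idx.toDouble }
  .toMap
```

This value-to-index map is what I would like to read back from the fitted model.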
How do I retrieve that mapping from the model? If I serialize and deserialize the model, will the mapping be stable, i.e. am I guaranteed to get the same result from transform?