I'm trying to use a saved MLlib model to predict sentiment on live streaming data.
I've tried every recommendation I could find, but I still get errors. The current error is: Field "features" does not exist.
The schema of the training data is:
root
 |-- label: double (nullable = true)
 |-- words: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- features: vector (nullable = true)
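from pyspark.ml.classification import NaiveBayesModel

# Read the Kafka topic as a streaming DataFrame and keep only the message value
lines = spark\
    .readStream\
    .format("kafka")\
    .option("kafka.bootstrap.servers", bootstrapServers)\
    .option("subscribe", topics)\
    .load()\
    .selectExpr("CAST(value AS STRING)")
# <class 'pyspark.sql.dataframe.DataFrame'>
read_data = lines.selectExpr("CAST(value AS STRING) as text")

# Load the saved model and score the stream (the transform call is where the error is raised)
model_nb = NaiveBayesModel.load("./myNBmodel")
prediction = model_nb.transform(read_data)
print(prediction.schema)

# Send each predicted row to process_row
query1 = prediction.writeStream \
    .outputMode("update") \
    .foreach(process_row) \
    .start()
query1.awaitTermination()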
The call that fails, and the error it raises:
prediction = model_nb.transform(read_data)
Py4JJavaError: An error occurred while calling o133.transform.
: java.lang.IllegalArgumentException: Field "features" does not exist. Available fields: text
The fetched data shouldn't need a features column in order to get a prediction, right?