I am looking to gain some insight into my data. I am converting my documents into a vector space model (VSM), reducing it with sklearn's PCA, and plotting the result with matplotlib. This involves:
Converting the documents to a numeric matrix using a pipeline
test = pipeline.fit_transform(docs).toarray()
Fitting the PCA model to that matrix
pca = PCA().fit(test)
Then projecting the matrix onto the principal components using transform
data = pca.transform(test)
Finally I am plotting the results using Matplotlib
plt.scatter(data[:,0], data[:,1], c = categories)
My question is this: how do I take new sentences and determine where they would lie relative to the documents already plotted, using an X to mark their positions?
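To make the question concrete, here is a minimal sketch of what I imagine this could look like. It assumes the pipeline is a single TfidfVectorizer step (my real pipeline may differ), uses toy documents, and limits PCA to two components since only two are plotted. The names new_docs and new_data are made up for illustration. The idea is to call transform, not fit_transform, on the new sentences so they are projected with the already fitted pipeline and PCA:

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the real documents and labels (hypothetical data).
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold shares as markets dropped",
]
categories = [0, 0, 1, 1]

# Assumption: the pipeline is just a TF-IDF vectorizer.
pipeline = make_pipeline(TfidfVectorizer())

# Fit the pipeline and PCA on the original documents only.
test = pipeline.fit_transform(docs).toarray()
pca = PCA(n_components=2).fit(test)  # assumption: two components, as plotted
data = pca.transform(test)

# New sentences: transform only (no refitting), so they are mapped into the
# same vocabulary and the same principal-component axes as the originals.
new_docs = ["my cat chased the dog", "markets rallied"]
new_data = pca.transform(pipeline.transform(new_docs).toarray())

fig, ax = plt.subplots()
ax.scatter(data[:, 0], data[:, 1], c=categories)
# Mark the new sentences with an X.
ax.scatter(new_data[:, 0], new_data[:, 1], marker="x", c="red", s=80)
fig.savefig("pca_projection.png")  # or plt.show() in an interactive session
```

Words in the new sentences that the vectorizer never saw during fitting are ignored in this sketch, so a sentence made up entirely of unseen words would map to the same point as an empty document.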
Thanks