What do fit, transform, and fit_transform do in the PCA class in sklearn.decomposition?

Question

I am trying to mimic the behavior of the PCA class available in sklearn.decomposition.

I have written a method that computes the SVD, but I am not sure what fit(), transform(), and fit_transform() do, and without that I can't proceed further.

I think fit() computes the SVD, and the singular values can then be accessed through the singular_values_ attribute, but I don't know what the remaining two methods do.

What classes do you inherit from? Only BaseEstimator and TransformerMixin, or another one? — Arne
I haven't inherited anything. I'm trying to build from scratch using only basic numpy functions. — Akhilesh Pandey
If you want to use an estimator or transformer in a sklearn context (in a pipeline, in a grid search, etc.), you need to inherit from those mixins. This article gives lots of info about how to implement parts of the data pipeline yourself. — Arne

Tim · Accepted Answer · 2018-09-12T11:14:32

In the docs you can see a general explanation of fit(), transform(), and fit_transform():

[...] a fit method, which learns model parameters (e.g. mean and standard deviation for normalization) from a training set, and a transform method which applies this transformation model to unseen data. fit_transform may be more convenient and efficient for modelling and transforming the training data simultaneously.
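For PCA specifically, that general description could be sketched with plain NumPy roughly as follows. This is only a minimal illustration, not sklearn's actual implementation: the class name SimplePCA is hypothetical, the attribute names mean_, components_, and singular_values_ are chosen to echo sklearn's, and details such as sign conventions and solver selection are ignored, so the signs of the projected columns may differ from what sklearn returns.

```python
import numpy as np


class SimplePCA:
    """Hypothetical minimal PCA; method names mirror sklearn's, nothing more."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        # Learn the parameters from the training data: the per-feature mean
        # and the principal axes (leading right singular vectors of centered X).
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        _, s, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        self.singular_values_ = s[: self.n_components]
        return self

    def transform(self, X):
        # Apply the already-learned centering and projection to any data,
        # including data that was not seen during fit().
        X = np.asarray(X, dtype=float)
        return (X - self.mean_) @ self.components_.T

    def fit_transform(self, X):
        # Convenience: learn from X, then project that same X.
        return self.fit(X).transform(X)


rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
X_new = rng.normal(size=(10, 5))

pca = SimplePCA(n_components=2)
Z_train = pca.fit_transform(X_train)  # fit on the training set and project it
Z_new = pca.transform(X_new)          # reuse the learned mean and axes
```

In this sketch, fit() only stores state, transform() only uses it, and fit_transform(X) gives the same result as fit(X) followed by transform(X).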
