
In ordinary least squares (OLS) estimation, the sample matrix X (of shape N_samples x N_features) is assumed to have "full column rank".

This is apparently needed so that the linear regression problem reduces to a simple closed-form algebraic expression using the Moore–Penrose inverse. See this section of the Wikipedia article on OLS: https://en.wikipedia.org/wiki/Ordinary_least_squares#Estimation
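(For reference, the closed-form expression in question is, in the plain notation used here:

beta_hat = (X^T X)^(-1) X^T y

which only exists when X^T X is invertible, i.e. when X has full column rank.)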

In theory this means that if all columns of X (i.e. all features) are linearly independent, the OLS estimate has this simple closed form, correct?

What does this mean in practice? If the assumption is violated, is OLS simply not computable, resulting in an error for such input data X? Or will the result just be bad? Are there any classical datasets for which linear regression fails because this assumption does not hold?


1 Answer


The full rank assumption is only needed if you use the explicit inverse (or a Cholesky decomposition, QR decomposition, or any other method that is mathematically equivalent to computing the inverse). If you use the Moore–Penrose pseudoinverse you will still get an answer. When the full rank assumption is violated there is no longer a unique solution, i.e. there are many x that minimise

||A*x-b||

The one you will compute with the Moore–Penrose pseudoinverse is the x of minimum norm. See here, for example.
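To make this concrete, here is a small NumPy sketch (my own illustrative example, not from any particular dataset). The design matrix X below is deliberately rank-deficient, its third column being the sum of the first two, yet the SVD-based pseudoinverse still produces the minimum-norm least-squares solution:

```python
import numpy as np

# Rank-deficient design matrix: the third column equals the sum of the
# first two, so X does not have full column rank (rank 2, not 3).
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

print(np.linalg.matrix_rank(X))        # 2: the full-rank assumption is violated

# X^T X is singular here, so the textbook normal-equations formula
# inv(X^T X) @ X^T @ y is not usable.
print(np.linalg.matrix_rank(X.T @ X))  # also 2

# The Moore-Penrose pseudoinverse (computed via the SVD) still returns
# an answer: the minimum-norm x among all minimisers of ||X x - y||.
beta = np.linalg.pinv(X) @ y
print(beta)

# np.linalg.lstsq uses the same SVD machinery and agrees.
beta2, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta, beta2))
```

In practice this is why scikit-learn's LinearRegression, which calls an SVD-based least-squares routine internally, does not error out on collinear features: you silently get the minimum-norm solution, and the individual coefficients of the collinear columns are not individually interpretable.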