I am doing a quantile regression on the engel dataset with rpy2 (2.7.6):
import statsmodels.api as sm
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
pandas2ri.activate()
quantreg = importr('quantreg')
data = sm.datasets.engel.load_pandas().data
qreg = quantreg.rq('foodexp ~ income', data=data, tau=0.5)
However, this generates the following error:
qreg = quantreg.rq('foodexp ~ income', data=data, tau=0.5)
Traceback (most recent call last):
File "<ipython-input-22-02ee1015737c>", line 1, in <module>
quantreg.rq('foodexp ~ income', data=data, tau=0.5)
File "C:\Anaconda\lib\site-packages\rpy2\robjects\functions.py", line 178, in __call__
return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
File "C:\Anaconda\lib\site-packages\rpy2\robjects\functions.py", line 106, in __call__
res = super(Function, self).__call__(*new_args, **new_kwargs)
RRuntimeError: Error in y - x %*% z$coef : non-conformable arrays
From what I understand, non-conformable arrays in this case would mean there are some missing values or the 'arrays' being used are different sizes. I can confirm that this is NOT the case:
data.count()
Out[26]:
income 235
foodexp 235
dtype: int64
data.shape
Out[27]: (235, 2)
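To make my reading of the message concrete, here is a minimal sketch in plain Python of the rule I believe R applies to element-wise arithmetic such as `y - x %*% z$coef`: if both operands carry a `dim` attribute, the dims must be identical, while a plain vector without `dim` is accepted when the lengths match. The helper name `r_subtract_dims` and the `(length, dim)` representation are my own illustration, not part of rpy2 or quantreg, and vector recycling is ignored. It says nothing about what pandas2ri actually produces.

```python
def r_subtract_dims(a, b):
    """Model the dim of `a - b` in R for two operands.

    Each operand is a (length, dim) pair, where dim is None for a plain
    vector and a tuple for an array or matrix. Hypothetical helper for
    illustration only; recycling of shorter vectors is not modelled.
    """
    (a_len, a_dim), (b_len, b_dim) = a, b
    if a_dim is not None and b_dim is not None:
        # Two arrays: R requires exactly the same dim attribute.
        if a_dim != b_dim:
            raise ValueError("non-conformable arrays")
        return a_dim
    if a_len != b_len:
        raise ValueError("lengths differ (recycling not modelled)")
    # At most one operand has dims; the result keeps them.
    return a_dim if a_dim is not None else b_dim


n = 235
fitted = (n, (n, 1))      # x %*% coef: an n-by-1 matrix
y_vector = (n, None)      # response stored as a plain vector
y_1d_array = (n, (n,))    # response stored as a 1-d array with dim = n

print(r_subtract_dims(y_vector, fitted))  # (235, 1)

try:
    r_subtract_dims(y_1d_array, fitted)
except ValueError as err:
    print(err)  # non-conformable arrays
```

So under this rule the same number of rows and no missing values are not sufficient on their own: two operands of equal length can still be rejected if their `dim` attributes differ.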
What else could this error mean? Is it possible that the conversion from DataFrame to data.frame in rpy2 is not working correctly or maybe I'm missing something here? Can anyone else confirm this error?
Just in case, here is some info regarding the versions of R and Python.
R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
Python 2.7.11 |Anaconda 2.3.0 (64-bit)| (default, Dec 7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
Any help would be appreciated.
Edit 1:
If I load the dataset directly from R I don't get an error:
from rpy2.robjects import r
r.data('engel')
data = r['engel']
qreg = quantreg.rq('foodexp ~ income', data=data, tau=0.5)
So I think there is something wrong with the conversion with pandas2ri. The same error occurs when I try to convert the DataFrame to data.frame manually with pandas2ri.py2ri.
Edit 2:
Interestingly enough, if I use the deprecated pandas.rpy.common.convert_to_r_dataframe, the error is gone:
import pandas.rpy.common as com
rdata = com.convert_to_r_dataframe(data)
qreg = quantreg.rq('foodexp ~ income', data=rdata, tau=0.5)
There is definitely a bug in pandas2ri, which is also confirmed here.
rq('foodexp ~ income', data=engel, tau=0.5). I'm wondering if you are actually getting substitution of the engel dataset into the R environment. – IRTFM
pandas2ri conversion by loading the dataset directly from R in Python (see edit). – pbreach