I have a dataframe that looks something like the one below.

   Date       Company_Name    Rtns    Sentiment    Market_cap    Beta
0  1/1/2000   abc             0.2     1            1234          1.0
1  1/2/2000   abc             0.5     1            1221          1.0
2  1/3/2000   abc             0.4     0            1532          1.2
       .
       .
50 1/12/2011  abc             0.02    1            1211          0.9
51 1/1/2001   def             0.03    0             118          1.6
52 1/2/2001   def             0.13    0             117          1.2
53 1/3/2001   def             0.02    1             117          1.3

I am trying to run an OLS regression for one company at a time (i.e. one regression for company abc, one for company def), with Rtns regressed on all the other variables. This is what I have so far (I used a for loop, but I'm not sure how to use the index or the company names to get one regression per company):

 y = df['Rtns']
 x = df[['Sentiment', 'Market_cap', 'Beta']]

 summ= []
 for i in df:
    model = sm.OLS((y,x)).fit()
    summ.append(model.summary())

The output I got was the same regression model repeated over and over.

I'm not sure how to run the regression separately for each company (i.e. one regression result for abc and one for def).

I have also used the groupby function to group the companies, but I'm not sure how to proceed from there.

Thanks to anyone who is able to help out.

1 Answer

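Something like the code below should do it: filter the data for each company, then fit the OLS on that subset. Note that `for i in df` in the question iterates over the column names, and y and x never change inside the loop, so the same full-sample model is fitted every time.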

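 import statsmodels.api as sm

 lst = ['abc', 'def']
 results = {}

 for i in lst:
     # keep only the rows for this company
     tmp = df[df['Company_Name'] == i]
     y = tmp['Rtns']
     x = tmp[['Sentiment', 'Market_cap', 'Beta']]
     # sm.OLS takes the response and the regressors as two separate arguments
     model = sm.OLS(y, x).fit()
     results[i] = model  # one fitted model per company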