153
votes

I am stuck on extracting the value of one column conditional on the value of another column. For example, take the following dataframe:

A  B
p1 1
p1 2
p3 3
p2 4

How can I get the value of A when B=3? Every time I extract the value of A, I get an object (a Series), not a string.

6
I see, I should add item() at the end. - Gejun
df.query and pd.eval seem like good fits for this use case. For information on the pd.eval() family of functions, their features and use cases, please visit Dynamic Expression Evaluation in pandas using pd.eval(). - cs95

6 Answers

263
votes

You could use loc to get the Series of rows that satisfy your condition, and then iloc to get its first element:

In [2]: df
Out[2]:
    A  B
0  p1  1
1  p1  2
2  p3  3
3  p2  4

In [3]: df.loc[df['B'] == 3, 'A']
Out[3]:
2    p3
Name: A, dtype: object

In [4]: df.loc[df['B'] == 3, 'A'].iloc[0]
Out[4]: 'p3'
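
One caveat with .iloc[0] is that it raises an IndexError when no row matches. A minimal sketch of a guard for that case, assuming the same df as above (the helper name first_match and its default argument are hypothetical, not part of pandas):

```python
import pandas as pd

df = pd.DataFrame({"A": ["p1", "p1", "p3", "p2"], "B": [1, 2, 3, 4]})

def first_match(frame, value, default=None):
    """Return the first A where B == value, or default if nothing matches."""
    matches = frame.loc[frame["B"] == value, "A"]
    # .iloc[0] would raise IndexError on an empty Series, so check first.
    return matches.iloc[0] if not matches.empty else default

result = first_match(df, 3)    # a plain Python str
missing = first_match(df, 99)  # no row has B == 99, so the default is returned
```

Whether to return a default or let the error propagate depends on whether a missing match is expected in your data.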
58
votes

You can try query, which is less typing. Note that this still returns a Series, not a single value:

df.query('B==3')['A']
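
To go from that Series to the string itself, one option is the item() call mentioned in the comments. A small sketch, assuming the df from the question; the variable name target is just for illustration, and the @ prefix lets query refer to a local Python variable:

```python
import pandas as pd

df = pd.DataFrame({"A": ["p1", "p1", "p3", "p2"], "B": [1, 2, 3, 4]})

target = 3
# query returns the matching rows; selecting 'A' gives a one-element Series.
selected = df.query("B == @target")["A"]

# item() returns the single element and raises ValueError
# if the Series does not contain exactly one value.
value = selected.item()
```

If more than one row can match, .iloc[0] or .tolist() is probably a better fit than item().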
35
votes

df[df['B']==3]['A'], assuming df is your pandas.DataFrame.

16
votes

Use df[df['B']==3]['A'].values if you want the matching values as a NumPy array instead of a Series; add [0] at the end if you just want the item itself without the brackets.
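
A quick check of what .values gives back, assuming the df from the question. The result is a one-element NumPy array, so indexing is still needed to reach the string:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["p1", "p1", "p3", "p2"], "B": [1, 2, 3, 4]})

arr = df[df["B"] == 3]["A"].values  # NumPy array holding the matching values
scalar = arr[0]                     # the string itself

is_array = isinstance(arr, np.ndarray)
```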

9
votes

Borrowing from the other answers, I find it easier to think of it in these terms. The value you want is located in the series:

df[*column*][*row*]

where column and row point to the value you want returned. For your example, column is 'A', and for row you use a boolean mask:

df['B'] == 3

To get the value from the series there are several options:

df['A'][df['B'] == 3].values[0]
df['A'][df['B'] == 3].iloc[0]
df['A'][df['B'] == 3].to_numpy()[0]
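
All three of these take only the first match. As a sketch of what happens when the mask selects more than one row, here is the reverse lookup on the same data (B where A is 'p1'), which matches two rows:

```python
import pandas as pd

df = pd.DataFrame({"A": ["p1", "p1", "p3", "p2"], "B": [1, 2, 3, 4]})

mask = df["A"] == "p1"          # True for the first two rows
matches = df["B"][mask]

all_values = matches.tolist()   # every matching value as a Python list
first_value = matches.iloc[0]   # only the first matching value
```

So [0] or .iloc[0] silently drops the remaining matches; use .tolist() or keep the Series if you need all of them.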
1
votes
male_avgtip = tips_data.loc[tips_data['sex'] == 'Male', 'tip'].mean()

I used the same kind of conditional filtering and extraction for an assignment: select the 'tip' values where 'sex' is 'Male', then take their mean.
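
A self-contained sketch of the same pattern. The tips_data frame below is a small made-up stand-in (the real dataset is not shown in this answer), and the groupby line is just an alternative way to get the per-group mean:

```python
import pandas as pd

# Hypothetical stand-in for the tips dataset; values are invented.
tips_data = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female"],
    "tip": [2.0, 1.0, 4.0, 3.0],
})

# Filter rows with a boolean mask, pick one column, then aggregate.
male_avgtip = tips_data.loc[tips_data["sex"] == "Male", "tip"].mean()

# Alternative: compute the mean for every group at once.
avg_by_sex = tips_data.groupby("sex")["tip"].mean()
```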