26
votes

I have a matrix like this:

t = np.array([[1,2,3,'foo'],
 [2,3,4,'bar'],
 [5,6,7,'hello'],
 [8,9,1,'bar']])

I want to get the indices where the rows contain the string 'bar'

In a 1d array

rows = np.where(t == 'bar')

should give me the indices [1,3], followed by indexing:

results = t[rows]

should give me the right rows

But I can't figure out how to get it to work with 2d arrays.

2
What happens instead? What have you tried? - jonrsharpe
Just to check, is this actually how you created your array? Note that what you've done gives an array of strings. If you want a mix of strings and integers, you'll have a record array and it will behave differently. - Andrew Jaffe
I did it as above and got dtype='<U5', which I guess is the smallest datatype numpy managed to fit this array into. Jaime's answer worked, though I had never thought about the rows, cols separation before - Delta_Fore

2 Answers

27
votes

You have to slice the array to the column you want to search:

rows = np.where(t[:,3] == 'bar')
result = t[rows]

This returns:

 [['2', '3', '4', 'bar'],
  ['8', '9', '1', 'bar']]
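
For reference, here is a self-contained sketch of the column-slice approach using the array from the question. The names mask, row_idx and matching are mine, not from the original post. It also shows that the boolean mask can index the array directly, without going through np.where:

```python
import numpy as np

# Array from the question; mixing ints and strings makes NumPy
# store every element as a string.
t = np.array([[1, 2, 3, 'foo'],
              [2, 3, 4, 'bar'],
              [5, 6, 7, 'hello'],
              [8, 9, 1, 'bar']])

# Compare only the last column, giving a 1-D boolean mask.
mask = t[:, 3] == 'bar'

# np.where on a 1-D mask returns a 1-tuple; [0] extracts the index array.
row_idx = np.where(mask)[0]

# Indexing with the mask selects the matching rows.
matching = t[mask]

print(row_idx)
print(matching)
```

If you only need the rows and not their indices, t[mask] alone is enough.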
20
votes

For the general case, where your search string can be in any column, you can do this:

>>> rows, cols = np.where(t == 'bar')
>>> t[rows]
array([['2', '3', '4', 'bar'],
       ['8', '9', '1', 'bar']],
      dtype='|S11')
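
One thing to watch for with this approach: if the search string appears more than once in a row, that row's index shows up more than once in rows, so t[rows] repeats the row. Below is a minimal sketch of one way around this, using a made-up array (not the one from the question) in which 'bar' occurs twice in the first row. The names mask and unique_rows are illustrative only:

```python
import numpy as np

# Hypothetical data: 'bar' appears twice in row 0.
t = np.array([['bar', 2, 3, 'bar'],
              [2, 3, 4, 'foo'],
              [8, 9, 1, 'bar']])

# One (row, col) pair per match, so row 0 is listed twice.
rows, cols = np.where(t == 'bar')

# Collapse the comparison along the columns: True for any row
# that contains 'bar' at least once.
mask = (t == 'bar').any(axis=1)

# Indices of the matching rows, each listed once.
unique_rows = np.flatnonzero(mask)

# Matching rows without duplicates.
result = t[mask]

print(rows)
print(unique_rows)
print(result)
```

np.unique(rows) would give the same de-duplicated indices if you prefer to keep the np.where call.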