python - Numpy: find first index of value fast

115

votes

How can I find the index of the first occurrence of a number in a Numpy array? Speed is important to me. I am not interested in the following answers because they scan the whole array and don't stop when they find the first occurrence:

itemindex = numpy.where(array==item)[0][0]
nonzero(array == item)[0][0]

Note 1: none of the answers from that question seem relevant Is there a Numpy function to return the first index of something in an array?

Note 2: using a C-compiled method is preferred to a Python loop.

pythonnumpyfind

63

votes

There is a feature request for this scheduled for Numpy 2.0.0: https://github.com/numpy/numpy/issues/2269

32

votes

Although it is way too late for you, but for future reference: Using numba (1) is the easiest way until numpy implements it. If you use anaconda python distribution it should already be installed. The code will be compiled so it will be fast.

@jit(nopython=True)
def find_first(item, vec):
    """return the index of the first occurence of item in vec"""
    for i in xrange(len(vec)):
        if item == vec[i]:
            return i
    return -1

and then:

>>> a = array([1,7,8,32])
>>> find_first(8,a)
2

24

votes

I've made a benchmark for several methods:

argwhere
nonzero as in the question
.tostring() as in @Rob Reilink's answer
python loop
Fortran loop

The Python and Fortran code are available. I skipped the unpromising ones like converting to a list.

The results on log scale. X-axis is the position of the needle (it takes longer to find if it's further down the array); last value is a needle that's not in the array. Y-axis is the time to find it.

The array had 1 million elements and tests were run 100 times. Results still fluctuate a bit, but the qualitative trend is clear: Python and f2py quit at the first element so they scale differently. Python gets too slow if the needle is not in the first 1%, whereas f2py is fast (but you need to compile it).

To summarize, f2py is the fastest solution, especially if the needle appears fairly early.

It's not built in which is annoying, but it's really just 2 minutes of work. Add this to a file called search.f90:

subroutine find_first(needle, haystack, haystack_length, index)
    implicit none
    integer, intent(in) :: needle
    integer, intent(in) :: haystack_length
    integer, intent(in), dimension(haystack_length) :: haystack
!f2py intent(inplace) haystack
    integer, intent(out) :: index
    integer :: k
    index = -1
    do k = 1, haystack_length
        if (haystack(k)==needle) then
            index = k - 1
            exit
        endif
    enddo
end

If you're looking for something other than integer, just change the type. Then compile using:

f2py -c -m search search.f90

after which you can do (from Python):

import search
print(search.find_first.__doc__)
a = search.find_first(your_int_needle, your_int_array)

11

votes

You can convert a boolean array to a Python string using array.tostring() and then using the find() method:

(array==item).tostring().find('\x01')

This does involve copying the data, though, since Python strings need to be immutable. An advantage is that you can also search for e.g. a rising edge by finding \x00\x01

11

votes

In case of sorted arrays np.searchsorted works.

9

votes

If you are looking for the first non-zero element you can use a following hack:

idx = x.view(bool).argmax() // x.itemsize
idx = idx if x[idx] else -1

It is a very fast "numpy-pure" solution but it fails for some cases discussed below.

The solution takes advantage from the fact that pretty much all representation of zero for numeric types consists of 0 bytes. It applies to numpy's bool as well. In recent versions of numpy, argmax() function uses short-circuit logic when processing the bool type. The size of bool is 1 byte.

So one needs to:

create a view of the array as bool. No copy is created
use argmax() to find the first non-zero byte using short-circuit logic
recalculate the offset of this byte to the index of the first non-zero element by integer division (operator //) of the offset by a size of a single element expressed in bytes (x.itemsize)
check if x[idx] is actually non-zero to identify the case when no non-zero is present

I've made some benchmark against numba solution and build it np.nonzero.

import numpy as np
from numba import jit
from timeit import timeit

def find_first(x):
    idx = x.view(bool).argmax() // x.itemsize
    return idx if x[idx] else -1

@jit(nopython=True)
def find_first_numba(vec):
    """return the index of the first occurence of item in vec"""
    for i in range(len(vec)):
        if vec[i]:
            return i
    return -1


SIZE = 10_000_000
# First only
x = np.empty(SIZE)

find_first_numba(x[:10])

print('---- FIRST ----')
x[:] = 0
x[0] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=1000), 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=1000), 'ms')

print('---- LAST ----')
x[:] = 0
x[-1] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- NONE ----')
x[:] = 0
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- ALL ----')
x[:] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

The result on my machine are:

---- FIRST ----
ndarray.nonzero 57.63976670001284 ms
find_first 0.0010841979965334758 ms
find_first_numba 0.0002308919938514009 ms
---- LAST ----
ndarray.nonzero 58.96685277999495 ms
find_first 5.923203580023255 ms
find_first_numba 8.762269750004634 ms
---- NONE ----
ndarray.nonzero 25.13398071998381 ms
find_first 5.924289370013867 ms
find_first_numba 8.810063839919167 ms
---- ALL ----
ndarray.nonzero 55.181210660084616 ms
find_first 0.001246920000994578 ms
find_first_numba 0.00028766007744707167 ms

The solution is 33% faster than numba and it is "numpy-pure".

The disadvantages:

does not work for numpy acceptable types like object
fails for negative zero that occasionally appears in float or double computations

7

votes

I think you have hit a problem where a different method and some a priori knowledge of the array would really help. The kind of thing where you have a X probability of finding your answer in the first Y percent of the data. The splitting up the problem with the hope of getting lucky then doing this in python with a nested list comprehension or something.

Writing a C function to do this brute force isn't too hard using ctypes either.

The C code I hacked together (index.c):

long index(long val, long *data, long length){
    long ans, i;
    for(i=0;i<length;i++){
        if (data[i] == val)
            return(i);
    }
    return(-999);
}

and the python:

# to compile (mac)
# gcc -shared index.c -o index.dylib
import ctypes
lib = ctypes.CDLL('index.dylib')
lib.index.restype = ctypes.c_long
lib.index.argtypes = (ctypes.c_long, ctypes.POINTER(ctypes.c_long), ctypes.c_long)

import numpy as np
np.random.seed(8675309)
a = np.random.random_integers(0, 100, 10000)
print lib.index(57, a.ctypes.data_as(ctypes.POINTER(ctypes.c_long)), len(a))

and I get 92.

Wrap up the python into a proper function and there you go.

The C version is a lot (~20x) faster for this seed (warning I am not good with timeit)

import timeit
t = timeit.Timer('np.where(a==57)[0][0]', 'import numpy as np; np.random.seed(1); a = np.random.random_integers(0, 1000000, 10000000)')
t.timeit(100)/100
# 0.09761879920959472
t2 = timeit.Timer('lib.index(57, a.ctypes.data_as(ctypes.POINTER(ctypes.c_long)), len(a))', 'import numpy as np; np.random.seed(1); a = np.random.random_integers(0, 1000000, 10000000); import ctypes; lib = ctypes.CDLL("index.dylib"); lib.index.restype = ctypes.c_long; lib.index.argtypes = (ctypes.c_long, ctypes.POINTER(ctypes.c_long), ctypes.c_long) ')
t2.timeit(100)/100
# 0.005288000106811523

6

votes

@tal already presented a numba function to find the first index but that only works for 1D arrays. With np.ndenumerate you can also find the first index in an arbitarly dimensional array:

from numba import njit
import numpy as np

@njit
def index(array, item):
    for idx, val in np.ndenumerate(array):
        if val == item:
            return idx
    return None

Sample case:

>>> arr = np.arange(9).reshape(3,3)
>>> index(arr, 3)
(1, 0)

Timings show that it's similar in performance to tals solution:

arr = np.arange(100000)
%timeit index(arr, 5)           # 1000000 loops, best of 3: 1.88 µs per loop
%timeit find_first(5, arr)      # 1000000 loops, best of 3: 1.7 µs per loop

%timeit index(arr, 99999)       # 10000 loops, best of 3: 118 µs per loop
%timeit find_first(99999, arr)  # 10000 loops, best of 3: 96 µs per loop

3

votes

If your list is sorted, you can achieve very quick search of index with the 'bisect' package. It's O(log(n)) instead of O(n).

bisect.bisect(a, x)

finds x in the array a, definitely quicker in the sorted case than any C-routine going through all the first elements (for long enough lists).

It's good to know sometimes.

3

votes

This problem can be effectively solved in pure numpy by processing the array in chunks:

def find_first(x):
    idx, step = 0, 32
    while idx < x.size:
        nz, = x[idx: idx + step].nonzero()
        if len(nz): # found non-zero, return it
            return nz[0] + idx
        # move to the next chunk, increase step
        idx += step
        step = min(9600, step + step // 2)
    return -1

The array is processed in chunk of size step. The step longer the step is, the faster is processing of zeroed-array (worst case). The smaller it is, the faster processing of array with non-zero at the start. The trick is to start with a small step and increase it exponentially. Moreover, there is no need to increment it above some threshold due to limited benefits.

I've compared the solution with pure ndarary.nonzero and numba solution against 10 million array of floats.

import numpy as np
from numba import jit
from timeit import timeit

def find_first(x):
    idx, step = 0, 32
    while idx < x.size:
        nz, = x[idx: idx + step].nonzero()
        if len(nz):
            return nz[0] + idx
        idx += step
        step = min(9600, step + step // 2)
    return -1

@jit(nopython=True)
def find_first_numba(vec):
    """return the index of the first occurence of item in vec"""
    for i in range(len(vec)):
        if vec[i]:
            return i
    return -1


SIZE = 10_000_000
# First only
x = np.empty(SIZE)

find_first_numba(x[:10])

print('---- FIRST ----')
x[:] = 0
x[0] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=1000), 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=1000), 'ms')

print('---- LAST ----')
x[:] = 0
x[-1] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- NONE ----')
x[:] = 0
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- ALL ----')
x[:] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

And results on my machine:

---- FIRST ----
ndarray.nonzero 54.733994480002366 ms
find_first 0.0013148509997336078 ms
find_first_numba 0.0002839310000126716 ms
---- LAST ----
ndarray.nonzero 54.56336712999928 ms
find_first 25.38929685000312 ms
find_first_numba 8.022820680002951 ms
---- NONE ----
ndarray.nonzero 24.13432420999925 ms
find_first 25.345200140000088 ms
find_first_numba 8.154927100003988 ms
---- ALL ----
ndarray.nonzero 55.753537260002304 ms
find_first 0.0014760300018679118 ms
find_first_numba 0.0004358099977253005 ms

Pure ndarray.nonzero is definite looser. The numba solution is circa 5 times faster for the best case. It is circa 3 times faster in the worst case.

2

votes

As far as I know only np.any and np.all on boolean arrays are short-circuited.

In your case, numpy has to go through the entire array twice, once to create the boolean condition and a second time to find the indices.

My recommendation in this case would be to use cython. I think it should be easy to adjust an example for this case, especially if you don't need much flexibility for different dtypes and shapes.

2

votes

I needed this for my job so I taught myself Python and Numpy's C interface and wrote my own. http://pastebin.com/GtcXuLyd It's only for 1-D arrays, but works for most data types (int, float, or strings) and testing has shown it is again about 20 times faster than the expected approach in pure Python-numpy.

1

votes

As a longtime matlab user I a have been searching for an efficient solution to this problem for quite a while. Finally, motivated by discussions a propositions in this thread I have tried to come up with a solution that is implementing an API similar to what was suggested here, supporting for the moment only 1D arrays.

You would use it like this

import numpy as np
import utils_find_1st as utf1st
array = np.arange(100000)
item = 1000
ind = utf1st.find_1st(array, item, utf1st.cmp_larger_eq)

The condition operators supported are: cmp_equal, cmp_not_equal, cmp_larger, cmp_smaller, cmp_larger_eq, cmp_smaller_eq. For efficiency the extension is written in c.

You find the source, benchmarks and other details here:

https://pypi.python.org/pypi?name=py_find_1st&:action=display

for the use in our team (anaconda on linux and macos) I have made an anaconda installer that simplifies installation, you may use it as described here

https://anaconda.org/roebel/py_find_1st

0

votes

Just a note that if you are doing a sequence of searches, the performance gain from doing something clever like converting to string, might be lost in the outer loop if the search dimension isn't big enough. See how the performance of iterating find1 that uses the string conversion trick proposed above and find2 that uses argmax along the inner axis (plus an adjustment to ensure a non-match returns as -1)

import numpy,time
def find1(arr,value):
    return (arr==value).tostring().find('\x01')

def find2(arr,value): #find value over inner most axis, and return array of indices to the match
    b = arr==value
    return b.argmax(axis=-1) - ~(b.any())


for size in [(1,100000000),(10000,10000),(1000000,100),(10000000,10)]:
    print(size)
    values = numpy.random.choice([0,0,0,0,0,0,0,1],size=size)
    v = values>0

    t=time.time()
    numpy.apply_along_axis(find1,-1,v,1)
    print('find1',time.time()-t)

    t=time.time()
    find2(v,1)
    print('find2',time.time()-t)

outputs

(1, 100000000)
('find1', 0.25300002098083496)
('find2', 0.2780001163482666)
(10000, 10000)
('find1', 0.46200013160705566)
('find2', 0.27300000190734863)
(1000000, 100)
('find1', 20.98099994659424)
('find2', 0.3040001392364502)
(10000000, 10)
('find1', 206.7590000629425)
('find2', 0.4830000400543213)

That said, a find written in C would be at least a little faster than either of these approaches

0

votes

how about this

import numpy as np
np.amin(np.where(array==item))

-1

votes

You can covert your array into a list and use it's index() method:

i = list(array).index(item)

As far as I'm aware, this is a C compiled method.

python - Numpy: find first index of value fast

16 Answers