9
votes

I'm learning python recently, and is doing many practice with the language.

One thing I found interesting is that, when I read from an array, it's almost half of the time slower than list. Does somebody know why?

here's my code:

from timeit import Timer
import array

t = 10000
l = range(t)
a = array.array('i', l)
def LIST():
    for i in xrange(t):
        l[i]

def ARRAY():
    for i in xrange(t):
        a[i]

print Timer(LIST).timeit(1000);
print Timer(ARRAY).timeit(1000);

the output is:

0.813191890717
1.16269612312

which indicates that reading array is slower than list. I think array is a fixed size memory, while list is a dynamic structure. So I assumed that array would be faster than list.

Does anyone has any explanation?

3
possible dupe/answer: stackoverflow.com/questions/176011/… - Basically array.array is a wrapper around a C array so I think there is overhead when accessing it. Don't use it for math.NG.
Trying to second-guess Python efficiencies - especially for those coming from a C-like background - is often counter-intuitive. Code clearly first, then optimize if you measure a performance problem; this applies to C also, but because the language elements are so close to the machine people often forget.msw
For math you may want to use numpy (not available for python 3 yet), only god knows why numpy is not standard library.Robert William Hanks
numpy is very close to available on Python 3: mail-archive.com/[email protected]/msg26524.htmlNed Batchelder

3 Answers

10
votes

lists are "dynamically growing vectors" (very much like C++'s std::vector, say) but that in no way slows down random access to them (they're not linked lists!-). Lists' entries are references to Python objects (the items): accessing one just requires (in CPython) an increment of the item's reference count (in other implementations, based on more advanced garbage collection, not even that;-). Array's entries are raw bits and bytes: accessing one requires a fresh new Python object to be synthesized based on that binary value. So e.g.:

$ python -mtimeit -s'import array; c=array.array("B", "bzap")' 'c[2]'
10000000 loops, best of 3: 0.0903 usec per loop
$ python -mtimeit -s'c=list("bzap")' 'c[2]'
10000000 loops, best of 3: 0.0601 usec per loop

30 nanoseconds' extra time for an access doesn't seem too bad;-).

As an aside, note that timeit is MUCH nicer to use from the command line -- automatic choice of repetition, unit of measure shown for the time, etc. That's how I always use it (importing a custom-coded module with functions to be called, if needed -- but here there is no need for that) -- it's so much handier than importing and using it from a module!

8
votes

It takes time to wrap a raw integer into a Python int.

1
votes

The Python lists really resemble some ways normal arrays, they are not Lisp lists, but they have fast random access.