Example:

import cProfile, random, copy
def foo(lIn): return [i*i for i in lIn]
lIn = [random.random() for i in range(1000000)]
lIn1 = copy.copy(lIn)
lIn2 = sorted(lIn1)
cProfile.run('foo(lIn)')
cProfile.run('foo(lIn2)')

Result:

3 function calls in 0.075 seconds

Ordered by: standard name


   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    0.075    0.075 <string>:1(<module>)
        1    0.070    0.070    0.070    0.070 test.py:716(foo)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

3 function calls in 0.143 seconds

Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.143    0.143 <string>:1(<module>)
        1    0.137    0.137    0.137    0.137 test.py:716(foo)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



It doesn't make sense. - xws
It doesn't really seem to have anything to do with the sort. If you call random.shuffle(lIn1) instead of sorting and then run cProfile.run('foo(lIn1)'), you get the same result. - sneep
What caused the processing time to double? - xws
Maybe the first list is still in cache? And you are using lIn, not lIn1 in the first test call. - Graipher

1 Answer

Not really an answer yet, but the comment margin is a bit too small for this.

Since random.shuffle() yields the same result, I implemented my own shuffle function so that I could vary the number of random swaps. (In the example below, that is the argument to xrange, 300000.)

def my_shuffle(array):
    for _ in xrange(300000):
        rand1 = random.randint(0, 999999)
        rand2 = random.randint(0, 999999)
        array[rand1], array[rand2] = array[rand2], array[rand1]

The other code is pretty much unmodified:

import cProfile, random, copy
def foo(lIn): return [i*i for i in lIn]
lIn = [random.random()*100000 for i in range(1000000)]
lIn1 = copy.copy(lIn)
my_shuffle(lIn1)
cProfile.run('foo(lIn)')
cProfile.run('foo(lIn1)')
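
To vary the number of swaps without editing the loop by hand, the experiment can be wrapped in a small harness. This is only a sketch, not the exact code I ran: it targets Python 3 (range instead of xrange), times with time.perf_counter instead of cProfile, and the names swap_randomly, time_foo and sweep are made up for illustration. Absolute timings will differ from machine to machine.

```python
import random
import time


def foo(lIn):
    return [i * i for i in lIn]


def swap_randomly(array, n_swaps):
    """Swap n_swaps randomly chosen pairs of elements in place."""
    last = len(array) - 1
    for _ in range(n_swaps):
        a = random.randint(0, last)
        b = random.randint(0, last)
        array[a], array[b] = array[b], array[a]


def time_foo(lst, repeats=3):
    """Return the best wall-clock time in seconds of foo(lst) over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        foo(lst)
        best = min(best, time.perf_counter() - start)
    return best


def sweep(size, swap_counts):
    """Time foo on copies of one list after different numbers of swaps."""
    base = [random.random() * 100000 for _ in range(size)]
    results = []
    for n_swaps in swap_counts:
        shuffled = list(base)  # shallow copy: same float objects, new list
        swap_randomly(shuffled, n_swaps)
        results.append((n_swaps, time_foo(shuffled)))
    return results


if __name__ == "__main__":
    for n_swaps, seconds in sweep(1000000, [0, 10000, 100000, 800000]):
        print(n_swaps, round(seconds, 3))
```

Taking the best of a few runs is a design choice to reduce noise from other processes; it is not what cProfile reports.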

The time reported by the second cProfile run depended on the number of swaps I performed:

    swaps  time (s)
    10000     0.062
   100000     0.082
   200000     0.099
   400000     0.122
   800000     0.137
  8000000     0.141
 10000000     0.141
100000000     0.248

It looks like the more you scramble the list, the longer the operation takes, up to a certain point. (I'm not sure about the last result. The run took so long that I did some other light work in the background, which may have skewed it, and I don't really want to retry.)
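
One hypothesis that would fit these numbers, in line with the cache comment above, is memory locality: copy.copy and sorted only copy references, so the float objects stay wherever they were first allocated, and walking the list in a different order may mean jumping around in memory. The sketch below is one way to probe that. It assumes CPython, where id() returns an object's memory address; mean_address_jump is a made-up helper, and it only measures how the objects are laid out, not cache behaviour itself.

```python
import random


def mean_address_jump(lst):
    """Average absolute distance in bytes between the addresses of
    consecutive list elements (assumes CPython, where id() is the address)."""
    addresses = [id(x) for x in lst]
    jumps = [abs(b - a) for a, b in zip(addresses, addresses[1:])]
    return sum(jumps) / len(jumps)


original = [random.random() for _ in range(100000)]
reordered = sorted(original)  # same float objects, different visiting order

print("original order:", mean_address_jump(original))
print("sorted order:  ", mean_address_jump(reordered))
```

If the second number comes out much larger than the first, that would be consistent with the locality explanation, though it would not prove it.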