Note: I'm a Ruby developer trying to find my way in Python.
When I wanted to figure out why some scripts use mylist[:] instead of list(mylist) to duplicate lists, I made a quick benchmark of the various methods to duplicate range(10) (see code below).
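For readers who do not know the idioms, here is a minimal sketch of the four ways of duplicating a list that are compared here. It is written for Python 3, which is an assumption (the benchmark itself ran on Python 2.7.2), so range(10) is wrapped in list(). The names `copies` and `all_are_distinct_equal_copies` are only illustrative.

```python
import copy

a = list(range(10))  # Python 3: range() is lazy, so build a real list first

# The four duplication idioms from the benchmark, keyed by their label.
copies = {
    "list(a)": list(a),
    "copy.copy(a)": copy.copy(a),
    "a[:]": a[:],
    "a[0:len(a)]": a[0:len(a)],
}

def all_are_distinct_equal_copies(original, candidates):
    """True if every candidate has equal contents but is a different object."""
    return all(dup == original and dup is not original
               for dup in candidates.values())

print(all_are_distinct_equal_copies(a, copies))
```

All four produce shallow copies, so the benchmark compares only how the copy is requested, not what is copied.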
EDIT: I updated the tests to make use of Python's timeit as suggested below. This makes it impossible to compare the results directly to Ruby, because timeit keeps the overhead of its own loop minimal, while the Ruby timings include the cost of the COUNT.times loop. The Ruby code is therefore for reference only.
Python 2.7.2
Array duplicating. Tests run 50000000 times
list(a)         18.7599430084
copy(a)         59.1787488461
a[:]            9.58828091621
a[0:len(a)]     14.9832749367
For reference, I wrote the same script in Ruby too:
Ruby 1.9.2p0
Array duplicating. Tests 50000000 times
                      user     system      total        real
Array.new(a)     14.590000   0.030000  14.620000 ( 14.693033)
Array[*a]        18.840000   0.060000  18.900000 ( 19.156352)
a.take(a.size)    8.780000   0.020000   8.800000 (  8.805700)
a.clone          16.310000   0.040000  16.350000 ( 16.384711)
a[0,a.size]       8.950000   0.020000   8.970000 (  8.990514)
Question 1: What is mylist[:] doing differently that makes it 25 % faster than even mylist[0:len(mylist)]? Does it copy in memory directly or what?
Question 2 (edit: the updated benchmarks no longer show huge differences between Python and Ruby; the original question was): Did I implement the tests in some obviously inefficient way, so that the Ruby code is so much faster than Python?
Now the code listings:
Python:
import timeit
COUNT = 50000000
print "Array duplicating. Tests run", COUNT, "times"
setup = 'a = range(10); import copy'
print "list(a)\t\t", timeit.timeit(stmt='list(a)', setup=setup, number=COUNT)
print "copy(a)\t\t", timeit.timeit(stmt='copy.copy(a)', setup=setup, number=COUNT)
print "a[:]\t\t", timeit.timeit(stmt='a[:]', setup=setup, number=COUNT)
print "a[0:len(a)]\t", timeit.timeit(stmt='a[0:len(a)]', setup=setup, number=COUNT)
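The listing above is Python 2 code. For anyone who wants to repeat the measurement on Python 3, a rough equivalent might look like the sketch below. It makes several assumptions that are not in the original: `a` is built with list(range(10)), the iteration count is much smaller so the script finishes quickly, and timeit.repeat with min() is used to reduce noise. The helper name `run_benchmarks` is hypothetical.

```python
import timeit

COUNT = 10000  # assumption: far fewer iterations than the 50000000 used above
SETUP = "a = list(range(10)); import copy"
STATEMENTS = ["list(a)", "copy.copy(a)", "a[:]", "a[0:len(a)]"]

def run_benchmarks(count=COUNT, repeats=3):
    """Return {statement: best time in seconds over `repeats` runs}."""
    results = {}
    for stmt in STATEMENTS:
        timings = timeit.repeat(stmt=stmt, setup=SETUP,
                                number=count, repeat=repeats)
        results[stmt] = min(timings)  # the fastest run is the least disturbed
    return results

if __name__ == "__main__":
    print("Array duplicating. Tests run", COUNT, "times")
    for stmt, seconds in run_benchmarks().items():
        print("{:<16}{:.6f}".format(stmt, seconds))
```

Absolute numbers from this sketch are not comparable with the Python 2.7.2 figures above. Only the relative ordering of the four statements on one machine is meaningful.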
Ruby:
require 'benchmark'
a = (0...10).to_a
COUNT = 50_000_000
puts "Array duplicating. Tests #{COUNT} times"
Benchmark.bm(16) do |x|
  x.report("Array.new(a)")   { COUNT.times { Array.new(a) } }
  x.report("Array[*a]")      { COUNT.times { Array[*a] } }
  x.report("a.take(a.size)") { COUNT.times { a.take(a.size) } }
  x.report("a.clone")        { COUNT.times { a.clone } }
  x.report("a[0,a.size]")    { COUNT.times { a[0,a.size] } }
end
[...] timeit module to measure python execution times. I doubt it'll make things (much) faster but it'll avoid all the usual timing traps. – Martijn Pieters♦
alist[:] versus alist[0:len(alist)]; the latter creates python int objects, something the former method doesn't need to deal with. – Martijn Pieters♦
[...] len (and call it) each time – mgilson
Array(a) does not duplicate an array. When given an array it just calls to_ary on it, which returns self. You should also use Ruby's Benchmark library instead of doing your timing manually. – Andrew Marshall
[...] obj.dup in Ruby and benchmark too. – texasbruce
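The comments on Question 1 point out that a[0:len(a)] has to look up len and call it on every execution, while a[:] does not. One way to see that extra work, sketched here for Python 3 (an assumption, since the benchmark used Python 2.7.2 and bytecode differs between versions), is to compile both expressions and inspect them. The helper names `looked_up_names` and `instruction_count` are made up for this illustration.

```python
import dis

def looked_up_names(expr):
    """Return the set of names the compiled expression resolves at run time."""
    code = compile(expr, "<expr>", "eval")
    return set(code.co_names)

def instruction_count(expr):
    """Return how many bytecode instructions the expression compiles to."""
    code = compile(expr, "<expr>", "eval")
    return len(list(dis.get_instructions(code)))

for expr in ("a[:]", "a[0:len(a)]"):
    # Show which names are resolved and how long the bytecode is.
    print(expr, sorted(looked_up_names(expr)), instruction_count(expr))
```

This only shows that the second form executes more instructions, including a name lookup and a function call. It does not measure how much time each of them costs.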