In Python remove()
will remove the first occurrence of value in a list.
How to remove all occurrences of a value from a list?
This is what I have in mind:
>>> remove_values_from_list([1, 2, 3, 4, 2, 2, 3], 2)
[1, 3, 4, 3]
All of the answers above (apart from Martin Andersson's) create a new list without the desired items, rather than removing the items from the original list.
>>> import random, timeit
>>> a = list(range(5)) * 1000
>>> random.shuffle(a)
>>> b = a
>>> print(b is a)
True
>>> b = [x for x in b if x != 0]
>>> print(b is a)
False
>>> b.count(0)
0
>>> a.count(0)
1000
>>> b = a
>>> b = filter(lambda a: a != 2, x)
>>> print(b is a)
False
This can be important if you have other references to the list hanging around.
To modify the list in place, use a method like this
>>> def removeall_inplace(x, l):
... for _ in xrange(l.count(x)):
... l.remove(x)
...
>>> removeall_inplace(0, b)
>>> b is a
True
>>> a.count(0)
0
As far as speed is concerned, results on my laptop are (all on a 5000 entry list with 1000 entries removed)
So the .remove loop is about 100x slower........ Hmmm, maybe a different approach is needed. The fastest I've found is using the list comprehension, but then replace the contents of the original list.
>>> def removeall_replace(x, l):
.... t = [y for y in l if y != x]
.... del l[:]
.... l.extend(t)
Numpy approach and timings against a list/array with 1.000.000 elements:
Timings:
In [10]: a.shape
Out[10]: (1000000,)
In [13]: len(lst)
Out[13]: 1000000
In [18]: %timeit a[a != 2]
100 loops, best of 3: 2.94 ms per loop
In [19]: %timeit [x for x in lst if x != 2]
10 loops, best of 3: 79.7 ms per loop
Conclusion: numpy is 27 times faster (on my notebook) compared to list comprehension approach
PS if you want to convert your regular Python list lst
to numpy array:
arr = np.array(lst)
Setup:
import numpy as np
a = np.random.randint(0, 1000, 10**6)
In [10]: a.shape
Out[10]: (1000000,)
In [12]: lst = a.tolist()
In [13]: len(lst)
Out[13]: 1000000
Check:
In [14]: a[a != 2].shape
Out[14]: (998949,)
In [15]: len([x for x in lst if x != 2])
Out[15]: 998949
At the cost of readability, I think this version is slightly faster as it doesn't force the while to reexamine the list, thus doing exactly the same work remove has to do anyway:
x = [1, 2, 3, 4, 2, 2, 3]
def remove_values_from_list(the_list, val):
for i in range(the_list.count(val)):
the_list.remove(val)
remove_values_from_list(x, 2)
print(x)
Let
>>> x = [1, 2, 3, 4, 2, 2, 3]
The simplest and efficient solution as already posted before is
>>> x[:] = [v for v in x if v != 2]
>>> x
[1, 3, 4, 3]
Another possibility which should use less memory but be slower is
>>> for i in range(len(x) - 1, -1, -1):
if x[i] == 2:
x.pop(i) # takes time ~ len(x) - i
>>> x
[1, 3, 4, 3]
Timing results for lists of length 1000 and 100000 with 10% matching entries: 0.16 vs 0.25 ms, and 23 vs 123 ms.
lists = [6.9,7,8.9,3,5,4.9,1,2.9,7,9,12.9,10.9,11,7]
def remove_values_from_list():
for list in lists:
if(list!=7):
print(list)
remove_values_from_list()
Result: 6.9 8.9 3 5 4.9 1 2.9 9 12.9 10.9 11
lists = [6.9,7,8.9,3,5,4.9,1,2.9,7,9,12.9,10.9,11,7]
def remove_values_from_list(remove):
for list in lists:
if(list!=remove):
print(list)
remove_values_from_list(7)
Result: 6.9 8.9 3 5 4.9 1 2.9 9 12.9 10.9 11
hello = ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
#chech every item for a match
for item in range(len(hello)-1):
if hello[item] == ' ':
#if there is a match, rebuild the list with the list before the item + the list after the item
hello = hello[:item] + hello [item + 1:]
print hello
['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
We can also do in-place remove all using either del
or pop
:
import random
def remove_values_from_list(lst, target):
if type(lst) != list:
return lst
i = 0
while i < len(lst):
if lst[i] == target:
lst.pop(i) # length decreased by 1 already
else:
i += 1
return lst
remove_values_from_list(None, 2)
remove_values_from_list([], 2)
remove_values_from_list([1, 2, 3, 4, 2, 2, 3], 2)
lst = remove_values_from_list([random.randrange(0, 10) for x in range(1000000)], 2)
print(len(lst))
Now for the efficiency:
In [21]: %timeit -n1 -r1 x = random.randrange(0,10)
1 loop, best of 1: 43.5 us per loop
In [22]: %timeit -n1 -r1 lst = [random.randrange(0, 10) for x in range(1000000)]
g1 loop, best of 1: 660 ms per loop
In [23]: %timeit -n1 -r1 lst = remove_values_from_list([random.randrange(0, 10) for x in range(1000000)]
...: , random.randrange(0,10))
1 loop, best of 1: 11.5 s per loop
In [27]: %timeit -n1 -r1 x = random.randrange(0,10); lst = [a for a in [random.randrange(0, 10) for x in
...: range(1000000)] if x != a]
1 loop, best of 1: 710 ms per loop
As we see that in-place version remove_values_from_list()
does not require any extra memory, but it does take so much more time to run:
No one has posted an optimal answer for time and space complexity, so I thought I would give it a shot. Here is a solution that removes all occurrences of a specific value without creating a new array and at an efficient time complexity. The drawback is that the elements do not maintain order.
Time complexity: O(n)
Additional space complexity: O(1)
def main():
test_case([1, 2, 3, 4, 2, 2, 3], 2) # [1, 3, 3, 4]
test_case([3, 3, 3], 3) # []
test_case([1, 1, 1], 3) # [1, 1, 1]
def test_case(test_val, remove_val):
remove_element_in_place(test_val, remove_val)
print(test_val)
def remove_element_in_place(my_list, remove_value):
length_my_list = len(my_list)
swap_idx = length_my_list - 1
for idx in range(length_my_list - 1, -1, -1):
if my_list[idx] == remove_value:
my_list[idx], my_list[swap_idx] = my_list[swap_idx], my_list[idx]
swap_idx -= 1
for pop_idx in range(length_my_list - swap_idx - 1):
my_list.pop() # O(1) operation
if __name__ == '__main__':
main()
About the speed!
import time
s_time = time.time()
print 'start'
a = range(100000000)
del a[:]
print 'finished in %0.2f' % (time.time() - s_time)
# start
# finished in 3.25
s_time = time.time()
print 'start'
a = range(100000000)
a = []
print 'finished in %0.2f' % (time.time() - s_time)
# start
# finished in 2.11