I need a faster way to find the swaps in an 8-11 character string, in the following manner: given a string 'STDILGNLYE', find all the single-letter swaps for the letters:
list_AA = ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M',
'F', 'P', 'S', 'T', 'W', 'Y', 'V']
i.e., for each position in the string, substitute the letter at that position with each letter in list_AA. The output would be:
ATDILGNLYE
RTDILGNLYE
NTDILGNLYE
...
SADILGNLYE
SRDILGNLYE
SNDILGNLYE
...
...
STDILGNLYV
For a total of 200 strings (20 substitutions at each of the 10 positions in the string), of which 10 are copies of the original. What I have so far:
def _create_swaps(original_str):
    list_peps = []
    for i in range(len(original_str)):
        for k in range(len(list_AA)):
            list_peps.append(_insert_aa(original_str, i, list_AA[k]))
    # remove original string
    return [i for i in list_peps if i != original_str]

def _insert_aa(string, index, aa):
    # replace the character at `index` with `aa`
    list_string_elements = list(string)
    del list_string_elements[index]
    list_string_elements.insert(index, aa)
    return "".join(list_string_elements)
Since this needs to be repeated ~10**6 times, it is the slowest step in a larger project. Is there a faster way of finding such swaps (by eliminating the "".join and insert steps, or by finding swaps on the fly)?
For reference:
             ncalls  tottime  percall  cumtime  percall filename:lineno(function)
          185275200  330.286    0.000  429.295    0.000 models.py:233(_insert_aa)
             975240  147.322    0.000  616.979    0.001 models.py:225(_create_swaps)
185280201/185280197   59.137    0.000   59.138    0.000 {method 'join' of 'str' objects}
          185275208   39.875    0.000   39.875    0.000 {method 'insert' of 'list' objects}
             975240   21.027    0.000   21.027    0.000 models.py:231(<listcomp>)
          186746064   18.516    0.000   18.516    0.000 {method 'append' of 'list' objects}
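One direction that might be worth profiling is building each new string by slicing, so no list is created, mutated and joined per substitution. The following is only a minimal sketch of that idea: the function name create_swaps_sliced and its alphabet parameter are made up for illustration, and whether it is actually faster on real data would need to be measured.

```python
list_AA = ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M',
           'F', 'P', 'S', 'T', 'W', 'Y', 'V']


def create_swaps_sliced(original_str, alphabet=list_AA):
    """Return all single-position substitutions, skipping the original string.

    Hypothetical helper: same output order as _create_swaps, but built
    with string slicing instead of list(), del, insert and join.
    """
    swaps = []
    for i, current in enumerate(original_str):
        # The prefix and suffix are computed once per position, not once per letter.
        prefix = original_str[:i]
        suffix = original_str[i + 1:]
        for aa in alphabet:
            # Substituting a letter with itself would reproduce the original.
            if aa != current:
                swaps.append(prefix + aa + suffix)
    return swaps
```

For 'STDILGNLYE' this gives 190 strings, because the letter already present is skipped at each of the 10 positions instead of being filtered out afterwards.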
_create_swaps, it returns all created strings except the original one. – Carlo Mazzaferro
map() ...see this article on loop efficiency...of course profiling is always better than theory though... – Albert Rothman