1
votes

I have two lists 'start' and 'end'. They are of the same length (4 million each):

for i in xrange(0,len(start)):
      print start[i], end[i]

3000027 3000162
3000162 3000186
3000186 3000187
3000187 3005000
3005000 3005020
3005020 3005090
3007000 3007186
3007186 3009000
3009000 3009500
.......

My problem is that I want to iterate the two lists, starting at the same point, but but progressively iterate along the 'end list' until I find a value where the difference between 'start[i]' and 'end[i+x]' is greater than 1000.

enter image description here

I have made my best attempt at doing this where I use an endless loop to iterate the 'end list' until the difference with start exceeds 1000 and then start from that point and perform the same operation from there...

NOTE: old content omitted

Ultimately the output that I am looking for is (taking the illustrative figure above as an example):

print density
  [4, 2, 1 ...........]

Can anyone help me with this?

UPDATE

While the previous answer to this question does indeed work:

density=[]
i_s = 0
while i_s < len(start):
     i_e = i_s
     while i_e < len(end):
         if end[i_e] - start[i_s] > 1000:
             density.append(i_e - i_s + 1)
             i_s = i_e
             break
         i_e += 1
     i_s += 1


print sum(density)/float(len(density))
print max(density)
print min(density)

I am afraid that the code is extremely slow as I am updating the extension of 'i_e' by adding 1 to it with each iteration of the inner while loop... To solve this I wanted to create a 'counter' variable that will extend the 'i_e' variable dynamically. This will be done with recursion whereby the i_e variable will be increased exponentially up until the point where half the desired distance is reached and then will be exponentially decreased until the desired distance is reached.

Strategy illustration

enter image description here

My attempt at this is as follows:

I created a recursive a function to update a variable 'counter'

counter=1 ##### initialise counter with value of 1
def exponentially_increase_decrease(start, end, counter):
    distance=end-start
    if distance<=500:  ###500 is half the desired distance
        counter=exponentially_increase_decrease(start, end, counter*2)
    else:
        counter=-exponentially_increase_decrease(start, end, counter/2)
    print counter
    return counter

Calling the function within the original code:

density=[]
i_s = 0
while i_s < len(start):
     i_e = i_s
     while i_e < len(end):
         if end[i_e] - start[i_s] > 1000:
             density.append(i_e - i_s + 1)
             i_s = i_e
             break
         counter=counter=exponentially_increase_decrease(i_s, i_e, counter)
         i_e += counter
     i_s += 1

I get the following error:

(Printed thousands of times)

counter=exponentially_increase_decrease(start, end, counter*2)
RuntimeError: maximum recursion depth exceeded

I am not experienced with this kind of problem and am not sure if I am approaching it correctly... can anyone help?

2
Are the start, end lists both sorted (ordered)?Menglong Li
Why isn't everything indented after the while loop? I would think it needs to be.9Breaker
apologies the code is indented as it should be now and yes both lists are sortedwhich_command
I just don't get what you're trying to do. Why doesn't 4005090 in the end list compared to 300500 in the start list right 1 position behind meet the condition?dabadaba
What is the output you are looking for? I still don't get it.juanpa.arrivillaga

2 Answers

2
votes

This is one of the few cases where I find while loops to be more straight-forward, since i_e and i_s depend on each other. You could use two range iterators, and advance the former by how much you've consumed from the latter, but that seems overly complicated.

>>> start
[3000027, 3000162, 3000186, 3000187, 3005000, 3005020, 3007000, 3007186, 3009000]
>>> end
[3000162, 3000186, 3000187, 3005000, 3005020, 3005090, 3007186, 3009000, 3009500]
>>> i_s = 0
>>> while i_s < len(start):
...     i_e = i_s
...     while i_e < len(end):
...         if end[i_e] - start[i_s] > 1000:
...             print(i_e - i_s + 1)
...             i_s = i_e
...             break
...         i_e += 1
...     i_s += 1
...
4
3
1
0
votes

Not sure if I understood correctly... Is this what you are looking for?

MAX_DIFF = 1000
density = [0] * len(start)
for i in range(len(start)):
    for j in range(i, len(end)):
        density[i] += 1
        if end[i] - start[i] >= MAX_DIFF:
            break
print(density)