0
votes

I recently came across a solution for the subarray sum problem with two pointers and I'm not really sure about the correctness of the algorithm. The subarray sum problem consists of finding a subarray whose sum is equal to some target value. I've seen solutions of this problem using a hash table but I question the correctness of this other solution. Let me give you guys an example of the algorithm.

Suppose you have an array [1,3,2,5,1,1,2,3] and a target value x = 8.

Use left and right pointers that indicate the beginning and end of the subarray. On each step the left pointer moves one step forward and the right pointer moves forward as long as the subarray sum is at most x. This is how the algorithm will progress for the above array.

[1,3,2,5,1,1,2,3]
l    r

[1,3,2,5,1,1,2,3] ##The right pointer doesn't move because the next element would make the sum larger than x
   l r

[1,3,2,5,1,1,2,3]
     l   r

The sum is now equal to the target and the algorithm ceases. To me the algorithm doesn't seem to be valid for edge cases. But it looks right for every test case i give it and it runs O(n). Can someone give me a proof of correctness on this?

Thanks

edit - For the sake of argument assume positive integers.

2
If you are interested in proof of correctness. please let the community know what have you tried. Asking for whole proof of correctness sounds broad question. - Shridhar R Kulkarni
To me the algorithm doesn't seem to be valid for edge cases. What are those edge cases? - Shridhar R Kulkarni

2 Answers

1
votes

Let LEN(i) be the length of the longest subarray starting at index i with sum at most X

If there is a subarray with sum X that starts at index i, then LEN(i) will be the length of such a subarray (there may be multiple such arrays with trailing 0s). Since there are only positive integers in the array, all longer arrays starting at i will have a greater sum, and all shorter ones will have an equal or lesser sum.

So all we need to do is find LEN(i) for each index and the sum of the corresponding subarray. If one of those sums is X then you have the answer.

Consider the LEN(i) subarray for any index. If it's non-empty and we remove the first item, then the resulting subarray, starting at i+1, will have a lesser of equal sum. Therefore LEN(i+1) >= max(LEN(i-1),0).

We can rewrite that as LEN(i+1) >= max(LEN(i),1)-1, and this is the fact that the two pointer algorithm uses to achieve O(n) time.

We start by setting the left and right pointers to the start and end of the max(LEN(0),1) subarray at 0, and check its sum. Then we move the left pointer up and we know from the above equation that right pointer can only move forward, so we move it out to the end of the max(LEN(1),1) subarray, and check its sum.

We proceed to the end of array checking the sum of the corresponding subarray starting at every index.

0
votes

The two pointer approach won't work for this problem if negative integers are present in the array.

Counter example:

[-15, 2, 4, 8, -9, 5, 10, 23] and target value = 15

Considering only positive integers are being used.

Following points help us prove that the algorithm works:

  1. The algorithm terminates: The right pointer keeps moving right as long as the sub-array sum is not greater than target value.

a. If a sub-array with sum = target value is not found, the right pointer index becomes = array size and the algorithm terminates.

b. If a sub-array with sum = target value is found, we report it and the algorithm terminates.

  1. The algorithm does not miss to detect the sub-array with sum = target value, if present.

Let there be a sub-array [i, j] whose sum = target value. For the sake of contradiction, assume that our algorithm doesn't detect such a sub-array.

a. If that is the case then the right pointer either keeps moving to the right of index j, which is not possible because our algorithm doesn't allow the right pointer to move right if the sub-array sum > target value.

b. Other possible case is where right pointer stops at j, but left pointer exceeds i. This is not possible because at every increment of left pointer we check if sub-array sum = target value.

Hence, the proof by contradiction, we know that the algorithm will always detect the sub-array with sum = target value, if present in the array.

  1. It is trivial to prove that the algorithm doesn't report false positives, as the algorithm reports that sub-array is found only if sum = target value.

Hence, the proof. It is not a formal or mathematical proof but an intuitive one clear enough to prove the correctness.