Subarray Sum Solution

Question

I recently came across a solution for the subarray sum problem with two pointers and I'm not really sure about the correctness of the algorithm. The subarray sum problem consists of finding a subarray whose sum is equal to some target value. I've seen solutions of this problem using a hash table but I question the correctness of this other solution. Let me give you guys an example of the algorithm.

Suppose you have an array [1,3,2,5,1,1,2,3] and a target value x = 8.

Use left and right pointers that indicate the beginning and end of the subarray. On each step the left pointer moves one step forward and the right pointer moves forward as long as the subarray sum is at most x. This is how the algorithm will progress for the above array.

[1,3,2,5,1,1,2,3]
 l   r

[1,3,2,5,1,1,2,3] ##The right pointer doesn't move because the next element would make the sum larger than x
   l r

[1,3,2,5,1,1,2,3]
     l   r

The sum is now equal to the target and the algorithm stops. To me the algorithm doesn't seem to be valid for edge cases, but it gives the right answer for every test case I give it and it runs in O(n). Can someone give me a proof of correctness for this?

Thanks

edit - For the sake of argument assume positive integers.

If you are interested in a proof of correctness, please let the community know what you have tried. Asking for a whole proof of correctness sounds like a broad question. — Shridhar R Kulkarni
"To me the algorithm doesn't seem to be valid for edge cases." What are those edge cases? — Shridhar R Kulkarni

Matt Timmermans · Accepted Answer · 2019-09-21T17:04:49

Let LEN(i) be the length of the longest subarray starting at index i with sum at most X.

If there is a subarray with sum X that starts at index i, then LEN(i) will be the length of such a subarray (if zeros were allowed, there could be several such subarrays differing only in trailing 0s, and LEN(i) would be the longest of them). Since there are only positive integers in the array, all longer subarrays starting at i will have a greater sum, and all shorter ones will have a lesser sum.

So all we need to do is find LEN(i) for each index and the sum of the corresponding subarray. If one of those sums is X then you have the answer.

Consider the LEN(i) subarray for any index. If it's non-empty and we remove the first item, then the resulting subarray, starting at i+1, has length LEN(i)-1 and a lesser or equal sum, so it still has sum at most X. If it's empty, the bound LEN(i+1) >= 0 holds trivially. Therefore LEN(i+1) >= max(LEN(i)-1, 0).

We can rewrite that as LEN(i+1) >= max(LEN(i),1)-1, and this is the fact that the two pointer algorithm uses to achieve O(n) time.
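The inequality can be checked by brute force on the array from the question. This is only a sketch for illustration: the helper name longest_len is made up here, and it computes LEN(i) directly from the definition, assuming positive integers.

```python
def longest_len(nums, i, x):
    """LEN(i): length of the longest subarray starting at i with sum <= x."""
    total = 0
    length = 0
    for value in nums[i:]:
        if total + value > x:
            break  # with positive integers, any longer subarray is too large
        total += value
        length += 1
    return length


nums = [1, 3, 2, 5, 1, 1, 2, 3]
x = 8
lens = [longest_len(nums, i, x) for i in range(len(nums))]
print(lens)  # [3, 2, 3, 3, 4, 3, 2, 1]

# LEN(i+1) >= max(LEN(i), 1) - 1 for every i
holds = all(lens[i + 1] >= max(lens[i], 1) - 1 for i in range(len(lens) - 1))

# Equivalently, the exclusive right end i + LEN(i) never moves backwards
right_ends = [i + lens[i] for i in range(len(lens))]
print(holds, right_ends)
```

Because the right end i + LEN(i) never decreases, the right pointer makes at most n forward steps in total.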

We start by setting the left and right pointers to the start and end of the max(LEN(0),1) subarray at 0, and check its sum. Then we move the left pointer up by one. We know from the above inequality that the right pointer never has to move backward, so we move it out to the end of the max(LEN(1),1) subarray, and check its sum.

We proceed to the end of array checking the sum of the corresponding subarray starting at every index.
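As a minimal sketch of the procedure described above (not necessarily the exact code the question refers to), here is one way to write it in Python. It assumes all elements are positive integers and the target is at least 1. The function name find_subarray_with_sum is hypothetical, and the window is kept as a half-open range nums[left:right] so that an empty window is easy to represent.

```python
def find_subarray_with_sum(nums, target):
    """Return (start, end) such that sum(nums[start:end]) == target, or None.

    Assumes every element of nums is a positive integer and target >= 1.
    """
    right = 0        # exclusive end of the current window
    window_sum = 0   # sum of nums[left:right]
    for left in range(len(nums)):
        # The window was empty at the previous index; restart it here.
        if right < left:
            right = left
            window_sum = 0
        # Extend the window while the sum stays at most target,
        # so that right - left == LEN(left).
        while right < len(nums) and window_sum + nums[right] <= target:
            window_sum += nums[right]
            right += 1
        if right > left and window_sum == target:
            return left, right
        # Drop nums[left] before the left pointer advances.
        if right > left:
            window_sum -= nums[left]
    return None


print(find_subarray_with_sum([1, 3, 2, 5, 1, 1, 2, 3], 8))  # (2, 5)
```

On the array from the question this returns (2, 5), i.e. the subarray [2, 5, 1], matching the last diagram. Each pointer only moves forward, so the total work is O(n).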
