Finding number of subarrays whose sum equals `k`

Question

We would be given an array of integers and a value k. We need to find the total number of sub-arrays whose sum equals k.

I found some interesting code online (on Leetcode) which is as follows:

public class Solution {
    public int subarraySum(int[] nums, int k) {
        int sum = 0, result = 0;
        Map<Integer, Integer> preSum = new HashMap<>();
        preSum.put(0, 1);

        for (int i = 0; i < nums.length; i++) {
            sum += nums[i];
            if (preSum.containsKey(sum - k)) {
                result += preSum.get(sum - k);
            }
            preSum.put(sum, preSum.getOrDefault(sum, 0) + 1);
        }

        return result;
    }
}

To understand it, I walked through some specific examples like [1,1,1,1,1] with k=3 and [1,2,3,0,3,2,6] with k=6. While the code works perfectly in both the cases, I fail to follow how it actually computes the output.

I have two specific points of confusion:

1) Why does the code continuously add the values in the array, without ever zeroing it out? For example, in case of [1,1,1,1,1] with k=3, once sum=3, don't we need to reset sum to zero? Doesn't not resetting sum interfere with finding later subarrays?

2) Shouldn't we simply do result++ when we find a subarray of sum k? Why do we add preSum.get(sum-k) instead?

River River · Accepted Answer · 2017-07-12T21:50:49

Let's handle your first point of confusion first:

The reason the code keeps summing the array and doesn't reset sum is because we are saving the sum in preSum (previous sums) as we go. Then, any time we get to a point where sum-k is a previous sum (say at index i), we know that the sum between index i and our current index is exactly k.

For example, in the image below with i=2, and our current index equal to 4, we can see that since 9, the sum at our current index, minus 3, the sum at index i, is 6, the sum between indexes 2 and 4 (inclusive) is 6.

Another way to think about this is to see that discarding [1,2] from the array (at our current index of 4) gives us a subarray of sum 6, for similar reasons as above (see image for details).

Using this method of thinking, we can say we want to discard from the front of the array until we are left with a subarray of sum k. We could do this by saying, for each index, "discard just 1, then discard 1+2, then discard 1+2+3, etc" (these numbers are from our example) until we found a subarray of sum k (k=6 in our example).

That gives a perfectly valid solution, but notice we would be doing this at every index of our array, and thus summing the same numbers over and over. A way to save computation would be to save these sums for later use. Even better, we already sum these same numbers to get our current sum, so we can just save that total as we go.

To find a subarray, we can just look through our saved sums, subtracting them and testing if what we are left with is k. It is a bit annoying to have to subtract every saved sum, so we can use the commutativity of subtraction to see that if sum-x=k is true, sum-k=x is also true. This way we can just see if x is a saved sum, and, if it is, know we have found a subarray of size k. A hash map makes this lookup efficient.

Now for your second point of confusion:

Most of the time you are right, upon finding an appropriate subarray we could just do result++. Almost always, the values in preSum will be 1, so result+=preSum.get(sum-k) will be equivalent to result+=1, or result++.

The only time it isn't is when preSum.put is called on a sum that has been reached before. How can we get back to a sum we already had? The only way is with either negative numbers, which cancel out previous numbers, or with zero, which doesn't affect the sum at all.

Basically, we get back to a previous sum when a subarray's sum is equal to 0. Two examples of such subarrays are [2,-2] or the trivial [0]. With such a subarray, when we find a later, adjoining subarray with sum k, we need to add more than 1 to result as we have found more than one new subarray, one with the zero-sum subarray (sum=k+0) and one without it (sum=k).

This is the reason for that +1 in the preSum.put as well. Every time we reach the same sum again, we have found another zero-sum subarray. With two zero-sum subarrays, finding a new adjoining subarray with sum=k actually gives 3 subarrays: the new subarray (sum=k), the new subarray plus the first zero-sum (sum=k+0), and the original with both zero-sums (sum=k+0+0). This logic holds for higher numbers of zero-sum subarrays as well.

Finding number of subarrays whose sum equals `k`

I have two specific points of confusion:

1 Answers

Let's handle your first point of confusion first:

Now for your second point of confusion: