0
votes

Given an array a[0..N-1] of integers between 0 < N < 10^5, find the size of the longest most repeated subarray in optimal time and space. I was thinking of using a HashMap to store the indexes where a number begins, and start checking for each index the possible subarrays it can form. But this seems inefficient, so I would like to hear some opinions of how I can approach this problem.

Example:

Input a = [0,1,2,1,4,1,2,1,0,5]
Expected output: Most Repeated: [1,2,1]; Size: 3

2
1) In which language? 2) You can't really ask for "optimal time and space", because that's typically the trade-off. 3) How should we prove it's "optimal"? 4) Is this an actual problem, or a puzzle (in the latter case, it should probably be in code-golf)? 5) What is your code so far? 6) What does "it seems inefficient" mean, if you have a working solution, why does it need improvement? - LWChris
1) Java. 2)I know, but the best possible relation between these two in this specific problem 3) Executing in the lowset possible complexity 4) It is an actual problem, but it´s not there 6) I just want to see if there is another way to approach this - Santiago Fajardo
One could say the expected output is Most Repeated: [1]; Size: 1. - lainatnavi
@lainatnavi, but that won't be the longest. - Harshal Parekh
@Harshal Parekh you're right, sorry, it says longest of course. - lainatnavi

2 Answers

3
votes

Such problems are known to have a naive approach - something that takes a lot of time and performs a brute force search for all the possible solutions.

Obviously, you are not looking for that. Whenever you are in such a situation, the answer is always Dynamic Programming. It is a fairly broad, difficult for beginners and very important in computer science. Which means, I cannot cover it in this answer.

But here is one approach on how you can solve this problem.

Since a common subarray of A and B must start at some A[i] and B[j], let dp[i][j] be the longest common prefix of A[i:] and B[j:]. Whenever A[i] == B[j], we know dp[i][j] = dp[i+1][j+1] + 1. Also, the answer is max(dp[i][j]) over all i, j.

We can perform bottom-up dynamic programming to find the answer based on this recurrence. Our loop invariant is that the answer is already calculated correctly and stored in dp for any larger i, j.

Hope this helps. Good luck.

0
votes

I believe, this problem is equivalent to finding the longest repeated substring, https://en.wikipedia.org/wiki/Longest_repeated_substring_problem. If overlapping substrings are allowed, the problem can be efficiently solved using a suffix tree in linear time. However, in this case it seems that identifying non-overlapping substrings is required, and this approach doesn't work. At least, it can be solved in quadratic time using dynamic programming, https://www.geeksforgeeks.org/longest-repeating-and-non-overlapping-substring/.