0
votes

Given n strings. For q queries consisting of l and r, I should output the LCP (longest common prefix) for all pairs of strings in the sequence [l, r].Is there any data structure (Segment tree, Fenwick...) that can help with this? How can I precalculate anything here having in mind that both n and q are <= 10^5. The sum of all lengths of strings is <= 10^5?

Except the brute force solution I have no other ideas...

1

1 Answers

0
votes

Suppose you are given some strings. First, you need to sort them.

1. hello    |  1. broad
2. wi-fi    |  2. brother
3. brother  |  3. hell
4. sister   |  4. hello
5. broad    |  5. siam
6. siam     |  6. sial
7. sial     |  7. sister
8. sit      |  8. sit
9. hell     |  9. wi-fi

Then you create an array with indices to know position of every string in the sorted array.

index in orignal - 1 2 3 4 5 6 7 8 9
index in sorted  - 4 9 2 7 1 5 6 8 3

Now for every query, you need to find minimum and maximum indices in the sorted array. And based on this two indices find LCP. See examples.

Ex. 1

l,r = 1,5

Corresponding indices in the sorted arrays are 4 9 2 7 1. Min and max indices are 1 and 9. So words broad and wi-fi are in the query, and LCP is the empty string.

Ex. 2

l,r = 6,8

Corresponding indices in the sorted arrays are 5 6 8. Min and max indices are 5 and 8. So words siam and sit are in the query, and LCP is si.

Complexity

  1. Sort all the strings. To correctly estimate complexity here you need to properly model possible input.

  2. For every query you need to make linear time search for max and min indices and then find LCP for two strings.