0
votes

I recently had an interview question as follows: We are given a list of words, and we want to format them to maximize the # of carriage returns while keeping the # of letters per line within a range.

For example, we want a range of 5 - 10 (inclusive) letters per line, one solution is this:

hello (5)
cat (3)

Another is this:

hello cat (9) <- 1 for a space

So the first solution is better because we have 1 carriage return versus 0 in the second.

If a word does not fit, it must be placed on a new line. For example:

hello (5)
people (6)

Intuitively to me this seems like a greedy algorithm problem where we return as soon as we meet the minimum letter per line constraint. However, this seems too simple and I'm now starting to doubt myself, but I can't come up with a counter example where greedy is not the best.

1
Why do you have the minimum range (the 5 in 5-10) if you allow "cat" (three letters) to be in its own line? - justhalf
I didn't consider that... I assumed the last word is not under those constraints. But if it is, then greedy will probably not work... - DillPixel
I guess there should be no minimum in the range. Otherwise this case is insolvable: "bullfrog eats bullfrog". The "eats" won't be able to be paired up with any word without violating the maximum 10 characters. Unless the input is constrained to always have a solution given the range condition. - justhalf

1 Answers

1
votes

If the words are to be placed in the same order as they appear then a simple greedy approach shall be optimal because there is no reason why you wouldn't place the carriage returns as early as possible in the sequence.

If you are allowed to change the order of the words, then it is a more difficult problem and then following approach can be applied.

Sort the words in descending order of the number of letters.

Assign one line each to words of length >= 5.

For words of length < 5, it is a reverse multiple bins bin backing problem wherein:
The bins have minimum capacity 5 and maximum capacity 10.
You have to place the words in the bins such that the number of bins is maximized.

It is at least an NP complete problem but "I think" (more so because it was asked in an interview) a dynamic programming formulation can be thought of that solves it in pseudo-polynomial time (like a knapsack problem).

EDIT: IMO the greedy algorithm shall work in the case when the maximum capacity is at least equal to twice the minimum capacity, like in this case.