2
votes

I have a problem in understanding the node structure for the calculation of the complete longest increasing subsequence (lis) in the paper "Heaviest Increasing/Common Subsequence Problems" by Jacobson and Vo.

Here is the pseudo code from the paper:

Algorithm[1]

What is meant with

node is an auxiliary array that, for each element in L, contains a record of an element that precedes this element in an increasing subsequence. The function newnode() constructs such records and links them into a directed graph. At the end of the algorithm, we can search from the maximal element of L to recover an LIS of sigma.

? How would you implement this structure?

Do I have to construct a directed graph with all the elements of the sequence as vertices (plus a nil-vertex) and edges "\sigma_i -> s" and then search for the longest path starting at the maximal element of L (and ending at nil)? Isn't there a more effective way to get the complete lis?


My second question: Is this algorithm as fast as the algorithm described in wikipedia? If not: Can I modify the algorithm from wikipedia as well to calculate heaviest common subsequences as described in the paper?

1
Can someone include the picture? I don't have enough reputation.Popov Florino

1 Answers

0
votes

I'd implement the array with an array and the graph as a singly-linked list with structure sharing. To wit, in generic C++,

#include <algorithm>
#include <iostream>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
struct Node {
  explicit Node(const T& e) : element(e) {}

  T element;
  std::size_t index = 0;
  Node* previous = nullptr;
};

template <typename T, typename Compare>
std::vector<T> LongestIncreasingSubsequence(const std::vector<T>& elements,
                                            Compare compare) {
  if (elements.empty()) {
    return {};
  }
  std::vector<std::unique_ptr<Node<T>>> node_ownership;
  node_ownership.reserve(elements.size());
  std::vector<Node<T>*> tableau;
  for (const T& element : elements) {
    auto node = std::make_unique<Node<T>>(element);
    auto it = std::lower_bound(tableau.begin(), tableau.end(), node.get(),
                               [&](const Node<T>* a, const Node<T>* b) {
                                 return compare(a->element, b->element);
                               });
    if (it != tableau.begin()) {
      auto previous = it[-1];
      node->index = previous->index + 1;
      node->previous = previous;
    }
    if (it != tableau.end()) {
      *it = node.get();
    } else {
      tableau.push_back(node.get());
    }
    node_ownership.push_back(std::move(node));
  }
  Node<T>* longest = *std::max_element(
      tableau.begin(), tableau.end(),
      [](Node<T>* a, Node<T>* b) { return a->index < b->index; });
  std::vector<T> result(longest->index + 1);
  for (; longest != nullptr; longest = longest->previous) {
    result.at(longest->index) = longest->element;
  }
  return result;
}

int main() {
  for (int x : LongestIncreasingSubsequence(
           std::vector<int>{3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3},
           std::less<int>())) {
    std::cout << x << '\n';
  }
}

If you're fortunate enough to be working in a language with garbage collection, you can ignore the business with node_ownership and std::move.

Here's a Python version.

import bisect


def longest_increasing_subsequence(elements):
    elements = list(elements)
    if not elements:
        return []
    # Build the tableau
    tableau_elements = []
    tableau_indexes = []
    predecessors = []
    for i, element in enumerate(elements):
        j = bisect.bisect_left(tableau_elements, element)
        predecessors.append(tableau_indexes[j - 1] if j > 0 else None)
        if j < len(tableau_elements):
            tableau_elements[j] = element
            tableau_indexes[j] = i
        else:
            tableau_elements.append(element)
            tableau_indexes.append(i)
    # Find the subsequence lengths
    lengths = []
    for i, predecessor in enumerate(predecessors):
        lengths.append(1 if predecessor is None else lengths[predecessor] + 1)
    # Extract the best subsequence
    best = max(range(len(lengths)), key=lambda i: lengths[i])
    subsequence = []
    while best is not None:
        subsequence.append(elements[best])
        best = predecessors[best]
    subsequence.reverse()
    return subsequence


print(longest_increasing_subsequence([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]))