Wednesday, 30 December 2020

Finding Longest Palindrome in O(n) Using Manacher's Algorithm

You can practice the problem here.

Given a string, your task is to determine the longest palindromic substring of the string. For example, the longest palindrome in aybabtu is bab.

A palindrome here is a string which reads the same backward as forward. We can use functions like reverse to check if the given string is a palindrome or not.

bool is_palindrome(string s) {
    string t = s;
    reverse(t.begin(), t.end());
    return s == t;
}

Or we can use two pointers.

bool is_palindrome(string s) {
    int l = 0, r = (int) s.size() - 1;
    while(l < r) {
        if(s[l] != s[r]) return false;
        else l++, r--;
    }
    return true;
}

The first approach is brute force.

int n = (int) s.size(), start = 0, mx_len = 1;
REP(i, n) {
    REP(j, i) {
        int ok = 1;
        REP(k, (j - i + 1) / 2) {
            if(s[i + k] != s[j - k]) ok = 0;
        }
        if(ok && (j - i + 1) > mx_len) {
            start = i;
            mx_len = j - i + 1;
        }
    }
}
OUT(s.substr(start, mx_len));

However, the time complexity for this solution is O(n ^ 3) because we need three nested loops to find the longest palindromic substring. It only works when n is really small.

The second approach is dynamic programming. The time complexity can be further reduced to O(n ^ 2).

We use dp[i][j] to indicate if s[i] .. s[j] is a palindrome or not. Let's think about the transitions.

  1. If i == j, that means s[i] == s[j], a single character is a palindrome. Example: a.
  2. If i + 1 == j and s[i] == s[j], then s[i] .. s[j] is a palindrome. Example: aa.
  3. If dp[i + 1][j - 1] and s[i] == s[j], then s[i] .. s[j] is a palindrome. Example: abba.

As we can see, dp[i + 1][j] needs to be calculated before d[i][j]. Therefore, we iterate i from n - 1 to 0 and j from i + 1 to n.

int n = (int) s.size();
vvi dp(n, vi(n, 0));
string ans;
int start = 0, len = 1;
REP(i, n) dp[i][i] = 1;
FORD(i, n - 1, 0) {
  FOR(j, i + 1, n) {
    if(s[i] == s[j]) {
      if(i + 1 == j || dp[i + 1][j - 1]) {
        dp[i][j] = 1;
        if(len < j - i + 1) {
          start = i;
          len = j - i + 1;
        }
      }
    }
  }
}
OUT(s.substr(start, len));

With dynamic programming, the time cplexity and auxiliary space are O(n ^ 2). However, we can solve the problem in linear time using Manacher's algorithm.

As a palindrome has a symmetric property at the center position, it could help us to reduce some unnecessary computations. If there is a palindrome of length N centered at position P, we can avoid the comparisions after position P as we already calculated longest palindromic substring at position before P. However, an even palidrome has two centers, which makes the calculate a little bit different than the one calculating for an odd palindrome.

Here's the implementation in C++.

string manacher(string s) {
  int n = (int) s.size();
  // d1[i]: the number of palindromes accordingly with odd lengths with centers in the position i.
  // d2[i]: the number of palindromes accordingly with even lengths with centers in the position i. 
  vector<int> d1(n), d2(n);
  int l1 = 0, r1 = -1, l2 = 0, r2 = -1, mx_len = 0, start = 0;
  for (int i = 0; i < n; i++) {
      // ----------------------
      // calculate d1[i]
      // ----------------------
      int k = (i > r1) ? 1 : min(d1[l1 + r1 - i], r1 - i + 1);
      while (0 <= i - k && i + k < n && s[i - k] == s[i + k]) k++;
      d1[i] = k--;
      if (i + k > r1) l1 = i - k, r1 = i + k;
      if(d1[i] * 2 > mx_len) start = i - k, mx_len = d1[i] * 2 - 1;
      // ----------------------
      // calculate d2[i] 
      // ----------------------
      k = (i > r2) ? 0 : min(d2[l2 + r2 - i + 1], r2 - i + 1);
      while (0 <= i - k - 1 && i + k < n && s[i - k - 1] == s[i + k]) k++;
      d2[i] = k--;
      if (i + k > r2) l2 = i - k - 1, r2 = i + k;
      if(d2[i] * 2 > mx_len) start = i - k - 1, mx_len = d2[i] * 2;
  }
  // return the longest palindrome
  return s.substr(start, mx_len);
}

If you want to count how many palindromic substrings in the given string, simply sum d1[i] and d2[i] where i = 0 .. n - 1.

int cnt = 0;
for(int i = 0; i < n; i++) cnt += d1[i] + d2[i];

This problem can also be solved using fast LCA in O(n) or String Hashing O(nlogn). These methods will not be discussed in this post.

You can find the whole solution here.

No comments:

Post a Comment

A Fun Problem - Math

# Problem Statement JATC's math teacher always gives the class some interesting math problems so that they don't get bored. Today t...