Print Shortest Common Super-sequence (SCS)

Posted by on Mar 31, 2024

Given two strings "s1" and "s2", the task is to print the shortest string that has both "s1" and "s2" as subsequences.

If multiple shortest super-sequence exists, print any one of them.

Input: s1=abcd, s2=xycd Output: abxycdc

Input: s1=AGGTAB, s2=GXTXAYB Output: AGGXTXAYB


Solutions

Method 1: Memoization

The Memoization Technique is basically an extension to the recursive approach so that we can overcome the problem of calculating redundant cases and thus decrease time complexity.

We can see that in each recursive call only the value of "n" and "m" (length) changes, so we can store and reuse the result of a function(..n, m..) call using a "n * m" 2-D array.

The 2-D array will store a particular state (n, m) if we get it the first time.

Now, if we come across the same state (n, m) again, instead of calculating it in exponential complexity, we can directly return its result stored in the table in constant time.

Complexity

The time complexity of this solution is (n * m).

In addition, O(n * m) auxiliary space was used by the table.

Method 2: Recursion

This problem is a variation of the Shortest Common Super-sequence (SCS).

The idea is to generate all super-sequences of both given sequences and print the shortest among them.

Start from the last elements (n-1, m-1) of both subsequences:

If s1[n-1] == s2[m-1], this character will be common to both subsequences, so add it to the current super-sequence and check for the remaining (0...n-2, 0...m-2) characters.

If s1[n-1] != s2[m-1], then check recursively for (0...n-2, 0...m-1) and (0...n-1, 0...m-2) and return [s1.charAt(n - 1) or s2.charAt(m - 1)] + shortest sequence of these two. This +1 character is to consider non-matching character of non-picked sequence.

If any of the subsequences is exhausted (m ==0 or n ==0), add remaining characters of the other sequence to the super-sequences. This is the base condition.

Complexity

This solution is exponential O(2^(m+n)) in term of time complexity.

In addition, O(n) auxiliary stack space was used for the recursion stack.

Related


What is "Subset Sum Problem" ?

What is the "Rod Cutting Problem" ?

Equal subset sum partition problem

Minimum number of deletions and insertions

Longest Common Subsequence (LCS)

Count number of subsets with given sum

Unbounded knapsack (Recursion & Dynamic Programming)

Print Longest Common Subsequence (LCS)

Find length of the Longest Common Substring

Shortest Common Super-sequence (SCS)