Saturday 24 January 2015

SPOJ: DNA SEQUENCES

3408. DNA Sequences

Problem code: SAMER08D

Link to problem :DNA SEQUENCES

Algorithm: It's a standard DP problem longest common subsequence with constraints that subesquence must be formed from common segments of length k or more.
Hence ans[][] must be updated only when segment size is greater than or equal to k.

Tricky test case to discuss:

aggasdfa 
aggakasdfa
ans:8
possible WA: 5
Last 5 characters asdfa forms a common segment, but char a is also a part of segment agga . Hence instead of a segment of length 5, two segment of length 4 each should be considered.
Thus at i=8,j=10 when cont[i][j]=5 we should update lcs[i][j] considering z=4 and z=5, choosing the maximum of both.

Procedure:
1)apply standard lcs() algorithm.
2)update only when segment_length>=k and select the length of common segment for which lcs[i][j] is maximum.

CODE:

#include <cstdio>
#include<string.h>
#include<stdlib.h>
#define N 1009
#include<algorithm>
using namespace std;

int main()
   {
 char s[N], t[N];
    int lcs[N][N], cont[N][N];//lcs is answer matrix and cont is matrix for storing common segment length
     int k,z;
     scanf("%d", &k);
    while( k ){
        scanf("%s %s", s, t);
        int l1=strlen(s);
        int l2=strlen(t);
        cont[0][0] = 0;
        for(int i = 0; i < N; ++i) lcs[i][0] = lcs[0][i] = 0;
        for(int i = 1; i <= l1; ++i)
            for(int j = 1; j <= l2; ++j)
            {
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1]);
                if(s[i-1]==t[j-1])
                cont[i][j]=cont[i-1][j-1]+1;
                else 
                cont[i][j]=0;
                //update lcs matrix if segment size is greater than or equal to k
                if(cont[i][j] >= k)
                for(z= k; z <= cont[i][j]; z++)
                    lcs[i][j] = max(lcs[i][j], lcs[i - z][j - z] + z);
            }
        printf("%d\n",lcs[l1][l2]);
        scanf("%d",&k);
    }
    return 0;
}

6 comments:

  1. what is its time complexity?

    ReplyDelete
    Replies
    1. Time complexity means the amount of time taken by your program to execute...
      hope it helps
      Happy coding.....!

      Delete
  2. it will give TLE at worst case if n=1000 and k=100;

    ReplyDelete