1,944 views
1 votes
1 votes
Given an input string of length $n$, find the maximum length of the substring containing maximum $k$ unique characters. For example, for "abbcdaadcd" and $k=2$ answer will be 4 for the substring "daad".

4 Answers

3 votes
3 votes
#include<stdio.h>
    
    
    void kunique(char s[], int k)
        {
        int i,j,n,m,u=0;
        int last=0,current=0,flag=0;
        int maxlength=0;
        for(i=0;i<n;i++)
    
            {
            for(j=0;j<i;j++)
                {
                if(s[i]==s[j]) //if in the time of 2nd for loop any element of 1st for loop matches  
                   continue;   // then skipping that iteration and continue with next value of j
                else{
                    if(u<(k-1)) //when both charecter is different or unique (u) and u less than maximum 
                        if(j<=(i-1)) //window size then increasing u value and for whole j loop iteration
                        flag=1;//if no charecter matches with a[j], only then u will increase
                    else
                    {
                      last=(i-1); //when u>=(k-1) i.e. unique charecter more than window size , then last charecter in the window is set
    
                      if((current-last)> maxlength)//if current window size is previous window size then we update maxlength
                      maxlength=(current-last);
    
                        for(m=current;m<last;m++)
                        {
                        	if(s[current]=s[m])//here we are updating current pointer i.e. 
                            current=m++;//previously current pointer poining to first charecter of "abcabcdeef"
                        	           //now it is pointing to first charecter of "bcdeef"
                        	
                        	u--;//number of unique charecter reduces by 1 , as window shifted here
                        	
                        }
                    }
                }
            if(flag==1)
            u++;
            printf("%s",s[current-last]);
        }
    int main()
        {
        char s[20]="abcabcdeef";
         int k=3;
        kunique(s[],k);
        return 0;
        }
edited by
2 votes
2 votes

O(kn), if k is constant

=O(n)

Algorithm working->

1. Move in the string from 0th index,. every time next character is encountered then increment index of A[] and add that character and increment the values of B[] (that is counting no. of repeated characters to same index value and distinct character to next index).

2.To find the max span for k unique character. After finding k distinct characters from starting, all values of B array are added to find length of the span and that value is compared with the maximum value returned by remaining array ( i.e. value returned by max_span(i+1,k,j)). Elements which we are considering in the same span have same span_no[] value so that we can add B[] array values for those indexes to find length of the span.

3. Give span_no[] array same value for the indexes which are in same span. now when we move to next span, add B[] values for the indexes which have same span_no[] values and move to next span. But here, add the last element of previous span in the next span ex - "aaaabbaccccc" for 2 unique characters, this string will break to 2 spans "aaaabba" and "accccc" so for next span we will assign span_no[] another (unique) value for all elements in that span.

Taking help of 3 arrays

A[] -> to store all unique characters in an array, array size is in worst case = number of characters in string ( i.e. if all consecutive elements are different if "abcdef" then array size will be 6)  if  "aaaabba" then  'a' before 'b' is different from 'a' after b so array will store  A[0]='a', A[1]='b', A[2]= 'a',

B[] -> number of times a distinct character present in the string , for "aaaabba" this will store -> B[0] =4 ,B[1] = 2,B[2] = 1

span_no[] ->   first span possible is "aaaabba",span_no[] will help to tell the elements which are present in current span, so span_no will store -> span_no[0]=0, span_no[1]=0, span_no[2] =0.

int len;            // number of unique characters

char A[n];

int B[n]; //<- initialize them to 0                               

int span_no[n], j=0;

int max_span(int i,int k, int j)    // i = current index  in string,  k = remaining distinct characters in this span, j= index position for contracted array
{
    if(i==0)
        {
            A[j]=string[i];
            B[j]++;
            return max_span(i+1,k-1,j);
            }
    else if (i==n)
        return 0;
    else if(string[i] == string[i-1])
        {
            A[j]=string[i];      B[j]=+1;
            return max_span(i+1,k,j);
        }
    else              //then search for the term which is equal to this character in current span in O(k)
    {
        ++j, B[j]+=1; A[j]=string[i];
        int temp_i =j-1,found=0; ++j;
        while((temp_i!=0)&&(span_no[ j ]==span_no[temp_i]))            // if that span contains same character earlier also then count this, element in the same span otherwise it will be taken as distinct element in that case if distinct character length is full (i.e. k==0), then add this element in next span
        {
            if(A[ j ]==A[temp_i])
            found=1;  --temp_i;
        }
        if(found==1 && k!=0)                                                        // if found then add to that span only
        {
            span_no[j]=span_no[j-1];
            max_span(i+1,k,j);
        }
        else if(!found && k==0)                                            //if not found and k is reached 0 then add to next span
            {
                temp_j=j-1; add=0;
                while(span_no[--temp_j]==span_no[j-1])    //count all characters in this span and move to next span
                {
                    add+=B[temp_j];
                }  
                span_no[j-1]+=1;                                //include last element also in the next span        
                span_no[j]=span_no[j-1];
                int u=max_span(i+1,len,j);  
                return max(add , u);
            }
            else                                            //if not found and distinct character count is remaining in that span then add to the current span only
                {
                    span_no[j]=span_no[j-1];
                    max_span(i+1,k-1,j);
                }
    }
}
edited by
0 votes
0 votes

Please see here for a similar but simpler one for detailed explanation. 

So, here we are interested in a substring which means all the interested characters must be continuous. So, this makes a dynamic programming problem formulation easy. Consider the array index $i$ and let $S[i-1[$ be the maximum length with $k$ unique characters  $i-1$. Also let $B[1..k]$ and $C[1..k]$ denote two circular queue of length $k$ containing the last $k$ unique characters and their respective starting indices. Now, 

$S[i] = \begin{cases} S[i-1] + 1 \text{if } B \text{ is not full or } S[i] \text{ in } B \\S[i-1] +1 - C[1] \text{ otherwise } \end{cases}$

We also have to do 

  1. For case 1: if(S[i] == S[i-1]) C[k] ++; where $k$ is the index of $S[i]$ in array $C$ and
  2. For case 2: dequeue(C); dequeue(B); enqueue (B, S[i]), enqueue (C, 1)
0 votes
0 votes
#include<stdio.h>
#define MAX_CHAR 26 /* Here, we have assumed string can only have characters from a-z, i.e. 26 characaters. */

int w[1000]; /* stores max substring length with atmost k unique characters at each element likely to be seen in future. */
int count[MAX_CHAR]; /* keeps count of several characters. */
void kUnique(char[], int, int); /* function that finds max length of substring with k unique characters. */

int main() {
	char a[1000];
	int i, n, k;
	scanf("%d", &n); /* enter length of string*/
	scanf("%s", a); /* enter string */
	scanf("%d", &k); /* enter no. of unique characters to be taken under consideration */
	kUnique(a, n, k);
	return 0;
}

void kUnique(char a[], int n, int k) {
	int i, j, cur, lastSeen, window, temp, msf;
	
	for(i=0; i<n; i++) {
		if(count[a[i]-'a']==0) {	
		    window++;
		}
		count[a[i]-'a']++;
	}
	if(window<k) {
		printf("Not enough unique characters"); 
		return; 
	}
	/* The code segment from line 44-53 will detect if no. of unique characters in the string is less than k. If yes, it will print ERROR msg and return. */
	
	for(i=0; i<n; i++) {
		for(j=0; j<MAX_CHAR; j++) { /* At beginning of every loop, we set every element of count to 0. */
			count[j] = 0;
		}
		window = 0; /* At the beginning of each loop , wet window to 0 because we have not seen any substring yet. */
		temp = k; /* Also, set temp to 0 because we have not seen any substring yet */
		for(j=i; temp>=0 && j<n; j++) {
			if(count[a[j]-'a']==0) { 
				temp--;
				if(temp==-1) {
				    break;
				} else {
					count[a[j]-'a']++;	
				}
			} else {
			    count[a[j]-'a']++;
			}
			window++;
		}
		w[i] = window;
	}
	/* Whenever, we see a new character not seen till now, the corresponding value of count will be zero. Then, we will simply decrement temp.
		Now, if temp becomes -1, then it means that we have seen k unique characters, We then simply break out from the loop. Also, window is used to keep track of length of substring.
		After coming out of the inner (jth) lop, we set w[i] = window where window is the length of the substring seen just now.*/
	
	msf = w[0];
	for(i=1; i<n; i++) {
		if(w[i]>msf) {
			msf = w[i];
		}
	}
    /* At the end, we basically search entire w array to find and print w[i]  where w[i] is the max value and denotes the length max substring with k unique characters */ 
	printf("%d", msf);

}

TIME COMPLEXITY

In worst case, for every a[i], we can search from a[i,.., n] to find sub string  with k unique characters. As a result, the time complexity will be O(n^2).

SPACE COMPLEXITY

The space complexity of above program is O(n+26) = O(n) as we use an extra array count and w (to keep track of length of max sub string with k unique characters). However, we can remove w and simply keep track of  max substring length with k unique characters seen till now. As a result, we will then need an extra space of O(26) or O(256) (if we consider that any of 256 ASCII characters may be present in the input string).

Related questions

1 votes
1 votes
1 answer
1
radha gogia asked Apr 10, 2016
2,032 views
According to me first we sort the array in O(nlogn) time and then in O(k) time , find the product , so total time complexity is O(nlogn) , so am I right or can it be done...
2 votes
2 votes
2 answers
2
Arjun asked Jul 3, 2016
2,977 views
Given an array of $n$ elements find the maximum continuous sum in it. For example consider the below array of $n=6$.23 4 -10 2 15 1Answer is 35.
1 votes
1 votes
0 answers
3
Arjun asked Jun 6, 2016
449 views
Write an object oriented code for representing boolean expressions and then a function for checking the equivalence of two boolean expressions.
0 votes
0 votes
0 answers
4
Arjun asked Jun 6, 2016
1,053 views
Given an arithmetic expression involving *, + only write an object oriented code for its representation and evaluation