Watch it ...

The Gateway to Computer Science Excellence

+14 votes

Give an optimal algorithm in pseudo-code for sorting a sequence of $n$ numbers which has only $k$ distinct numbers ($k$ is not known a Priori). Give a brief analysis for the time-complexity of your algorithm.

+14 votes

+8

" *Counting sort* is a *sorting* technique based on keys between a specific range. It works by *counting* the number of objects having distinct key values (kind of *hashing*)." source GFG

also counting sort needs key values within some range . This assumption is not given in the question !!

But somehow i think O(n+k) is time complexity of correct algorithm .

+1

I think by saying O(n+k) he means ... 'k' is the maximum value.... but here already one k is there.. So

counting sort takes O(n+m)....where m is the maximum value..... but still I do not know whats the answer.

counting sort takes O(n+m)....where m is the maximum value..... but still I do not know whats the answer.

+1

As here k is given in question, i finally go with counting sort with O(n+k) time complexity.

As the definition of counting sort says "Counting sort is an algorithm that takes an array A of n elements in the range {1, 2, ..., k} and sorts the array in O(n + k) time " reference MIT slide .

An extra array ( auxiliary array with k elements) maintain by counting sort .

As the definition of counting sort says "Counting sort is an algorithm that takes an array A of n elements in the range {1, 2, ..., k} and sorts the array in O(n + k) time " reference MIT slide .

An extra array ( auxiliary array with k elements) maintain by counting sort .

+1

Yes..Now I agree with the answer....but here in qsn k means something else. So should not a different variable be used like m instead of k? @Bikram sir.

+1

@ashwan

No , here k means different kind of elements among n elements . Also counting sort definition says the same "Counting sort is an algorithm that takes an array A of n elements in the range {1, 2, ..., k} " here range means different type of elements like 1,2,3,..,k ..k different elements .

So answer is O(n+k)

No , here k means different kind of elements among n elements . Also counting sort definition says the same "Counting sort is an algorithm that takes an array A of n elements in the range {1, 2, ..., k} " here range means different type of elements like 1,2,3,..,k ..k different elements .

So answer is O(n+k)

+2

@bikram sir, So this is not a possible input sequence here? 5,100,1000,5,100,10000 where k=4 distinct elements but maximum value=10000

+1

@ahwan

Yes, this is possible sequence.

This question indirectly ask about counting sort..see the definition of counting sort an this question, you will understand, you may read that MIT slide..google it

Yes, this is possible sequence.

This question indirectly ask about counting sort..see the definition of counting sort an this question, you will understand, you may read that MIT slide..google it

+1

@Bikram sir.I don't agree with the selected answer.Here k is not the range so how can we use counting sort?

If my array is :- 1,2,3,1000.

Now my k=4 as 4 distinct numbers but maximum number is 1000. K can be 1 or it can be n(total elements),so it is a general case ,array can have 1 distinct element or it can have n distinct elements and it has nothing to do with range.So i will suggest answer as quicksort here as it is the most practical used algorithm

If my array is :- 1,2,3,1000.

Now my k=4 as 4 distinct numbers but maximum number is 1000. K can be 1 or it can be n(total elements),so it is a general case ,array can have 1 distinct element or it can have n distinct elements and it has nothing to do with range.So i will suggest answer as quicksort here as it is the most practical used algorithm

0

@ahwan

1 to k here index number of the array. That may be array element or maynot. That is not told in question

1 to k here index number of the array. That may be array element or maynot. That is not told in question

+1

Counting sort operate on k empty lists

but in this question asking about only k elements , then how counting sort will be answer?

Ref here

+1

counting sort is applicable when we already know the range of the values of those elements which we have to sort but given question is not giving any information related to it . Then how counting sort is applicable here. anyone please explain.

0

last part of counting sort not clear to me. Can u explain me once how do we reducing counting sort array again from $n$ elements to $k$ elements?

Why range is important? If we know the value of $n$ and $k$, we can apply counting sort here. Isit not??

0

can this not be done in o(klogk + n) time, using perfect hashing and any comparison sort algorithm? We hash all the n numbers, storing count of the numbers during hashing. Then sorting only the k distinct numbers using any comparison sort algorithm, after getting the sorted array, then recreating the whole sorted array from the hash using the count stored.

0

We need to know the range in counting sort. Reference: https://stackoverflow.com/questions/45489527/what-is-k-in-counting-sort-onk-time-complexity

I think we can use your approach if we know the values of these 'k' distinct elements in prior.

As in the question they have given 'k' is not known prior, I assume that the values of these 'k' distinct elements is also not known in prior. I think the best approach here would be to use any comparison based sort that gives complexity as O(n log n).

0

@rohith1001, no we don't need to know the k distinct elements in prior, that's why we are using perfect hashing.

+3 votes

+1

Even if we will use 'n' slots in hashing..... that mod n can send all value to one slot...irrespective of what is the value of k... So hashing should not be used @Bikram Sir.

0

yes correct. hashing can not be used here.

Size for hash Table is n , it depends upon slots.

O(n) is the answer here .

Here k<= n . worst case k = n possible.

Size for hash Table is n , it depends upon slots.

O(n) is the answer here .

Here k<= n . worst case k = n possible.

+1

@Bikrm Sir, how it can be done in O(n) ? No comparison based sorting can perform better than O(nlogn)..

Only counting, radix, bucket can do that..

But they are used for special cases .

For example counting can be used when the elements are in small range only.

So answer is O(n+m) where m is the maximum value.

So how it can be done in O(n)?

Please help.

Only counting, radix, bucket can do that..

But they are used for special cases .

For example counting can be used when the elements are in small range only.

So answer is O(n+m) where m is the maximum value.

So how it can be done in O(n)?

Please help.

0

yes, no comparison based sorting can perform better than O(nlogn).. but here in worst case you have to check all n elements right ? that's why O(n)

and in your answer from where ' m ' comes ? it is not mention in question.

say numbers are in this sequence given 1,3,6,3,6 then n=5 and k =3 so what is 'm 'here ? do you mean m= 6 here ?

and in your answer from where ' m ' comes ? it is not mention in question.

say numbers are in this sequence given 1,3,6,3,6 then n=5 and k =3 so what is 'm 'here ? do you mean m= 6 here ?

+1 vote

We can write code like this

int distinctValue(int a[ ] , int n) { int k=0,value[100],distinct[100]; for(i=0 ; i<10 ; i++) { if(max<value[i]) max=value[i]; } for(i=0; i<max; i++) { distinct[value[i]]=value[i]; } for(i=0; i<max; i++) { distinct[i]=distinct[i-1]+distinct[i]; } return 0; }

So, complexity will be $O\left ( n \right )$

+2

@ sreshta it is not sorted. there is a sequence of n numbers which has only k distinct numbers.

ex: 1,2,1,3,4,2,1,2,3, here n = 9 & k = 4

ex: 1,2,1,3,4,2,1,2,3, here n = 9 & k = 4

+1

Answer must be O(n+k) as for any k distinct numbers k always < and equal to ( in worst case ) n .

By using counting sort it gives O(n+k)

But elements are not sorted , we need to make a sorted sequence.

By using counting sort it gives O(n+k)

But elements are not sorted , we need to make a sorted sequence.

0

Please see this.

Here k is not the range.so how will the value of k affect the time complexity?

**Counting sort** is more efficient when we have single digits(otherwise it uses large space even for small values of n and the value of k in O(n+k) increases.)

For numbers with more than 1 digit we use radix sort..

the maximum value can also be n^3 ,like 01,01,27.So there are k distinct values`.Now the **maximum number of digits** is log m(base 10).[m is the maximum value in the sequence].

so using **radix sort** we could actually use log m(n+constant)=>n log m

now the time complexity depends on the highest number. if m=n^2 then it is O(n logn)

if m=2^n then it is O(n^2)

if m=n^n then it is O(n^2 log n)

**Then the time complexity depends on the highest value.**

So we could actually use **mergesort /****heapsor**t which gives O(n log n)every time.

0

Yes modification required here.

Can u tell me Counting sort in algo format, because when I was writing the code getting difficult in code.

Can u tell me Counting sort in algo format, because when I was writing the code getting difficult in code.

0

More than one digit case , also counting sort works. but it mainly works for repetition of same digits. right??

0

See this code snippet from CORMEN.

**O(n+k) is the time complexity .It takes O(k ) time to calculate the cumulative sum.(steps 7,8,9). **

It works for numbers having any number of digits and for any value of n.

But it is not very efficient when we consider numbers where the value of n is less but the value of k is large and the difference between the numbers is large .like {1,4,5,50,3125} here the size of C will be 1000 even though the number of terms is 5...

Then time complexity in the above case is (n+k)here k=n^5 hence the time complexity of counting sort would be O(n^5).which is large.

**Counterexample** :{2,2,2,2,2,2,2,2,2,10000} here n=10,k=10000

Here 2 is repeated many times and only 1 time we will have 10000 but still, the size of C will be 10000 which is O(n^4)

and the time complexity will be O(n^4).

**Again counting sort works well when the values are near to each other.(or concentrated within a specified range and the range is small).**

52,345 questions

60,497 answers

201,862 comments

95,319 users