I didn't find any of the existing analyses satisfying, so I am posting this one.
The carry generator circuit of a 4-bit CLA (which I have taken as an example) is shown below:
Below are my equations:
$P_i=A_i \oplus B_i$, $G_i=A_iB_i$
$C_{i+1}=G_i+P_iC_i$
And for the sum: $S_i=P_i \oplus C_i$
$C_0$ is the input carry, and $C_4$ is the carry out of the MSB sum bit $S_3$.
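As a sanity check on these equations, here is a minimal Python sketch that evaluates them bit by bit. The function name `cla_add` and its parameters are my own choices for illustration. Note that software evaluates the carry recurrence sequentially, so this only checks that the equations are functionally correct; it says nothing about gate delay.

```python
def cla_add(a, b, c0=0, n=4):
    """Add two n-bit integers using the P/G/C/S equations above.

    Returns (sum_bits_as_int, carry_out). Hypothetical helper for illustration.
    """
    A = [(a >> i) & 1 for i in range(n)]
    B = [(b >> i) & 1 for i in range(n)]
    P = [A[i] ^ B[i] for i in range(n)]      # P_i = A_i xor B_i
    G = [A[i] & B[i] for i in range(n)]      # G_i = A_i B_i
    C = [c0]
    for i in range(n):
        C.append(G[i] | (P[i] & C[i]))       # C_{i+1} = G_i + P_i C_i
    S = [P[i] ^ C[i] for i in range(n)]      # S_i = P_i xor C_i
    total = sum(S[i] << i for i in range(n))
    return total, C[n]


# Exhaustive check of the 4-bit case against ordinary integer addition.
for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            s, c4 = cla_add(a, b, c0)
            assert s + (c4 << 4) == a + b + c0
```

For example, `cla_add(9, 7)` gives sum bits `0` with carry out `1`, since 9 + 7 = 16 overflows four bits.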
My equations for the carry generation circuit are:
$C_1=G_0+P_0C_0$
$C_2=G_1+P_1G_0+P_1P_0C_0$
The circuits for them, using gates with a fan-in of 2, are below:
$C_3=G_2+P_2G_1+P_2P_1G_0+P_2P_1P_0C_0$
And its circuit is below:
Here the biggest product term is an AND of 4 literals, so I need $\log_2 4=2$ levels of AND gates; there are 4 terms to OR together, so similarly 2 levels of OR gates.
Now, you might say that this is 4 levels in total and that the total time looks like $\Theta(n)$, but before you say this, I ask you not to forget the carry $C_4$:
$C_4=G_3+P_3G_2+P_3P_2G_1+P_3P_2P_1G_0+P_3P_2P_1P_0C_0$
and its circuit diagram with gates of fan-in 2 is below:
Now, you see, it is a much bigger expression than $C_3$, and yet the total number of levels here is only 5. Why don't the levels grow linearly with the carry index?
The biggest product term in $C_4$ is an AND of 5 literals, which needs $\lceil \log_2 5 \rceil=3$ levels of AND gates, and the number of terms to be ORed together is 5, so 3 levels of OR gates.
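The counting above can be reproduced mechanically. The following is a rough Python sketch (the names `carry_terms` and `levels` are mine) that expands $C_i$ into its product terms from the recurrence $C_{i+1}=G_i+P_iC_i$ and counts the levels of a balanced tree of fan-in-2 gates for the widest AND and for the final OR.

```python
def carry_terms(i):
    """Product terms of C_i, each term a list of literal names."""
    terms = [["C0"]]
    for j in range(i):
        # C_{j+1} = G_j + P_j * C_j: add G_j, and AND P_j into every old term.
        terms = [[f"G{j}"]] + [[f"P{j}"] + t for t in terms]
    return terms


def levels(m, fan_in=2):
    """Levels of a balanced tree of fan_in-input gates combining m signals."""
    depth = 0
    while m > 1:
        m = -(-m // fan_in)   # ceiling division: signals left after one level
        depth += 1
    return depth


for i in range(1, 5):
    terms = carry_terms(i)
    widest = max(len(t) for t in terms)
    print(f"C{i} = " + " + ".join("".join(t) for t in terms))
    print(f"  AND levels: {levels(widest)}, OR levels: {levels(len(terms))}")
```

For $C_3$ this reports 2 AND levels and 2 OR levels, and for $C_4$ it reports 3 and 3, matching the counts in the text. These are counts for balanced trees only; the sketch does not model the actual gate placement in the diagrams.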
So, for an $n$-bit carry look-ahead adder, the biggest carry expression is that of $C_n$, and it has a total delay of about $\log_2 n \text{ (for AND gates)} + \log_2 n \text{ (for OR gates)} = 2\log_2 n$.
Okay, so my carry generation circuit takes $2\log_2 n$ time.
What about the total sum, that is, the sum bits?
The carry look-ahead generator would look like this:
So it is clear that the $P_i,G_i$ signals are available after only 1 propagation delay (1 level of gates), and once the carry bits are calculated, the sum bits are available after just one more propagation delay (1 level of gates).
So the total time to get the sum is $2\log_2 n+2$ time units.
So the operation time of a carry look-ahead adder designed using gates of fan-in 2 is $\Theta(\log_2 n)$.
Generalization: if the fan-in of the gates is $k$, carry generation takes $2\log_k n$ time, where $n$ is the number of bits in the input.
When $k=n$ (meaning there is no limit on the fan-in of the gates), carry generation takes $2\log_n n=2$ propagation delay units.
Hence, for gates of fan-in $k$, the total carry look-ahead adder time is $2\log_k n+2$ propagation delay units.
When there is no limit on fan-in, $k=n$, and in that case the carry look-ahead adder delay is $2\log_n n+2=4$ propagation delay units, independent of the number of bits in the input.
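To put numbers on the generalization, here is a small Python sketch of the formula $2\log_k n+2$. The helper names `ceil_log` and `cla_delay` are hypothetical, and I am assuming the logarithm is rounded up to a whole number of gate levels (the formula above leaves the rounding implicit) and that $k \ge 2$.

```python
def ceil_log(n, k):
    """Smallest d with k**d >= n, using integers only (assumes k >= 2)."""
    d, reach = 0, 1
    while reach < n:
        reach *= k
        d += 1
    return d


def cla_delay(n, k):
    """Estimated CLA delay in gate delays: 2*ceil(log_k n) + 2.

    One level for P_i/G_i, 2*ceil(log_k n) for the carries, one for the sum.
    """
    return 2 * ceil_log(n, k) + 2


for n in (4, 16, 64):
    print(n, cla_delay(n, 2), cla_delay(n, 4), cla_delay(n, n))
```

With this rounding, a 16-bit adder comes out at 10 gate delays for fan-in 2, 6 for fan-in 4, and 4 when the fan-in is unlimited ($k=n$), and the last figure stays at 4 for any $n$.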