// comment
printf("string %d ",++i++&&&i***a);
return(x?y:z)
Inserting whitespace into the code to separate the tokens:
// comment
printf ( "string %d " , ++ i ++ && & i * * * a ) ;
return ( x ? y : z )
Total number of tokens = 24 (16 from the printf statement + 8 from the return statement)
Number of distinct tokens = 18
How are && and & separated?
The compiler's lexical-analysis phase uses a greedy tokenization approach, also called maximal munch: it always matches the longest pattern available.
For example, C has +, but it also has ++. Hence, whenever the lexer sees ++, it counts it as a single token.
Now it is easy to count the tokens in &&&. & is the bitwise AND in C while && is the logical AND, and the lexer takes the longest pattern that matches. Hence &&& yields 2 tokens: && followed by &.
We have * in C. Is ** defined as an operator in C? No, so *** is 3 tokens: *, *, and *.
What is the difference between a lexeme and a token?
Suppose the lexical analyzer is fed the source program
return(x?y:z)
which yields the lexemes : return ( x ? y : z )
With corresponding tokens: <return> <(> <id,1> <?> <id,2> <:> <id,3> <)>. The lexeme is the actual character sequence matched in the source (e.g. x); the token is the class plus attribute the analyzer emits for it (e.g. <id,1>, where 1 is an index into the symbol table). Since return is a keyword, it gets its own token rather than an <id, index> pair.