// comment
printf("string %d ",++i++&&&i***a);
return(x?y:z)
Inserting whitespace into the code to separate the tokens:
// comment
printf ( "string %d " , ++ i ++ && & i * * * a ) ;
return ( x ? y : z )
Total number of tokens = 24 (16 from the printf statement + 8 from the return statement)
Number of distinct tokens = 18
How are && and & separated?
The compiler's lexical-analysis phase uses a greedy tokenization approach, also called maximal munch: it always matches the longest pattern available.
For example, C has +, but it also has ++. Hence, whenever the lexer sees ++, it counts it as a single token.
Now it is easy to count the tokens in &&&. & is the bitwise AND in C while && is the logical AND, and the lexer takes the longest pattern that matches. Hence &&& yields 2 tokens: && followed by &.
We have * in C. Is ** defined as an operator in C? No, so *** is 3 tokens: *, *, and *.
What is the difference between a lexeme and a token?
Suppose the lexical analyzer is fed the source program
return(x?y:z)
which yields the lexemes : return ( x ? y : z )
With corresponding tokens: <return> <(> <id,1> <?> <id,2> <:> <id,3> <)>. The lexeme is the actual character sequence matched in the source (e.g. x); the token is the class plus attribute the analyzer emits for it (e.g. <id,1>, where 1 is an index into the symbol table). Since return is a keyword, it gets its own token rather than an <id, index> pair.