2.5k views

Which of the following features cannot be captured by context-free grammars?

1. Syntax of if-then-else statements

2. Syntax of recursive procedures

3. Whether a variable has been declared before its use

4. Variable names of arbitrary length

It will be C.

Since CFG's are used to show syntactic rules while designing compiler, and syntactic rules don't check for meaningful things such as if a variable has been declared before its use or not. Such things are meant to be handled by Semantic Analysis phase (requires power of a context sensitive grammar).

For D, a regular expression does not restrict the string length. Languages have restriction for variable name length for storing purpose like in symbol table.

For A, if then else is inherently ambiguous. But CFG can represent inherently ambiguous languages just that there are more than one parse trees possible for some strings.

edited by
+7
I Agree with this point .

But it implies that "C language" or " Fortran  language " is essentially a CSL language .
+14
See this question: https://gateoverflow.in/463/gate2008-51
To check variable is declared before use, we have to parse strings like $\left\{wcw\mid w \: \in \left(a\mid b\right)^* \right\}$, That can not be done using PDA.
+5

C is CSL or CFL ?

(For those who are wondering like me.)

0
Can you elaborate option d? How can CFG capture variable length of arbitary  name?
0
for option B, some examples for showing recursive style of cfg are...
E->E+E/E-E/E*E/(E) ,  E->id    or    S->aSb/bSa/SS/ε   or S->aSa/bSb/a/b/ε etc
0
For  this lang dpda available

{wcw∣w∈(a∣b)∗}. So why this "whether a variable has been declared before it's use " cannot be captured by context free grammar
0

Variable names of arbitrary length

Will lexical analyzer phase detect it as....too much long identfiers ?

+1

@jatin khachane 1 yes. Different compilers are there for different languages. Each one has its own set of rules apart from what we define in lex file. As compilers are different or 32/64 bit machine, it even has the capacity to detect maximum possible integer value that can be assigned to the variable, length of variable etc..

EDIT: Even there is a question what is the effect of using long identifiers in lexical analysis part. Answer to that is "slow compilation".

c will be answer bcz the limitation of syntax analyzer are

• it cannot determine if a token is valid,(that will be take care of by lexical nalyzer and valid token are identifers, special symbols, alphabets these are to be consider valid tokens)
• it cannot determine if a token is declared before it is being used,
• it cannot determine if a token is initialized before it is being used,
• it cannot determine if an operation performed on a token type is valid or not   so for that to handle these we go for semantic analysis phase
edited
0
@Arjun @rajan

They are asking what cant be designed using CFG. Ofcourse we use CFG for option (c).

I doubt about option A and D.

For A, which i inherently ambigious, I think Ullman book has procedure for making it unambigious.

I dont know how can you detect multi-length variable i.e (  ATR -> aS where ATR is single variable.)
+1
How we use CFG for C?

For D, a regular expression does not restrict the string length. Languages have restriction for variable name length for storing purpose like in symbol table.

For A, yes, if then else is inherently ambiguous. But CFG can represent inherently ambiguous languages just that there are more than one parse trees possible for some strings.
0
For C, even though we use semantic rules, they are auxillay to CFG. So, base is parsing for semantic rules. So, thats possible.

For D, we use lexical analysis. @Arjun, I didnt get your point here.

For A, if you have more than 1 parse tree means you interpreted the syntax in more than 1 way which means you failed.
0
Can anyone explain option B ?
0

i think ww can be use for c where w is (a+b)*

0

if then else is inherently ambiguous???? doubt

@Arjun sir according to ULMAN we have an unambiguous grammar for dangling if then else , so how is it inherently ambiguous language?

In all programming languages with if-then-else statements of this form, the second parse tree is preferred. Hence the general rule is: match each else with the previous closest then. This disambiguating rule can be incorporated directly into a grammar by using the following observations.

• A statement appearing between a then and a else must be matched. (Otherwise there will be an ambiguity.)
• Thus statements must split into kinds: matched and unmatched.
• matched statement is
• either an if-then-else statement containing no unmatched statements
• or any statement which is not an if-then-else statement and not an if-then statement.
• Then an unmatched statement is
• an if-then statement (with no else-part)
• an if-then-else statement where unmatched statements are allowed in the else-part (but not in the then-part).

 stmt matched-stmt | unmatched-stmt matched-stmt if expr then matched-stmt else matched-stmt matched-stmt non-alternative-stmt unmatched-stmt if expr then stmt unmatched-stmt if expr then matched-stmt else unmatched-stmt

sir resolve the doubt

1
2