In the programming languages, the “correctness” of a program is in terms of “syntax” as well as “semantics”.
- Syntax: the form or structure of the expressions, statements, and program units
- Semantics: the meaning of the expressions, statements, and program units
The syntax related features are captured by Context Free Grammars i.e. Context-free grammars are often used to define the syntax of programming languages.
There is a level of correctness deeper than syntax (grammar).
https://www.cs.mcgill.ca/~cs520/2021/slides/9-semantic-name.pdf
https://www.inf.ed.ac.uk/teaching/courses/ct/17-18/slides/8-semantic.pdf
All these errors in the above picture are “Semantic errors”. None of them can be captured using CFGs.
Using CFG, we cannot capture some Context-sensitive aspects of the syntax of a language, such as checking that an item has been declared before use and that the use of the item is consistent with its declaration.
http://homepage.divms.uiowa.edu/~slonnegr/plf/Book/Chapter3.pdf
$\color{red}{\text{The following Errors can Not be captured using CFGs:}}$
About names:
1. Is $x$ a scalar, an array or a function?
2. Is $x$ declared? Are there names declared but not used?
3. Which declaration of $x$ does each use reference?
About types:
4. Is the expression $x*y+z$ type-consistent? (Type Checking)
5. In $a[ i , j ,k]$, does $’a’$ have three dimensions?
6. How many arguments does a function take? What about printf ?
About memory:
7. Where can $z$ be stored? (register, local, global heap, static)
8. Does $*p$ reference the result of a malloc()?
9. Do $p$ and $q$ refer to the same memory location?
10. Is an array access out of bound?
11. Where should a variable be stored (head, stack,…)
https://www.cs.purdue.edu/homes/xyzhang/spring11/notes/typecheck.pdf
https://www.ida.liu.se/~TDDB44/laboratories/instructions/lab4.html
Types and Declarations: Done in Semantic Analysis.
A large part of semantic analysis consists of tracking variable/function/type declarations and type checking.
In many languages, identifiers have to be declared before they’re used. As the compiler encounters a new declaration, it records the type information assigned to that identifier. Then, as it continues examining the rest of the program, it verifies that the type of an identifier is respected in terms of the operations being performed. For example, the type of the right side expression of an assignment statement should match the type of the left side, and the left side needs to be a properly declared and assignable identifier. The parameters of a function should match the arguments of a function call in both number and type. The language may require that identifiers be unique, thereby forbidding two global declarations from sharing the same name. Arithmetic operands will need to be of numeric—perhaps even the exact same type (no automatic inttodouble conversion, for instance). These are examples of the things checked in the semantic analysis phase.
https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/handouts/180%20Semantic%20Analysis.pdf
Static Semantic Errors:
https://stackoverflow.com/questions/40430578/static-semantics-meaning
https://www.d.umn.edu/~rmaclin/cs5641/Notes/Lecture12.pdf
https://personal.utdallas.edu/~cid021000/CS-4337_13F/slides/CS-4337_03_Chapter3.pdf
Classes of Errors:
https://www.d.umn.edu/~rmaclin/cs5641/Notes/Lecture12.pdf
A context-free grammar is a set of recursive rewriting rules (or productions) used to generate patterns of strings. Context-free grammars are often used to define the syntax of programming languages.
https://www.cs.rochester.edu/u/nelson/courses/csc_173/grammars/