|Terminals, non-terminal, syntax, semantics: some really naive questi email@example.com (Robert Myers) (2003-03-09)|
|Re: Terminals, non-terminal, syntax, semantics: some really naive firstname.lastname@example.org (Chris F Clark) (2003-03-14)|
|Re: Terminals, non-terminal, syntax, semantics: some really naive email@example.com (2003-03-14)|
|Re: Terminals, non-terminal, syntax, semantics: some really naive firstname.lastname@example.org (SLK Parsers) (2003-03-14)|
|Re: Terminals, non-terminal, syntax, semantics: some really naive email@example.com (2003-03-17)|
|From:||firstname.lastname@example.org (Aharon Robbins)|
|Date:||14 Mar 2003 11:17:51 -0500|
|Organization:||Pioneer Consulting, Ltd.|
|Posted-Date:||14 Mar 2003 11:17:51 EST|
|Originator:||email@example.com (Aharon Robbins)|
Robert Myers <firstname.lastname@example.org> wrote:
>I find myself in the position of needing to know more about compilers
>than I thought I would ever want to know.
Compiling has several conceptual phases, as follows:
1. Syntax analysis: Do the programs follow the grammatical rules of
the language? Consider English:
The fat lady sang.
This is syntactically valid, while "lady the fat sang" isn't.
2. Semantic analysis: Does the sentence have any meaning?
The airplane stirs the crockpot menacingly.
This is syntactically valid (nouns, verbs, adverbs all in the right
place) but semantically invalid, as it has no meaning.
3. Code generation: once a program is syntactically and semantically
valid, it's necessary to generate code (for a compiler; for an
interpreter, some internal representation of the program is executed).
Lexical analysis ("lexing", "scanning") is the process of grouping
input characters into "words" (known as "tokens" in the technical
jargon) of the language. For natural language, you do this process
essentially automatically, but if you've ever worked with a child
learning to read, you can see that it *is* a process.
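As a rough illustration of that grouping step, here is a minimal scanner sketch in Python. The token names and the regular expressions are assumptions chosen for the example (they are not taken from any particular language); the point is only that a run of characters such as "total" comes out as one token.

```python
import re

# Hypothetical token table: (token name, regular expression).
# These names and patterns are assumptions made for illustration.
TOKEN_SPEC = [
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("MULT", r"\*"),
    ("DIV", r"/"),
    ("MOD", r"%"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not one itself
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Group the characters of text into (token name, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for name, lexeme in tokenize("total + count;"):
        print(name, repr(lexeme))
```

A real scanner would also handle numbers, strings, comments, and keywords; this one knows just enough to show multi-character and single-character tokens side by side.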
Parsing is the syntactical analysis phase: are the incoming "words"
in the right order? Do they make sense? Parsing is done via
grammars, which specify the way sentences (statements, in programming
languages) are built up from tokens.
Terminals in grammars are the names for tokens. For example, you might
have a rule:
expression := IDENTIFIER operator IDENTIFIER SEMICOLON
Here, all the uppercase words are terminals or tokens; the scanner
returns something to the parser indicating it has seen the
given token, which may consist of a single character, or of multiple
characters.
You then need another rule (known in the jargon as a "production") that
describes what an operator is:
operator := PLUS | MINUS | MULT | DIV | MOD
The vertical bar means "or".
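To make the two rules concrete, here is a small hand-written recognizer sketch in Python, with one function per non-terminal. It assumes the scanner has already reduced the input to a list of token names; the function names and that input format are choices made for this example, not a standard interface.

```python
# The alternatives of the "operator" production.
OPERATORS = {"PLUS", "MINUS", "MULT", "DIV", "MOD"}


def expect(tokens, pos, name):
    """Check that the token at pos is the terminal `name`; return the next position."""
    if pos < len(tokens) and tokens[pos] == name:
        return pos + 1
    found = tokens[pos] if pos < len(tokens) else "end of input"
    raise SyntaxError(f"expected {name}, found {found}")


def parse_operator(tokens, pos):
    """operator := PLUS | MINUS | MULT | DIV | MOD"""
    if pos < len(tokens) and tokens[pos] in OPERATORS:
        return pos + 1
    found = tokens[pos] if pos < len(tokens) else "end of input"
    raise SyntaxError(f"expected an operator, found {found}")


def parse_expression(tokens):
    """expression := IDENTIFIER operator IDENTIFIER SEMICOLON"""
    pos = expect(tokens, 0, "IDENTIFIER")
    pos = parse_operator(tokens, pos)
    pos = expect(tokens, pos, "IDENTIFIER")
    pos = expect(tokens, pos, "SEMICOLON")
    if pos != len(tokens):
        raise SyntaxError("extra tokens after SEMICOLON")
    return True


if __name__ == "__main__":
    print(parse_expression(["IDENTIFIER", "PLUS", "IDENTIFIER", "SEMICOLON"]))
```

Tokens in the wrong order, such as an operator before the first identifier, raise SyntaxError, which is the programming-language analogue of "lady the fat sang".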
It is conventional that terminals be spelled in upper case, non-terminals
in lower (or mixed) case, but that is only convention. Non-terminals
are exactly that: they can expand into other sequences of terminals
and non-terminals.
At some point, the grammar must make it possible for the non-terminals
to expand into a final sequence of terminals.
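That expansion can be shown mechanically. The sketch below, in Python, stores the two productions from this post in a dictionary and replaces non-terminals until only terminals are left. The dictionary layout and the expand function are assumptions made for illustration, and the approach only works for a small, non-recursive grammar like this one; a recursive grammar would never finish expanding.

```python
from itertools import product

# Each non-terminal maps to a list of alternatives; each alternative is a
# sequence of symbols. Any symbol that is not a key here is a terminal.
GRAMMAR = {
    "expression": [["IDENTIFIER", "operator", "IDENTIFIER", "SEMICOLON"]],
    "operator": [["PLUS"], ["MINUS"], ["MULT"], ["DIV"], ["MOD"]],
}


def expand(symbol):
    """Return every terminal-only sequence that symbol can expand into."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal expands only to itself
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the choices.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([tok for part in combo for tok in part])
    return results


if __name__ == "__main__":
    for sentence in expand("expression"):
        print(" ".join(sentence))
```

Each printed line is a final sequence of terminals, one for each alternative of operator.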
I hope this helps, some. :-)
Aharon (Arnold) Robbins --- Pioneer Consulting Ltd. email@example.com
P.O. Box 354 Home Phone: +972 8 979-0381 Fax: +1 928 569 9018
Nof Ayalon Cell Phone: +972 51 297-545
D.N. Shimshon 99785 ISRAEL