Re: Terminals, non-terminal, syntax, semantics: some really naive questions

arnold@skeeve.com (Aharon Robbins)
14 Mar 2003 11:17:51 -0500

          From comp.compilers

From: arnold@skeeve.com (Aharon Robbins)
Newsgroups: comp.compilers
Date: 14 Mar 2003 11:17:51 -0500
Organization: Pioneer Consulting, Ltd.
References: 03-03-039
Keywords: parse
Posted-Date: 14 Mar 2003 11:17:51 EST
Originator: arnold@skeeve.com (Aharon Robbins)

Robert Myers <rmyers1400@attbi.com> wrote:
>I find myself in the position of needing to know more about compilers
>than I thought I would ever want to know.


Compiling has several conceptual phases, as follows:


1. Syntax analysis: Does the program follow the grammatical rules of
      the language? Consider English:


       The fat lady sang.


      This is syntactically valid, while "lady the fat sang" isn't.


2. Semantic analysis: Does the sentence have any meaning?


       The airplane stirs the crockpot menacingly.


      This is syntactically valid (nouns, verbs, adverbs all in the right
      place) but semantically invalid, as it has no meaning.


3. Code generation: Once a program is known to be syntactically and
      semantically valid, code must be generated for it (in a compiler;
      in an interpreter, some internal representation of the program is
      executed directly instead).
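
To make the first two phases concrete, here is a minimal sketch using
Python's own parser. The classify function and the sample strings are
made up for illustration. Note one assumption that matters: Python
defers type checking to run time, so the "semantic error" below only
shows up when the code is executed, whereas a compiler for a statically
typed language would report it before generating any code.

```python
import ast

def classify(source):
    """Report which stage rejects `source`, or 'ok' if none does."""
    try:
        tree = ast.parse(source)      # syntax analysis only, nothing runs
    except SyntaxError:
        return "syntax error"
    try:
        # Compile the parsed tree and run it in an empty namespace.
        exec(compile(tree, "<demo>", "exec"), {})
    except TypeError:
        # Grammatical, but the operation has no meaning for these types.
        return "semantic error"
    return "ok"

print(classify("x = 1 +"))              # words in the wrong order
print(classify("x = 1 + 'fat lady'"))   # right order, no meaning
print(classify("x = 1 + 2"))            # valid on both counts
```

The first string is the "lady the fat sang" case, the second is the
"airplane stirs the crockpot" case.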


Lexical analysis ("lexing", "scanning") is the process of grouping
input characters into "words" (known as "tokens" in the technical
jargon) of the language. For natural language, you do this process
essentially automatically, but if you've ever worked with a child
learning to read, you can see that it *is* a process.
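
As a rough illustration of scanning, here is a minimal sketch in
Python. The token names match the ones used in the grammar rules in
this post; everything else (the regular expressions, the tokenize
function, the choice that identifiers are letters, digits and
underscores) is an assumption made for the example, not taken from any
particular compiler.

```python
import re

# Hypothetical token definitions: (token name, regular expression).
TOKEN_SPEC = [
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PLUS",       r"\+"),
    ("MINUS",      r"-"),
    ("MULT",       r"\*"),
    ("DIV",        r"/"),
    ("MOD",        r"%"),
    ("SEMICOLON",  r";"),
    ("SKIP",       r"\s+"),   # whitespace separates tokens but is not one
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(text):
    """Group input characters into (token_name, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("total + count;"))
```

For the input "total + count;" this yields IDENTIFIER, PLUS,
IDENTIFIER, SEMICOLON, each paired with the characters that made it up.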


Parsing is the syntactical analysis phase: are the incoming "words"
in the right order? Do they form a valid sentence? Parsing is done
via grammars, which specify the way sentences (statements, in
programming languages) are built up from tokens.


Terminals in grammars are the names for tokens. For example, you might
have a rule:


expression := IDENTIFIER operator IDENTIFIER SEMICOLON


Here, all the uppercase words are terminals, i.e. tokens; the scanner
returns something to the parser indicating it has seen the given
token, which may consist of a single character, or of multiple
characters.


You then need another rule (known in the jargon as a "production") that
describes what an operator is:


operator := PLUS | MINUS | MULT | DIV | MOD


The vertical bar means "or".
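
One way to turn these two rules into a parser is to write one function
per non-terminal (the recursive descent style). The following is only a
sketch under that assumption: the function names and the choice to
represent the scanner's output as a plain list of token names are mine,
and real parsers usually build a tree rather than just answering yes
or no.

```python
OPERATORS = {"PLUS", "MINUS", "MULT", "DIV", "MOD"}

def expect(tokens, pos, terminal):
    """Match one terminal at tokens[pos]; return the next position."""
    if pos < len(tokens) and tokens[pos] == terminal:
        return pos + 1
    raise SyntaxError(f"expected {terminal} at token {pos}")

def parse_operator(tokens, pos):
    """operator := PLUS | MINUS | MULT | DIV | MOD"""
    if pos < len(tokens) and tokens[pos] in OPERATORS:
        return pos + 1
    raise SyntaxError(f"expected an operator at token {pos}")

def parse_expression(tokens):
    """expression := IDENTIFIER operator IDENTIFIER SEMICOLON"""
    pos = expect(tokens, 0, "IDENTIFIER")
    pos = parse_operator(tokens, pos)      # non-terminal: call its function
    pos = expect(tokens, pos, "IDENTIFIER")
    pos = expect(tokens, pos, "SEMICOLON")
    if pos != len(tokens):
        raise SyntaxError("extra tokens after the expression")
    return True

print(parse_expression(["IDENTIFIER", "PLUS", "IDENTIFIER", "SEMICOLON"]))
```

A token list such as IDENTIFIER IDENTIFIER SEMICOLON is rejected with a
SyntaxError, because no alternative of the operator rule matches the
second token.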


It is conventional that terminals be spelled in upper case, non-terminals
in lower (or mixed) case, but that is only convention. Non-terminals
are exactly that: they can expand into other sequences of terminals
and/or non-terminals.


At some point, the grammar must make it possible for the non-terminals
to expand into a final sequence of terminals.
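
The expansion can be carried out mechanically. Below is a small sketch
that stores the two rules from this post in a dictionary and keeps
replacing the leftmost non-terminal until only terminals are left. The
GRAMMAR table layout and the derive function are assumptions for the
sake of the example; by default it simply picks the first alternative
of each rule.

```python
# Each non-terminal maps to its list of alternatives (the "|" choices).
GRAMMAR = {
    "expression": [["IDENTIFIER", "operator", "IDENTIFIER", "SEMICOLON"]],
    "operator":   [["PLUS"], ["MINUS"], ["MULT"], ["DIV"], ["MOD"]],
}

def derive(symbols, choose=lambda alternatives: alternatives[0]):
    """Expand the leftmost non-terminal until only terminals remain.

    Returns every intermediate step, starting with `symbols` itself.
    """
    symbols = list(symbols)
    steps = [list(symbols)]
    while True:
        # A symbol is a non-terminal exactly when it has a rule.
        idx = next((i for i, s in enumerate(symbols) if s in GRAMMAR), None)
        if idx is None:
            return steps              # only terminals are left
        symbols[idx:idx + 1] = choose(GRAMMAR[symbols[idx]])
        steps.append(list(symbols))

for step in derive(["expression"]):
    print(" ".join(step))
```

Starting from expression, this goes through IDENTIFIER operator
IDENTIFIER SEMICOLON and ends at IDENTIFIER PLUS IDENTIFIER SEMICOLON.
If a rule could only ever expand into itself, the loop would never
finish, which is the practical face of the requirement above.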


I hope this helps, some. :-)
--
Aharon (Arnold) Robbins --- Pioneer Consulting Ltd. arnold@skeeve.com
P.O. Box 354 Home Phone: +972 8 979-0381 Fax: +1 928 569 9018
Nof Ayalon Cell Phone: +972 51 297-545
D.N. Shimshon 99785 ISRAEL

