Re: Simple Language design (farmersckn)
17 Apr 2002 23:19:22 -0400

          From comp.compilers


From: (farmersckn)
Newsgroups: comp.compilers
Date: 17 Apr 2002 23:19:22 -0400
References: 02-04-023
Keywords: lex
Posted-Date: 17 Apr 2002 23:19:22 EDT

Lexical analysis is simply taking the raw character input and
separating that input into tokens (i.e. keywords, numbers,
identifiers, and symbols such as ==).

Create a few functions to say whether a character is a letter, a
digit, or a symbol. You might create a 256-entry array that says
whether the ASCII code at that index is a digit, letter, or symbol.
Then create a loop that looks at a character, identifies it as a
letter, digit, or symbol, and then "builds" a token based on what kind
of character it is:

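char c
string s
while not eof (inputstream)
  // look at the next character without consuming it, so the
  // get* routines below still see the first character of the token
  c = peeknextchar (inputstream)
  if isletter(c) then
    s = getname (inputstream)
    if iskeyword(s) then
      addtokentolist(s, KEYWORD)
    else
      addtokentolist(s, IDENTIFIER)
    end if
  else if isnumber(c) then
    s = getnumber (inputstream)
    addtokentolist(s, NUMBER)
  else if issymbol (c) then
    s = getsymbol (inputstream)
    addtokentolist(s, SYMBOL)
  else
    // whitespace or anything unrecognized: consume it and move on
    skipchar (inputstream)
  end if
end while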
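
Since the original question mentions Java, here is a rough sketch of the
same loop in Java, using the 256-entry table idea for classifying
characters. Treat it as one possible shape, not a finished lexer: the
class name SimpleLexer, the keyword set, the list of symbol characters,
and the "KIND:text" token format are all made up for illustration, and
it reads from a String rather than a stream to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SimpleLexer {
    // Character classes, stored in a 256-entry table indexed by character code.
    static final byte OTHER = 0, LETTER = 1, DIGIT = 2, SYMBOL = 3;
    static final byte[] CLASS = new byte[256];
    static {
        for (char c = 'a'; c <= 'z'; c++) CLASS[c] = LETTER;
        for (char c = 'A'; c <= 'Z'; c++) CLASS[c] = LETTER;
        CLASS['_'] = LETTER;
        for (char c = '0'; c <= '9'; c++) CLASS[c] = DIGIT;
        // Assumed symbol characters; adjust for the real language.
        for (char c : "=<>!+-*/(){};,".toCharArray()) CLASS[c] = SYMBOL;
    }

    // Hypothetical keywords for a form-description language.
    static final Set<String> KEYWORDS =
        new HashSet<>(Arrays.asList("form", "input", "button"));

    static byte classOf(char c) {
        return c < 256 ? CLASS[c] : OTHER;
    }

    // Splits src into tokens, each returned as a "KIND:text" string.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            int start = i;
            byte cls = classOf(src.charAt(i));
            if (cls == LETTER) {
                // A name: a letter followed by any run of letters or digits.
                while (i < src.length()
                        && (classOf(src.charAt(i)) == LETTER
                            || classOf(src.charAt(i)) == DIGIT)) {
                    i++;
                }
                String s = src.substring(start, i);
                tokens.add((KEYWORDS.contains(s) ? "KEYWORD:" : "IDENTIFIER:") + s);
            } else if (cls == DIGIT) {
                // A number: a run of digits.
                while (i < src.length() && classOf(src.charAt(i)) == DIGIT) {
                    i++;
                }
                tokens.add("NUMBER:" + src.substring(start, i));
            } else if (cls == SYMBOL) {
                i++;
                // Two-character operators: ==, <=, >=, !=
                if (i < src.length() && src.charAt(i) == '='
                        && "=<>!".indexOf(src.charAt(start)) >= 0) {
                    i++;
                }
                tokens.add("SYMBOL:" + src.substring(start, i));
            } else {
                i++; // whitespace and anything unclassified is skipped
            }
        }
        return tokens;
    }
}
```

With those assumptions, SimpleLexer.tokenize("form x1 == 42;") gives
KEYWORD:form, IDENTIFIER:x1, SYMBOL:==, NUMBER:42, SYMBOL:;. Quoted
strings are not handled; they would need another branch that reads up
to the closing quote.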

I would HIGHLY recommend that you read "Let's Build a Compiler!" by
Jack Crenshaw. It's online and free, and packed with useful
information.
Hopefully that gives you some direction.

(Chubby) wrote:
> I'm currently developing a very simple language which describes HTML
> forms in simple text. I'm using Java to implement the
> compiler/translator and just need to know the general STEPS needed for
> lexical analysis. I've read tons of books which describe how simple
> mathematical expressions can be tokenized but what about the more
> complicated strings, keywords etc?
> Also what would be the best way to read the source file into the
> program. ...
