Implementing a Scripting Language

"Kevin" <>
30 Mar 2003 00:41:36 -0500

          From comp.compilers


From: "Kevin" <>
Newsgroups: comp.compilers
Date: 30 Mar 2003 00:41:36 -0500
Organization: Compilers Central
Keywords: interpreter, question
Posted-Date: 30 Mar 2003 00:41:36 EST

Hello all,

I am currently attempting to implement my own interpreted language
using Java to code the Interpreter. I've decided to use Java because
I'd like my language to call on various Swing components to produce
GUIs (later in the language's development).

I will explain a little about the language first, and then ask for
help afterwards! I would be extremely grateful for any assistance you
can offer me.

I have already written an extremely basic version of the Interpreter
which will take a script file and process it accordingly. The reason
my method is doomed to fail, however, is that I am using a
StringTokenizer to split the entire Script into "lines" and then
process each line accordingly.

I can best highlight this with an example written in my scripting
language, which should be easy enough to understand. Comments start
with a hash (#). Line numbers have been added for clarity.

----------------script start------------------
1 # declare integer foo and initialise to 10
2 Declare( foo:10:int );
4 # print value of foo to output console
5 Get( foo );
7 # check the value of foo in an If construct
8 If( foo, =, 10 )
9 {
10 ZPrint( "Foo is 10" )_
11 }
12 Else
13 {
14 ZPrint( "Foo is NOT 10" )_
15 };
--------------script end---------------------

Now, my Interpreter works fine with this simple script and performs as
expected. However, notice the end-of-line terminators used. Outside of
braces {} I use a semicolon (;) whereas within braces I use an
underscore ( _ ) to terminate the line (lines 10 and 14). This way, my
Interpreter StringTokenizes using the semicolon to split the code into
"lines" for processing. This allows the If/Else construct to be
terminated with a semicolon (line 15), so the whole If/Else construct
is treated as a single line of code, which is then processed by a
separate StringTokenizer that uses the underscore as its delimiter.
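
For illustration, the two-pass splitting described above might look
roughly like the sketch below. It is not the original interpreter: the
class name TwoLevelSplitter and the helper split() are made up for this
example, and it only prints the pieces instead of executing them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch of the two-pass splitting described in the post.
// Names are invented for illustration; nothing here is the original code.
class TwoLevelSplitter {

    // Split text on the given delimiter character(s), trimming each piece
    // and dropping pieces that are empty after trimming.
    static List<String> split(String text, String delimiter) {
        List<String> pieces = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delimiter);
        while (tokenizer.hasMoreTokens()) {
            String piece = tokenizer.nextToken().trim();
            if (!piece.isEmpty()) {
                pieces.add(piece);
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        String script =
              "Declare( foo:10:int );\n"
            + "If( foo, =, 10 )\n{\n ZPrint( \"Foo is 10\" )_\n}\n"
            + "Else\n{\n ZPrint( \"Foo is NOT 10\" )_\n};";

        // First pass: semicolons separate the top-level "lines".
        for (String statement : split(script, ";")) {
            System.out.println("statement: " + statement.replace('\n', ' '));

            // Second pass: inside each {...} block, underscores separate the lines.
            int open = statement.indexOf('{');
            while (open >= 0) {
                int close = statement.indexOf('}', open);
                if (close < 0) {
                    break;
                }
                for (String inner : split(statement.substring(open + 1, close), "_")) {
                    System.out.println("  inner: " + inner);
                }
                open = statement.indexOf('{', close);
            }
        }
    }
}
```

Note that StringTokenizer knows nothing about context, so a semicolon
or underscore inside a quoted string would also split the text.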

As you can imagine, this is not an efficient way to write an
interpreter, and it's starting to become a nightmare to
maintain. Also, I cannot nest any deeper than 1 level without
inventing another end-of-line terminator, and then another
StringTokenizer, and so on. As such, my scripting language has reached
its limits!
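
One possible way around the nesting limit, sketched here under
assumptions, is to keep a single terminator (;) at every level and
count braces to decide where a statement ends. The class
NestedStatements and its methods are hypothetical, the script syntax is
adjusted so that inner lines also end with a semicolon, and comments
and escaped quotes are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the original interpreter: every statement ends
// with ';', and nesting is handled by counting braces rather than by
// inventing a new terminator per level.
class NestedStatements {

    // Split 'source' at each ';' that lies outside all braces and strings.
    static List<String> statements(String source) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        boolean inString = false;
        for (char c : source.toCharArray()) {
            if (c == '"') {
                inString = !inString;
            } else if (!inString && c == '{') {
                depth++;
            } else if (!inString && c == '}') {
                depth--;
            }
            if (c == ';' && depth == 0 && !inString) {
                String text = current.toString().trim();
                if (!text.isEmpty()) {
                    result.add(text);
                }
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String rest = current.toString().trim(); // text after the last ';'
        if (!rest.isEmpty()) {
            result.add(rest);
        }
        return result;
    }

    // Record "level: text" for each statement header, recursing into blocks.
    static void walk(String source, int level, List<String> out) {
        for (String statement : statements(source)) {
            int depth = 0;
            int start = 0; // where the current header or block body begins
            boolean inString = false;
            for (int i = 0; i < statement.length(); i++) {
                char c = statement.charAt(i);
                if (c == '"') {
                    inString = !inString;
                }
                if (inString) {
                    continue;
                }
                if (c == '{') {
                    if (depth == 0) {
                        emit(out, level, statement.substring(start, i));
                        start = i + 1;
                    }
                    depth++;
                } else if (c == '}') {
                    depth--;
                    if (depth == 0) {
                        // The same code handles the block body, one level deeper.
                        walk(statement.substring(start, i), level + 1, out);
                        start = i + 1;
                    }
                }
            }
            emit(out, level, statement.substring(start));
        }
    }

    // Add "level: text" to 'out' unless the text is blank.
    static void emit(List<String> out, int level, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            out.add(level + ": " + trimmed);
        }
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        walk("If( a, =, 1 ) { If( b, =, 2 ) { ZPrint( \"deep\" ); }; } "
           + "Else { ZPrint( \"no\" ); };", 0, out);
        for (String line : out) {
            System.out.println(line);
        }
    }
}
```

Because walk() calls itself for each block body, the nesting depth is
limited only by the call stack, not by the number of terminators.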

I have trawled the internet searching for a site that explains the
implementation of an Interpreter using Java to no avail. I would like
to set up a parser that will tokenize the script in an efficient
manner, and then process it accordingly. I have read through some of
Jack Crenshaw's excellent tutorials, but I just cannot get to grips
with Delphi and Turbo Pascal. I do consider myself to be a fairly
experienced Java programmer, and needless to say, as you are reading
this I have my head in a Pascal book trying to suss things out and
convert the code. I managed to get a magazine series that covered the
implementation of a mathematical scripting language but again it used

If I were implementing it using Java, would I need to create a Token class
and then subclass it for all token types, i.e. AlphaCharToken, NumCharToken,
MathOpToken, etc.? And how would I then evaluate functions, variables,
if/else constructs, and so on?
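
As a minimal sketch of one common alternative to a subclass per token
kind, a single Token class can carry a type tag from an enum. The names
ScriptLexer, TokenType and Token below are assumptions made for this
example, and the token categories are guessed from the sample script
rather than taken from any real grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one Token class tagged with a type, instead of
// AlphaCharToken, NumCharToken, MathOpToken and so on as subclasses.
class ScriptLexer {

    enum TokenType { IDENTIFIER, NUMBER, STRING, SYMBOL }

    static final class Token {
        final TokenType type;
        final String text;

        Token(TokenType type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "(" + text + ")";
        }
    }

    // Convert source text into tokens, skipping whitespace and # comments.
    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '#') {
                // A comment runs to the end of the line.
                while (i < source.length() && source.charAt(i) != '\n') {
                    i++;
                }
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < source.length() && Character.isLetterOrDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(TokenType.IDENTIFIER, source.substring(start, i)));
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(TokenType.NUMBER, source.substring(start, i)));
            } else if (c == '"') {
                // String literal without escape sequences; quotes are dropped.
                int end = source.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated string at index " + i);
                }
                tokens.add(new Token(TokenType.STRING, source.substring(i + 1, end)));
                i = end + 1;
            } else {
                // Anything else is a one-character symbol such as ( ) : ; { } = ,
                tokens.add(new Token(TokenType.SYMBOL, String.valueOf(c)));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String script = "Declare( foo:10:int ); # a comment\nZPrint( \"Foo is 10\" );";
        for (Token token : tokenize(script)) {
            System.out.println(token);
        }
    }
}
```

A parser would then consume this token list, typically with one method
per construct (statement, If/Else, function call), which is where the
evaluation of variables and control flow would live.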

Please, please, please could someone give me a push in the right direction,
hopefully including some kind of URL pointing to an
interpreter/compiler/parser construction site where Java has been used.

Thanks for taking the time to read my rant!

Kevin W.
e-mail kevin@zazzy<nospam>
