From: | r@cLIeNUX.com (cLIeNUX user) |
Newsgroups: | comp.compilers |
Date: | 27 Jul 2001 02:58:42 -0400 |
Organization: | Posted via Supernews, http://www.supernews.com |
Keywords: | available |
Posted-Date: | 27 Jul 2001 02:58:42 EDT |
osimpa
PAGE DATE
july 2001
overview
osimpa is a set of assembly macros resembling a compiler for a generic
one-stack computer. The name is representative of "one stack in memory
plus accumulator", although that's not quite enough machine model for
a practical system. osimpa is based on my asmacs assembly macros, and
the version of osimpa documented here is on top of shasm, my infamous
386 assembler in GNU Bash. osimpa, however, should be very easy to
implement on top of other proper assemblers, using something like m4.
osimpa tries to bring some of the simplicity and elegance of Forth to
native one-stack code. As such, it emphasizes a virtual machine as
its conceptual continuity*, rather than a "language" or syntax. There
are some "language constructs" though, which is how osimpa isn't just
an assembler. Its language-ness is very bottom-up though. It's still
a set of features within an assembler.
features
Some of osimpa resembles the predecessors of C such as BCPL, or a
featureful assembler like NASM. osimpa is not typed. There is a cell
concept, the size of a native machine pointer, but that's just a size,
not a type in the C sense. A C-like enumerate is supported, the clump
facility is a rustic form of C structs, and there are strands, which
are a very general type of array. Strands can also serve as strings,
masked-index rings, and other goodies. Host functionality is available
via a sys macro directly analogous to the sys in BCPL and similar.
calling convention
The facilities for defining and handling osimpa subroutine calls are
somewhat distinctive. The assembler state is informed that a new
routine is being assembled with the def macro, as follows:
[not real code]
def newname cells
[possible real code]
def strandtoken 4
The 4 tells osimpa the size of new routine strandtoken's stack frame
in cells, and the name strandtoken is affiliated with the current
assembly address. strandtoken remains the current subroutine being
defined until the next def. A routine can be ended with a fed, or not.
When a fed does occur, a stack-frame-rewinding return is assembled.
There may be 0, 1 or more feds, but most routines will have one.
A routine assembled this way is properly called with hcall, for
"hiking call".
hcall strandtoken
assembles a caller-hikes preamble to the actual call, and thus we have
stack frame maintenance requiring one extra instruction over a machine
call instruction.
We haven't passed strandtoken any parameters. We gave it a 4-cell
stack frame, but there's currently no valid data in it, just leftover
bits. This is something I suspect may be fairly unique about osimpa.
It is caller-hikes, callee-passes. This creates copy-on-write
parameter passing, and is the result of osimpa's means of dealing with
routine stack frames as three levels of local variable. osimpa
provides macros to refer to cells in the parent, current, and most
recently exited child routines as pre-named local variables. The
current routine's locals are a, b, c..., the parent routine's locals
can be referenced from the current routine as pa, pb, pc..., and
similarly for the last child routine the current routine hcalled, a la
ca, cb, cc, cd...
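The three levels of naming can be pictured with a small Bash model. This is only a sketch and not osimpa or shasm code: the array stack, the frame pointer fp, the downward growth direction, and the helper names cells_of, cur, par, chi and hcall_sim are all assumptions made for illustration. Real osimpa assembles 386 instructions; here the "hike" is just arithmetic on an index.

```bash
# Toy model of caller-hikes frames (assumed layout, not osimpa's generated code):
# the stack grows toward lower indexes and fp is the base of the current frame.
declare -a stack
fp=100                         # frame base of the routine now running
strandtoken_cells=4            # what "def strandtoken 4" would record

cells_of() { local v="${1}_cells"; echo "${!v}"; }
cur() { echo $(( fp + $1 )); }                       # slot of current local: a=0, b=1, ...
par() { echo $(( fp + $(cells_of "$1") + $2 )); }    # par SELF n: slot of parent's local n
chi() { echo $(( fp - $(cells_of "$1") + $2 )); }    # chi CHILD n: slot of last child's local n

hcall_sim() {
    local size; size=$(cells_of "$1")
    fp=$(( fp - size ))        # caller hikes by the callee's frame size
    "$1"                       # the call itself
    fp=$(( fp + size ))        # the rewind a fed would assemble (kept in this helper for simplicity)
}

strandtoken() {
    # Callee-passes: read the parent's local a in place, copy nothing else.
    local pa=${stack[$(par strandtoken 0)]}
    stack[$(cur 0)]=$(( pa * 2 ))    # own local a
    stack[$(cur 1)]=$(( pa + 1 ))    # own local b
}

stack[$(cur 0)]=21             # the caller's local a
hcall_sim strandtoken
ca=${stack[$(chi strandtoken 0)]}   # child's a, still in memory after return
cb=${stack[$(chi strandtoken 1)]}   # child's b: a second "return value"
echo "ca=$ca cb=$cb"           # prints ca=42 cb=22
```

In this model the child never receives a copied argument list. It reaches up into the parent's frame only for what it needs, and the parent reads the child's leftover frame after the return.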
The loss of naming flexibility isn't so bad for locals, which often
get terse names anyway, and the flexibility created is notable.
Accessing the locals of a child routine is a form of multiple return
value. Accessing the locals of a parent routine can involve moving
their value to the A register, i.e. the accumulator, or not. If a
parent value doesn't need to actually be moved into the current
routine's frame there is a performance benefit. Values to be passed
from parent to child through current must be moved though. Also, any
naming annoyances in osimpa should be easily offset by the fact that
it's just a script, and can be seasoned to taste in seconds.
All this talk about the internals of the language at this early point
is a bit abnormal, but you wind up having to know all this nonsense to
use something like C well anyway. I feel that attempts to abstract
these things out of systems programming languages are a noble failure.
Forth (on conventional hardware) gives you a simple virtual machine to
deal with as directly as possible. osimpa concentrates on giving you a
subset of your actual machine as directly as possible, hopefully with
some portability side-benefits, while leaving all the specifics of
your actual machine available right there in the rest of the assembler
osimpa is implemented in/on.
use
osimpa features are at all times merely optional addenda to the
regular assembler, shasm in this case. Thus you have all the usual
data declaration facilities and directives of assembly before osimpa
enters the picture. Then you want to get fancy. Like shasm, every
osimpa command is a shell command. If you use osimpa/shasm sourced
into your shell state, i.e. interactively, you can for example do
[your_shell_prompt]binary h
Convert the binary representation of a number to decimal. Accepts
its single argument in multiple segments, like for bytes in a quad.
binary 0101010011010011
Bash math is 32 bit.
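For readers without shasm at hand, the behaviour described by that help text can be approximated in a few lines of plain Bash. The function name bin2dec is hypothetical and this is not the shasm source; it relies only on Bash's base#number arithmetic syntax.

```bash
# bin2dec: join all arguments into one binary numeral and print it in decimal.
# Hypothetical helper, not the shasm "binary" routine itself.
bin2dec() {
    local IFS=''          # with an empty IFS, "$*" joins the segments with no separator
    local bits="$*"
    case $bits in
        ''|*[!01]*) echo "bin2dec: expected only 0s and 1s" >&2; return 1 ;;
    esac
    echo $(( 2#$bits ))
}

bin2dec 0101010011010011      # one 16-bit argument, prints 21715
bin2dec 01010100 11010011     # same value as two byte segments, prints 21715
```

The 32-bit limit applied to the Bash of the time. As far as I know, current Bash releases do arithmetic in the widest integer type the platform offers, normally 64 bits.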
As in Forth, interactivity pays. Try things. Define something and see
what happens. You can
grep "()" osimpa
for a list of all osimpa routines, as with shasm. A clump doesn't
assemble anything, it just alters assembler state. That also happens
to be your shell's variables, so you can
set | grep clumpname
to see what got defined and then
echo $whatever
to see what it got defined to. There's also reading the script itself.
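The point that a clump only changes assembler state can be tried without osimpa. The sketch below is an assumption about the general idea and not osimpa's clump syntax: a hypothetical clump_sim function turns a list of field names into shell variables holding cell offsets, which can then be inspected exactly as described above.

```bash
# clump_sim NAME FIELD...: define NAME_FIELD=offset (in cells) and NAME_cells=count.
# Hypothetical stand-in for a struct-like facility; it emits no machine code at all.
clump_sim() {
    local name=$1 offset=0 field
    shift
    for field in "$@"; do
        printf -v "${name}_${field}" '%d' "$offset"   # e.g. point_x=0
        offset=$(( offset + 1 ))
    done
    printf -v "${name}_cells" '%d' "$offset"          # total size in cells
}

clump_sim point x y z
set | grep '^point_'      # lists the variables the "clump" defined
echo "$point_y"           # offset of field y, prints 1
```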
strands
strands embody several ideas I've wanted to pursue for array-like data
structures that bear some discussion independent of their osimpa-ness.
Strands implement nested metadata. If you have a simple data structure
like a count-cell-prefixed "string", and you add metadata in an
organized way, you have certain generality and reusability. This is
the reason for the strand header format. Code that wants to deal with
strings needs to know about the first metadata cell before the string.
If code knows about the next preceding metadata cell, the "ply" cell,
it can treat the strand in question like an array of ply N. Or a
string. If the metadata cells are in a consistent format there is only
one data structure needed for a wide variety of uses. The fields
already decided upon in a strand prefix are, proceeding toward lower
addresses from the nominal address of the strand instance: bytesize,
ply, mask. "mask" is for ring buffers. Other possible cell
reservations would be indexes, pointers to affiliated strands, and so
on, which would allow stacks, deques, list elements, and so on.
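The header layout can be made concrete with a toy memory model. Everything here is illustrative and not osimpa code: memory is a Bash array with one slot per cell, and the helper names (strand_new, strand_bytesize, strand_ply, strand_mask, ring_put, ring_get) are invented. Only the field order, bytesize then ply then mask at successively lower addresses below the nominal address, comes from the description above. The header values are stored and read back without being interpreted, apart from the mask.

```bash
declare -a mem            # toy memory, one array slot per cell
here=0                    # next free slot

# strand_new BYTESIZE PLY MASK BODYCELLS: lay down a header, then reserve the body.
# Sets the global `strand` to the nominal address (first body cell).
strand_new() {
    mem[here]=$3          # mask      at nominal-3
    mem[here+1]=$2        # ply       at nominal-2
    mem[here+2]=$1        # bytesize  at nominal-1
    strand=$(( here + 3 ))
    here=$(( strand + $4 ))
}
strand_bytesize() { echo "${mem[$1-1]}"; }
strand_ply()      { echo "${mem[$1-2]}"; }
strand_mask()     { echo "${mem[$1-3]}"; }

# Masked-index ring access: any index wraps by AND-ing it with the mask cell.
ring_put() { local i=$(( $1 + ($2 & ${mem[$1-3]}) )); mem[i]=$3; }
ring_get() { local i=$(( $1 + ($2 & ${mem[$1-3]}) )); echo "${mem[i]}"; }

strand_new 8 1 7 8        # demo values: bytesize 8, ply 1, mask 7, 8 body cells
ring=$strand
ring_put "$ring" 9 hello  # index 9 wraps to slot 1 because 9 & 7 is 1
ring_get "$ring" 1        # prints hello
strand_bytesize "$ring"   # prints 8
```

Code that only cares about counted strings would stop at strand_bytesize; code that wants a ring reads two cells further down. Neither needs a separate data structure.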
osimpa cpp/asm macros have been posted elsewhere. The shasm version isn't
out yet, but shasm itself is in
ftp://ftp.gwdg.de/pub/linux/install/clienux/interim
Rick Hohensee
www.clienux.com