r@cLIeNUX.com (cLIeNUX user)
27 Jul 2001 02:58:42 -0400

          From comp.compilers


From: r@cLIeNUX.com (cLIeNUX user)
Newsgroups: comp.compilers
Date: 27 Jul 2001 02:58:42 -0400
Organization: Posted via Supernews, http://www.supernews.com
Keywords: available
Posted-Date: 27 Jul 2001 02:58:42 EDT



      july 2001


      osimpa is a set of assembly macros resembling a compiler for a generic
      one-stack computer. The name is representative of "one stack in memory
      plus accumulator", although that's not quite enough machine model for
      a practical system. Osimpa is based on my asmacs assembly macros, and
      the version of osimpa documented here is on top of shasm, my infamous
      386 assembler in GNU Bash. osimpa however should be very easy to
      implement on top of other proper assemblers, using something like m4.

      osimpa tries to bring some of the simplicity and elegance of Forth to
      native one-stack code. As such, it emphasizes a virtual machine as
      its conceptual continuity*, rather than a "language" or syntax. There
      are some "language constructs" though, which is how osimpa isn't just
      an assembler. Its language-ness is very bottom-up though. It's still
      a set of features within an assembler.


      Some of osimpa resembles the predecessors of C such as BCPL, or a
      featureful assembler like NASM. osimpa is not typed. There is a cell
      concept, the size of a native machine pointer, but that's just a size,
      not a type in the C sense. A C-like enumerate is supported, the clump
      facility is a rustic form of C structs, and there are strands, which
      are a very general type of array. Strands can also serve as strings,
      masked-index rings, and other goodies. Host functionality is available
      via a sys macro directly analogous to the sys in BCPL and similar.

        calling convention

      The facilities for defining and handling osimpa subroutine calls are
      somewhat distinctive. The assembler state is informed that a new
      routine is being assembled with the def macro, as follows
                                                                                [not real code]
                def newname cells
                                                                                [possible real code]
                def strandtoken 4

      The 4 tells osimpa the size of new routine strandtoken's stack frame
      in cells, and the name strandtoken is affiliated with the current
      assembly address. strandtoken remains the current subroutine being
      defined until the next def. A routine can be ended with a fed, or not.
      When a fed does occur, a stack-frame-rewinding return is assembled.
      There may be 0, 1 or more feds, but most routines will have one.

      A routine assembled this way is properly called with hcall, for
      "hiking call".

                hcall strandtoken

      assembles a caller-hikes preamble to the actual call, and thus we have
      stack frame maintenance requiring one extra instruction over a machine
      call instruction.
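
      The def / fed / hcall bookkeeping described above can be pictured
      with a small Bash sketch. It is not the osimpa source: the names
      def_sketch, fed_sketch and hcall_sketch are hypothetical, and the
      "hike", "call" and "unhike_and_return" lines are placeholder text
      standing in for whatever machine instructions osimpa really
      assembles. The only things taken from the description are that def
      records a name and a frame size in cells, that fed assembles a
      frame-rewinding return, and that hcall costs one instruction more
      than a plain call.

```shell
#!/usr/bin/env bash
# Sketch only: prints placeholder pseudo-assembly instead of assembling bytes.

current_def=""            # routine currently being defined

def_sketch() {            # def_sketch <name> <cells>
    current_def=$1
    # Remember the frame size in an ordinary shell variable.
    printf -v "frame_cells_$1" '%d' "$2"
    echo "$1:"            # label for the current assembly address
}

fed_sketch() {            # ends the routine named by the last def_sketch
    local var="frame_cells_$current_def"
    echo "    unhike_and_return ${!var:-0}"
}

hcall_sketch() {          # hcall_sketch <name>
    local var="frame_cells_$1"
    if [ -z "${!var:-}" ]; then
        echo "hcall_sketch: $1 has not been defined" >&2
        return 1
    fi
    echo "    hike ${!var}"   # caller makes room for the callee's frame
    echo "    call $1"        # the ordinary machine call
}
```

      After "def_sketch strandtoken 4", the command "hcall_sketch
      strandtoken" prints a hike line followed by a call line, which is
      the one-extra-instruction cost mentioned above.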

      We haven't passed strandtoken any parameters. We gave it a 4-cell
      stack frame, but there's currently no valid data in it, just leftover
      bits. This is something I suspect may be fairly unique about osimpa.
      It is caller-hikes, callee-passes. This creates copy-on-write
      parameter passing, and is the result of osimpa's means of dealing with
      routine stack frames as three levels of local variable. osimpa
      provides macros to refer to cells in the parent, current, and most
      recently exited child routines as pre-named local variables. The
      current routine's locals are a, b, c..., the parent routine's locals
      can be referenced from the current routine as pa, pb, pc..., and
      similarly for the last child routine the current routine hcalled, a la
      ca, cb, cc, cd...
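
      One way to picture the three levels of locals is as fixed offsets
      from the current frame. The Bash sketch below makes several
      assumptions that are not stated above: the stack grows toward
      lower cell indices, the parent frame sits directly above the
      current one, the last child's frame sits directly below it, and
      return addresses or other bookkeeping cells between frames are
      ignored. The function slot and the three variables are
      hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: maps a, b, ... / pa, pb, ... / ca, cb, ... to cell indices.

CUR_BASE=100     # assumed cell index of the current routine's local "a"
CUR_CELLS=4      # frame size of the current routine, in cells
CHILD_CELLS=2    # frame size of the most recently hcalled child, in cells

slot() {         # slot <name>   e.g. slot b, slot pa, slot cb
    local name=$1 base=$CUR_BASE letter ord
    if [ "${#name}" -eq 2 ]; then
        case "${name:0:1}" in
            p) base=$(( CUR_BASE + CUR_CELLS )) ;;    # parent frame, above
            c) base=$(( CUR_BASE - CHILD_CELLS )) ;;  # child frame, below
            *) echo "slot: unknown prefix in '$name'" >&2; return 1 ;;
        esac
        letter=${name:1:1}
    else
        letter=$name
    fi
    printf -v ord '%d' "'$letter"     # character code of the letter
    echo $(( base + ord - 97 ))       # "a" is cell 0 of its frame
}
```

      With these numbers "slot a" gives 100, "slot pb" gives 105 and
      "slot ca" gives 98. A callee that only reads pa never copies the
      value into its own frame, which is the copy-on-write flavour
      described above.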

      The loss of naming flexibility isn't so bad for locals, which often
      get terse names anyway, and the flexibility created is notable.
      Accessing the locals of a child routine is a form of multiple return
      value. Accessing the locals of a parent routine can involve moving
      their value to the A register, i.e. the accumulator, or not. If a
      parent value doesn't need to actually be moved into the current
      routine's frame there is a performance benefit. Values to be passed
      from parent to child through current must be moved though. Also, any
      naming annoyances in osimpa should be easily offset by the fact that
      it's just a script, and can be seasoned to taste in seconds.

      All this talk about the internals of the language at this early point
      is a bit abnormal, but you wind up having to know all this nonsense to
      use something like C well anyway. I feel that attempts to abstract
      these things out of systems programming languages are a noble failure.
      Forth (on conventional hardware) gives you a simple virtual machine to
      deal with as directly as possible. osimpa concentrates on giving you a
      subset of your actual machine as directly as possible, hopefully with
      some portability side-benefits, while leaving all the specifics of
      your actual machine available right there in the rest of the assembler
      osimpa is implemented in/on.


      osimpa features are at all times merely optional addenda to the
      regular assembler, shasm in this case. Thus you have all the usual
      data declaration facilities and directives of assembly before osimpa
      enters the picture. Then you want to get fancy. Like shasm, every
      osimpa command is a shell command. If you use osimpa/shasm sourced
      into your shell state, i.e. interactively, you can for example do

                [your_shell_prompt]binary h

Convert the binary representation of a number to decimal. Accepts
it's single argument in multiple segments, like for bytes in a quad.

                                                binary 0101010011010011

Bash math is 32 bit.
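
      For readers without osimpa at hand, a helper of this kind can be
      approximated in a few lines of Bash. This is a sketch, not the
      osimpa routine: the name binary_sketch is hypothetical, and it
      assumes that "multiple segments" means the arguments are simply
      joined before conversion. It leans on Bash's base#number
      arithmetic syntax.

```shell
#!/usr/bin/env bash
# Sketch only: convert binary digits, given in one or more segments, to decimal.

binary_sketch() {
    local bits="" seg
    # Join all segments into one string of binary digits.
    for seg in "$@"; do
        bits="$bits$seg"
    done
    # Reject empty input and anything that is not purely 0s and 1s.
    case "$bits" in
        ""|*[!01]*)
            echo "binary_sketch: expected binary digits" >&2
            return 1
            ;;
    esac
    # Bash arithmetic reads 2#<digits> as a base-2 constant.
    echo $(( 2#$bits ))
}
```

      Called as "binary_sketch 0101010011010011" or as "binary_sketch
      01010100 11010011", it prints 21715.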

      As in Forth, interactivity pays. Try things. Define something and see
      what happens. You can

                grep "()" osimpa

      for a list of all osimpa routines, as with shasm. A clump doesn't
      assemble anything, it just alters assembler state. That also happens
      to be your shell's variables, so you can

                set | grep clumpname

      to see what got defined and then

                  echo $whatever

      to see what it got defined to. There's also reading the script itself.
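
      The idea that a clump only alters assembler state, and that this
      state is plain shell variables, can be sketched as follows. The
      real clump syntax is not shown in this post, so clump_sketch, its
      argument order, the one-cell-per-field layout and the
      name_field variable naming are all assumptions made for
      illustration.

```shell
#!/usr/bin/env bash
# Sketch only: a struct-like definition that assembles nothing and just
# sets shell variables holding byte offsets.

CELL=4                    # assumed cell size in bytes (a 386 pointer)

clump_sketch() {          # clump_sketch <clumpname> <field>...
    local name=$1 offset=0 field
    shift
    for field in "$@"; do
        # One shell variable per field, holding its offset in bytes.
        printf -v "${name}_${field}" '%d' "$offset"
        offset=$(( offset + CELL ))
    done
    # Total size of the clump in bytes.
    printf -v "${name}_size" '%d' "$offset"
}

clump_sketch point x y z
set | grep '^point_'      # lists the variables the definition created
```

      The final grep should list point_x, point_y, point_z and
      point_size, after which "echo $point_y" shows the value one of
      them was given.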


      strands embody several ideas I've wanted to pursue for array-like data
      structures that bear some discussion independent of their osimpa-ness.
      Strands implement nested metadata. If you have a simple data structure
      like a count-cell-prefixed "string", and you add metadata in an
      organized way, you have certain generality and reusability. This is
      the reason for the strand header format. Code that wants to deal with
      strings needs to know about the first metadata cell before the string.
      If code knows about the next preceding metadata cell, the "ply" cell,
      it can treat the strand in question like an array of ply N. Or a
      string. If the metadata cells are in a consistent format there is only
      one data structure needed for a wide variety of uses. The fields
      already decided upon in a strand prefix are, proceeding toward lower
      addresses from the nominal address of the strand instance: bytesize,
      ply, mask. "mask" is for ring buffers. Other possible cell
      reservations would be indexes, pointers to affiliated strands, and so
      on, which would allow stacks, deques, list elements, and so on.
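
      The header layout can be made concrete with a little address
      arithmetic. The Bash sketch below assumes a 4-byte cell (a 386
      pointer), that each header field occupies exactly one cell, that
      the fields are adjacent with no padding, and that a ring index is
      wrapped by a bitwise AND with the mask cell. The function names
      are hypothetical, and none of this is taken from the osimpa
      source.

```shell
#!/usr/bin/env bash
# Sketch only: where the strand header cells would sit under the stated
# assumptions, and how a masked ring index would wrap.

CELL=4                       # assumed cell size in bytes

strand_field_addr() {        # strand_field_addr <strand_address> <field>
    local addr=$1 n
    case "$2" in
        bytesize) n=1 ;;     # first cell below the nominal address
        ply)      n=2 ;;     # next cell down
        mask)     n=3 ;;     # next cell down again
        *) echo "strand_field_addr: unknown field '$2'" >&2; return 1 ;;
    esac
    echo $(( addr - n * CELL ))
}

ring_index() {               # ring_index <index> <mask>
    # With a mask of (power of two) - 1, AND wraps the index into the ring.
    echo $(( $1 & $2 ))
}
```

      For a strand whose nominal address is 1000, the bytesize, ply and
      mask cells would then be at 996, 992 and 988. Code that only
      wants a string reads the first of these and can ignore the rest.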

osimpa cpp/asm macros have been posted elsewhere. The shasm version isn't
out yet, but shasm itself is in


Rick Hohensee
