Related articles |
---|
Dynamic Language (grammar) pohanl@my-deja.com (2000-07-31) |
From: | pohanl@my-deja.com |
Newsgroups: | comp.compilers |
Date: | 31 Jul 2000 20:20:06 -0400 |
Organization: | Deja.com - Before you buy. |
Keywords: | design, comment |
The future of languages.
As you all know, the history of languages started with instruction
codes for a machine. These were very basic operations like the
following...
Load 001 R1    <- put 1 in storage 1 (register 1)
Load 002 R2    <- put 2 in storage 2 (register 2)
Add  R1  R2    <- add storage 2 to storage 1 and put the result in storage 1
Of course, this is assembly, which is translated to
machine language...
005 001 001
005 002 002
006 001 002
(assuming 005 = Load, 006 = Add).
Since machine language is just bytes, you can write it in
binary...
00000101 00000001 00000001
00000101 00000010 00000010
00000110 00000001 00000010
These are fed to the CPU (Central Processing Unit), which
understands instructions such as Load and Add and does
what the instructions tell it to do. Those individual bits trigger
events in the transistors. So you are actually talking to transistors
if you think at a low enough level.
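As a rough illustration of the translation steps just described, here
is a minimal Python sketch. It assumes the numbering used in the
example (005 = Load, 006 = Add, R1 = 001, R2 = 002); the function names
are made up for this sketch and do not belong to any real assembler.

```python
# Hypothetical numbering, taken from the example in the post.
OPCODES = {"Load": 5, "Add": 6}
REGISTERS = {"R1": 1, "R2": 2}

def encode_operand(token):
    """Return the number for a register name, or the value of a decimal literal."""
    if token in REGISTERS:
        return REGISTERS[token]
    return int(token)

def assemble(line):
    """Translate one 'Operator Operand Operand' line into three byte values."""
    op, a, b = line.split()
    return [OPCODES[op], encode_operand(a), encode_operand(b)]

def to_binary(codes):
    """Write each byte value as eight binary digits."""
    return " ".join(format(c, "08b") for c in codes)

program = ["Load 001 R1", "Load 002 R2", "Add R1 R2"]
machine_code = [assemble(line) for line in program]
for codes in machine_code:
    print(" ".join(format(c, "03d") for c in codes), "->", to_binary(codes))
```

Each printed line pairs the decimal form (for example 006 001 002) with
its binary form (00000110 00000001 00000010).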
Well, notice that the above language follows a particular syntax,
namely Operator Operand Operand (A B B).
If you construct a grammar tree for this, it looks like so...
Program=>Statements
Statements=>Statement Statements | Statement
Statement=>Operator Operand Operand
Operator=>Load | Add | Sub | Jump | etc
Operand=>R1 | R2 | A1 | etc
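To make the fixed-grammar case concrete, here is a small hand-written
parser for the grammar above, as a sketch only. It assumes that a
program is one or more statements, that tokens are separated by
whitespace, and that the only operators and operands are the ones
listed (the "etc" entries are left out). The function names and the
tuple layout of the tree are choices made for this sketch.

```python
OPERATORS = {"Load", "Add", "Sub", "Jump"}
OPERANDS = {"R1", "R2", "A1"}

def parse_statement(tokens, pos):
    """Statement => Operator Operand Operand. Returns (node, next position)."""
    chunk = tokens[pos:pos + 3]
    if (len(chunk) != 3 or chunk[0] not in OPERATORS
            or any(t not in OPERANDS for t in chunk[1:])):
        raise SyntaxError(f"bad statement at token {pos}: {' '.join(chunk)}")
    node = ("Statement",
            ("Operator", chunk[0]),
            ("Operand", chunk[1]),
            ("Operand", chunk[2]))
    return node, pos + 3

def parse_program(text):
    """Program => Statements, where Statements is one or more Statement."""
    tokens = text.split()
    if not tokens:
        raise SyntaxError("empty program")
    statements, pos = [], 0
    while pos < len(tokens):
        node, pos = parse_statement(tokens, pos)
        statements.append(node)
    return ("Program", statements)

tree = parse_program("Load R1 R2 Add R1 R2")
for statement in tree[1]:
    print(statement)
```

Here the grammar is baked into the code: supporting a new operator
means editing the parser itself.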
Of course, you can create the language of Basic using a different
grammar set. C has its own; every language has one.
But have you noticed something? All the languages in the world have a
fixed grammar. The only one that comes close to being dynamic is
Lisp. But even in Lisp you must follow the recursive syntax, and you
are bound to it for creating new functions.
Well, I happen to have created a new language with a dynamic grammar
tree. You can prune or add grammar anywhere in it. The most scary and
interesting thing about it is that it has the potential to be alive
("living"). All it needs is a source for replacing any piece in the LHS
(left side of the grammar tree), and a source of food (something to
parse its grammar on; in computer language it is called the program).
It can obtain both either manually (you feed it), or it can grab them
from a source (like web pages on the internet).
It can live on the internet, following web pages. It can understand
the HTML format (and its links), and it can understand text. So, for
example, it follows a link to regular text (which has sentences with
periods, etc.), and it will eventually hit upon an http link and can
go there if it wants. It has a rudimentary English grammar capability,
etc.
But back to the dynamic nature of its grammar tree. Because it is
dynamic, it can be pruned and spliced internally, and new grammar trees
can be created. It can understand C, C++, Pascal, etc. if you feed it
that grammar.
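One way to picture a grammar that can be pruned and spliced while it
runs is to keep the rules in an ordinary data structure and have a
generic matcher read them on every call. The following Python sketch
does that; it is not the implementation described in this post, and
the names match and accepts, the dictionary layout, and the Mul
operator are all assumptions made for illustration.

```python
def match(grammar, symbol, tokens, pos=0):
    """Return the set of positions where `symbol` can end when started at `pos`."""
    if symbol not in grammar:                      # terminal: compare with the token
        ok = pos < len(tokens) and tokens[pos] == symbol
        return {pos + 1} if ok else set()
    ends = set()
    for alternative in grammar[symbol]:            # try every right-hand side
        positions = {pos}
        for part in alternative:
            positions = {end for p in positions
                         for end in match(grammar, part, tokens, p)}
        ends |= positions
    return ends

def accepts(grammar, text, start="Program"):
    """True if the whole text can be derived from the start symbol."""
    tokens = text.split()
    return len(tokens) in match(grammar, start, tokens)

# The grammar is plain data, so it can be edited while the program runs.
grammar = {
    "Program": [["Statements"]],
    "Statements": [["Statement", "Statements"], ["Statement"]],
    "Statement": [["Operator", "Operand", "Operand"]],
    "Operator": [["Load"], ["Add"], ["Sub"], ["Jump"]],
    "Operand": [["R1"], ["R2"], ["A1"]],
}

before = accepts(grammar, "Mul R1 R2")     # Mul is not in the grammar yet
grammar["Operator"].append(["Mul"])        # splice in a new alternative
after = accepts(grammar, "Mul R1 R2")
grammar["Operator"].remove(["Mul"])        # prune it again
pruned = accepts(grammar, "Mul R1 R2")
print(before, after, pruned)               # prints: False True False
```

A left-recursive rule (one whose right-hand side starts with its own
name) would make this simple matcher recurse without end, so a real
system would need a guard against that.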
The only problem that comes up is endless recursion. A bonus is that
when it has a choice between two paths, it can pick one at random.
If that gets it nowhere (not settling down to a matching tree node)
within a certain number of iterations, that branch of the tree is
considered bad (a bad mutation), so it is removed (it dies). It can keep
track of good paths for keeps. Eventually, based on percentages, the
random paths narrow down to useful grammars.
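The random-path idea can be sketched roughly as follows, under several
assumptions that are not from the post: three competing shapes for
Statement, a fixed seed so the run is repeatable, and a hypothetical
budget (MIN_TRIES) after which a shape that has never matched is
removed.

```python
import random

OPERATORS = {"Load", "Add", "Sub", "Jump"}

def classify(token):
    """Map a raw token to the grammar symbol it belongs to."""
    return "Operator" if token in OPERATORS else "Operand"

# Competing alternatives for Statement; only the first describes the food.
alternatives = {
    ("Operator", "Operand", "Operand"): {"tries": 0, "hits": 0},
    ("Operand", "Operator", "Operand"): {"tries": 0, "hits": 0},
    ("Operator", "Operand"): {"tries": 0, "hits": 0},
}
food = ["Load R1 R2", "Add R1 R2", "Sub A1 R1"]

MIN_TRIES = 3              # hypothetical budget before a path can be judged
rng = random.Random(0)     # fixed seed, only to make the sketch repeatable

for _ in range(200):
    shape = rng.choice(list(alternatives))      # follow a random path
    line = rng.choice(food)
    stats = alternatives[shape]
    stats["tries"] += 1
    if tuple(classify(t) for t in line.split()) == shape:
        stats["hits"] += 1                      # keep track of a good path
    elif stats["tries"] >= MIN_TRIES and stats["hits"] == 0:
        del alternatives[shape]                 # bad mutation: it dies

for shape, stats in alternatives.items():
    percent = 100 * stats["hits"] / stats["tries"]
    print(shape, f"{percent:.0f}% of tries matched")
```

After enough iterations only the shape that actually matches the food
remains, which is the narrowing-down effect described above.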
Actions. Well, what can it do? It can interpret languages and execute
them if it has a way to hook into the CPU and tell it to do stuff. From
there it needs a starting point. You can feed it a program for it to
interpret (like a Perl or C program, or an html link), and
off it goes following its instructions. Now and then it encounters a
part of its food that it doesn't understand. Then it has a choice
of either incorporating the new tree token or discarding it. (You can
set the mutation rate.) Note that it can be set to retain a lot.
This thing can crawl to your machine and live there if it has hooks to
your machine. For example... this is a path it would take if it wants
to reproduce children...
On my Windows machine it is running on an Intel CPU (it understands
this language). To migrate to another machine, it would need access
to your machine's CPU. Most computers talk http and tcp/ip. Well, if
the food is html pages, it has instant access to all the computers on
the internet that have web servers, and from there it can find ftp
servers (using ftp:// tokens). From its base machine it can ftp
itself to public ftp servers as a pure executable of itself for Intel
CPUs. From there it has a chance to live again if someone downloads
it and runs it.
It just happens that it understands ftp commands:
start=>statements
statements=>statement statements | statement
statement=>operator file
operator=>put | get | etc
file=>[a-z.]*
There is the grammar for execution...
response=>error | ok | etc
error=>"cannot find"
ok=>"file transferred.."
etc.
A new grammar is simply an extension made of things it found but
has no node to parse with in its internal grammar tree.
Because it is a living grammar, it can take useful languages it
has parsed and incorporate them into its own grammar tree.
a=>b | newgrammar
newgrammar=>(obtained from parsing food)
Eventually, if this part of the tree is successful elsewhere, it is
retained (based on percentage, etc.).
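The step of adding a new alternative, together with the mutation rate
mentioned earlier, might look like the following sketch. The function
name feed, the flat list-of-terminals layout, and the sample tokens
delete and rename are hypothetical and chosen only for illustration.

```python
import random

def feed(grammar, rule, tokens, mutation_rate, rng):
    """Offer tokens to `rule`; adopt unknown ones with probability `mutation_rate`."""
    adopted = []
    for token in tokens:
        if token in grammar[rule]:
            continue                        # already part of the grammar
        if rng.random() < mutation_rate:    # mutate: rule => ... | token
            grammar[rule].append(token)
            adopted.append(token)
    return adopted

rng = random.Random(1)

# With a mutation rate of 1.0 every unknown token is incorporated.
eager = {"operator": ["put", "get"]}
eager_adopted = feed(eager, "operator", ["put", "delete", "rename"], 1.0, rng)

# With a mutation rate of 0.0 the grammar never changes.
frozen = {"operator": ["put", "get"]}
frozen_adopted = feed(frozen, "operator", ["put", "delete", "rename"], 0.0, rng)

print(eager["operator"], frozen["operator"])
```

Rates between the two extremes retain only some of the new tokens;
whether an adopted alternative is kept afterwards would depend on how
often it succeeds elsewhere.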
If it cannot have children, then it can just live on one machine and
basically grow and mutate itself. It can understand everything
eventually. Even wave files (sound files) have a strict structure with
a header, the start of the sound data, and the end of the file. HTML has
<html> for the beginning and </html> for the end. C executables have a
data segment, a code segment, where to load them into memory, etc.
C source has main(argc, argv) etc. So it can be in a growing mode or an
execution mode. It can execute code (in any language, if you feed it the
grammar or it finds it out on its own) or run native CPU code, or it can
just grow: understand the grammar of some new food it got and create
grammar trees from it to extend itself.
Here is an interesting website: http://www.edepot.com
There is a non-living version of the grammar there.
(It was useful as a glossary, so I made it live on the discussion
forum pages, but it is non-living, so you must manually
give it new grammar by typing it into the input box.)
You can try it out on the discussion forums. A more
direct link is http://www.edepot.com/phl.html
Check out the eGlossary and add a grammar
(and then visit a discussion forum
and create a message the eGlossary can understand).
(You may need to put a grammar in manually.)
The living mode is still being worked on. (I'm using it
as a backend, as a dynamic webpage language.)
[There was a vogue for extensible languages in the 1970s, where you
could stick BNF or something like it into your code and modify the
grammar of the language. They turned out not to be useful. Perhaps a
little historical research would be useful before heading further down
the same rathole. -John]