Oilexer Early Release
From: | Alexander Morou <alexander.morou@gmail.com> |
Newsgroups: | comp.compilers |
Date: | Sun, 12 Jul 2015 14:31:45 -0500 |
Organization: | Compilers Central |
Keywords: | lex, available |
Posted-Date: | 13 Jul 2015 23:43:37 EDT |
I've posted a very early release of Oilexer on Codeplex:
https://oilexer.codeplex.com/releases/view/616236
This release is very early, so no error recovery is present yet. It is
capable of detecting failure points, but I just haven't made up my mind
on the specific error recovery strategy I'm going to use.
It exports C# in the form of multiple .cs files and requires no library
dependencies. So if OILexer completes its processing on a grammar, and you
instructed it to export C# files, the output should just compile: create a
new C# project in Visual Studio, drag the files *onto a node* of the
Solution Explorer for that project, and build. That should be all you need
to do (apart from adding a little code to specify a file to parse).
The sample OILexer grammar would be built thusly, from a command prompt
in the folder you extract it to:
OILexer.exe "Samples\Oilexer\Oilexer.oilexer" -ex:cs
It provides two things once you call a specific ParseRULENAMEHERE method:
1. The AST node of the parse method you called.
a. The AST serves to provide you access to the items you captured in
the grammar. If you don't specify any captures, all it points to
is the context.
2. The context, which the AST node always points to: the concrete set of
symbols represented by that parse. This is the fluff and other stuff you
need to make the grammar less ambiguous.
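As a rough illustration of that split, here is a small Python sketch. It is
not Oilexer's generated code (which is C#), and every name in it (Symbol,
Context, AstNode, the Assignment rule, the Target and Value captures) is
hypothetical. It only shows an AST node exposing captures while still
pointing at the full concrete symbol list.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    """One concrete symbol from the parse: a token or a nested rule."""
    kind: str
    text: str


@dataclass
class Context:
    """Hypothetical stand-in for the context: every symbol the rule matched."""
    rule: str
    symbols: list


@dataclass
class AstNode:
    """Hypothetical AST node: named captures plus a link to the context."""
    context: Context
    captures: dict = field(default_factory=dict)


# Pretend a made-up rule, Assignment, matched the text "x = 42;" and
# captured the identifier as Target and the number as Value.
symbols = [
    Symbol("Identifier", "x"),
    Symbol("Operator", "="),
    Symbol("Number", "42"),
    Symbol("Punctuation", ";"),
]
context = Context("Assignment", symbols)

# With captures, the AST exposes only what the grammar asked for.
node = AstNode(context, {"Target": symbols[0], "Value": symbols[2]})

# Without captures, all the node offers is the context.
bare = AstNode(context)
```

Here node.captures gives direct access to the two captured items, while
node.context.symbols still holds all four symbols, including the "=" and
";" that nobody captured.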
The approach is LL(*) with support for direct and indirect left recursion
through the use of a symbol stream (vs. a standard token stream only).
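To give a rough idea of why a stream of symbols (tokens plus already-reduced
rules) helps with left recursion, here is a generic Python sketch. It is an
assumption-laden toy, not Oilexer's actual algorithm or output: the grammar
Expr ::= Expr '+' Num | Num, the tuple layout, and the function names are
all made up for illustration, and it covers direct left recursion only.

```python
from collections import deque


def tokenize(text):
    """Split on whitespace into ("Num", value) and ("+", "+") tokens."""
    return [("Num", int(t)) if t.isdigit() else (t, t) for t in text.split()]


def parse_expr(tokens):
    """Parse Expr ::= Expr '+' Num | Num without recursing on Expr."""
    stream = deque(tokens)  # holds tokens and, later, reduced rule symbols

    # Seed: match the non-left-recursive alternative, Expr ::= Num.
    if not stream or stream[0][0] != "Num":
        raise SyntaxError("expected Num")
    _, value = stream.popleft()
    # Reduce to an Expr symbol and put it back at the front of the stream.
    stream.appendleft(("Expr", ("num", value)))

    # Grow: while the stream starts with Expr '+' Num, reduce again.
    while (len(stream) >= 3 and stream[0][0] == "Expr"
           and stream[1][0] == "+" and stream[2][0] == "Num"):
        _, left = stream.popleft()
        stream.popleft()  # the '+' token
        _, right = stream.popleft()
        stream.appendleft(("Expr", ("add", left, ("num", right))))

    # The finished Expr symbol must be the only thing left in the stream.
    _, tree = stream.popleft()
    if stream:
        raise SyntaxError(f"unexpected {stream[0][0]!r}")
    return tree
```

Because the reduced Expr goes back into the stream as a symbol, the
left-recursive alternative can match it like any other input, and the
resulting tree comes out left-associative.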
There are a few known issues:
1. Follow ambiguities which consume required calling rule tokens within
a reduction of a prediction have a chance to guess wrong and consume
too greedily. This will cause a false-positive parse failure on valid
sentences of a grammar. This will be tackled after Error Recovery.
2. Certain heavily intertwined left-recursive sets of rules might exit
prematurely because the stack sniffing I currently use is overly
cautious, causing it to bail. This is the focus after #1.
3. The #Root and other preprocessor constants observed in the samples
appear to be required to a degree, as they can potentially yield
bad paths on the output. I suspect this is an easy fix; the simple
solution for now is to start from a sample.
4. Heavily left-recursive rule sets that go 20+ levels in their definition
can yield poor parse times for heavily nested sentences.
5. A lot of things are likely buggy and incomplete. I welcome any and all
feedback.