RE: Visual Parse++ purchasing problems


From: Quinn Tyler Jackson <quinn-j@shaw.ca>
Newsgroups: comp.compilers
Date: 30 Apr 2005 15:41:42 -0400
Organization: Compilers Central
References: 05-04-090
Keywords: parse, practice
Posted-Date: 30 Apr 2005 15:41:42 EDT

> > I am using Visual Parse++ .... Is there a compatible parser that
> > I can use?
>
> There are four potentially compatible parsers that I am aware of
> which you might try (that are likely to have C# support):
> DMS
> Yacc++
> GrammarForge (previously meta-S)
> ANTLR
>
> As to Yacc++, the C# support just recently emerged from alpha-test and
> has not yet entered beta-test.
>
> > All my attempts to contact them were unsuccessful.
>
> As to the business end, the parser generator business is not high
> revenue. As a result, most companies are quite small (perhaps not
> even having any full-time staff) or have a different business that
> they are actually in and the parser generator work is merely a way to
> leverage an asset that they happen to have (perhaps to support their
> "real" work). The downside, as you have noticed, is that many such
> companies are very hard to contact. The upside is that if you can get
> in touch with them, you are probably not that far removed from the
> developer(s), and if you need something fixed, the person who will fix
> it will probably see your problem first hand.


Hello, Chris. For some reason, I still can't send you private emails
-- they always bounce. That said, however, my response to your post in
comp.compilers is of a general enough nature that I'll post it
publicly.


Yes, the Grammar Forge does support C# (and the .NET languages in
general) through COM interoperability. An example of this follows:


>> BEGIN C# SLICE


public bool LuaDemo()
{
      MetaXGrammar gmr = new MetaXGrammar();
      gmr.LuaCallbacks = new LuaEvents(Console.Out, Console.Error);
      gmr.SetLuaDebug(true, false, false);

      // The $-grammar below carries an embedded Lua semantic action
      // (c_event) that fires when production c is tried and records
      // its lexeme when it matches.
      gmr.Grammar =
            "grammar LuaTest host Lua {" +
            " S ::= a b c;" +
            " a ::= '[0-9]+';" +
            " b ::= '[a-z]+';" +
            " c ::= '[0-9]+';" +
            " @function_impls = :{" +
            " the_string = '';\n" +
            " function c_event (N)\n" +
            " if (N:MATCHED()) then\n" +
            " the_string=N:LEXEME();\n" +
            " print('Lua variable: ' .. the_string);\n" +
            " end\n" +
            " end\n" +
            " }:;" +
            "};";

      if (gmr.IsGood)
      {
            Console.Write("Grammar \"" + gmr.GrammarName + "\" compiled!\n\n");

            // Scan finds an S somewhere within the input rather than
            // requiring the whole input to be an S.
            IMetaXScanInfo r = gmr.Scan("12345 abc 67890 123abc456");

            if (r.MatchLength > 0)
            {
                  // The Lua action left the lexeme of the last c match
                  // in the Lua variable the_string.
                  if (gmr.get_LuaVariable("the_string") == "456")
                  {
                        Console.Write("Lexeme = \"" + r.Lexeme + "\"\n");
                        Console.Write(gmr.Root.get_ChildByPath("S/b").Lexeme + "\n");

                        IMetaXMatcher m = gmr.get_Matcher("b");

                        if (m != null)
                        {
                              Console.Write(m.RuleName + " = " + m.ExpandedRule + "\n");
                              return true;
                        }
                  }
            }
      }
      else
      {
            Console.Write("Error: Grammar could not be compiled!\n");
      }

      return false;
}


<< END C# SLICE


That's actually more than just a C# interop example -- the $-grammar
also has an embedded Lua semantic action that prints out (to the C#
console) whatever matches the production c, and C# code that fetches
back the value the Lua action placed into the Lua engine's the_string
variable.


Also of note is the fact that the grammars themselves are compiled
into intermediate form at run-time, and can therefore be loaded from
resources, files on disk, or whatever.
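For instance, loading the grammar source from disk is just a matter of
reading the text and assigning it to the Grammar property. Something
along these lines (a minimal sketch in the style of the slice above;
the file path and method are only for illustration):


>> BEGIN C# SLICE


public bool LoadGrammarFromDisk(string path)
{
      MetaXGrammar gmr = new MetaXGrammar();

      // The grammar source is just text; it is compiled to the engine's
      // intermediate form when assigned to the Grammar property.
      gmr.Grammar = System.IO.File.ReadAllText(path);

      if (!gmr.IsGood)
      {
            Console.Write("Error: grammar in \"" + path + "\" could not be compiled!\n");
            return false;
      }

      Console.Write("Grammar \"" + gmr.GrammarName + "\" compiled from disk.\n");
      return true;
}


<< END C# SLICE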


One final note about the above example -- the engine can "accept" or
"scan". (I use "scan" as distinct from "accept" to mean "find as a
subset within the input, rather than necessarily accept the whole
input".) This is what makes the grammar like a giant pattern finder,
as opposed to just an accepting parser.


(Some have asked about this capability before -- it's not quite the
same as just putting .* in the first production of a grammar, as it
has its own properties.)
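

To make the accept/scan distinction concrete, here is a rough slice in
the same style as the one above, with the Lua plumbing dropped and the
exact lexeme reported glossed over:


>> BEGIN C# SLICE


public bool ScanDemo()
{
      MetaXGrammar gmr = new MetaXGrammar();

      // Same productions as LuaDemo, without the Lua semantic action.
      gmr.Grammar =
            "grammar ScanTest {" +
            " S ::= a b c;" +
            " a ::= '[0-9]+';" +
            " b ::= '[a-z]+';" +
            " c ::= '[0-9]+';" +
            "};";

      if (!gmr.IsGood)
      {
            return false;
      }

      // The input as a whole is not an S, but an S ("123abc456") is
      // embedded in it; a scan finds it rather than rejecting the input.
      IMetaXScanInfo r = gmr.Scan("!!! noise !!! 123abc456 !!! noise !!!");

      if (r.MatchLength > 0)
      {
            Console.Write("Found: \"" + r.Lexeme + "\"\n");
            return true;
      }

      return false;
}


<< END C# SLICE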


And now for the "business end".


Yes, it has been my experience that the business end of parser
generators is influenced by certain common characteristics:


1. Academic-strength parser generators tend to be handed off once the
academics who developed them no longer have a pressing academic need
for them; they are driven by how much publication can be squeezed out
of them. They are often built on whatever host language was convenient
at the time of development -- convenient in the sense that certain
host languages may not produce the fastest engines but allow for quick
development. In an academic setting, parsing speed may not be the most
important driving factor.


2. Commercial-strength parser generators tend to be subsidized by
contracting work rather than purely by sales of the engine. This ties
the generator's developers up in other pressing obligations so that
they can maintain an income, and the generator itself becomes more of
a side effect of the whole endeavor.


3. Support for parser generators can be highly specialized. People
with the technical expertise to support a C# grammar on demand are few
and far between, and can get very busy.


4. One-fee licensing does not generate sufficient revenue to support
product development. Without other revenue streams, the main
technology cannot move forward.


This has resulted in a situation, in my opinion, where theoretically
superior technologies that use advanced language concepts never make
it into commercial-strength development. People know that
Turing-powerful formalisms exist, but very little of the research done
in that field ends up being harnessed in commercial products.
Advancements are co-opted into technologies in a somewhat ad hoc
fashion, without sufficient theoretical analysis of their impact on a
given formalism. For instance: what is the formal power of an LALR(k)
parser with non-length-increasing predicates? Has it been formally
proven? What is the average-case time complexity of such enhancements?
Under which situations do such enhancements break the verifiability of
LALR(k)? What is the formal machine model employed to effect a parse
against such grammars?


In my own research, I have tried to avoid announcing features that I
have not proven in some way to have justifiable consequences. There
are still features of $-grammars that are unproven (scoping being one
of them). They greatly assist in parsing and in writing compact
$-grammars for such things as C++'s enums, but they have unknown
issues to date, and so are not formally discussed. These are not
necessary or "lynchpin" features, however -- anyone wishing to use
them gets the power, with a bit of uncertainty about their formal
soundness until I prove that soundness and its impact.


Parsing is widely considered to be, in many ways, a "solved
problem". I believe John Levine would agree that this is the general
consensus. New theory still comes out, but in many areas, that new
theory has horrendous time complexity implications. Efficiency
considerations are paramount in commercial-strength engines -- O(n^6)
in the average case is totally unacceptable, whereas O(n^m) for m < 3
is probably acceptable, if that is constrained to a small subset of
productions for some reasonable length of input.
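

(To put rough numbers on that: for an input of 10,000 tokens, an
O(n^6) average case is on the order of 10^24 basic operations, whereas
O(n^2) is on the order of 10^8 -- the difference between intractable
and routine.)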


But the marriage of new advancements AND commercially driven needs is
difficult. Speed and memory are still big issues to commercial
interests -- so people are often best served (at least in their own
analysis) by proven but older models with some ad hockery in the
reduction code. There is a large body of non-theorists who know how to
write such parsers, and so users are not tied to specific vendors who
may or may not have the resources to respond to specific needs. After
all -- if it can't be handled in the grammar, handle it in the
code. There are far more generalists than specialists. What this
results in, however, is a situation where fancy cars have old engines,
and fancy, better engines sometimes sit in the garage with no cars
around them.


All of that said -- although in a recent post I mentioned that the
Grammar Forge has been acquired by Thothic Technology Partners, LLC, I
will add that I have not abandoned parsing or grammar theory. The
particulars of the sale agreement are under seal, but I can say that I
am free to pursue academic research in the field of the
$-Calculus. This is obviously for the best, since it leaves me free to
pursue theoretical and technological advancements that may benefit
Meta-S theory and also the tool.


In just this vein, I am currently summarizing the body of knowledge I
have about Meta-S advancements in the form of a higher doctorate
dissertation. Free of business considerations, I can further the
theory in a methodical fashion. It is also definitely one way to
encourage me to write that book about parsing I've been saying I'm
going to write for 6 years now. ;-)


--
Chev. Quinn Tyler Jackson
Computer Scientist, Novelist, Poet


http://members.shaw.ca/qjackson/


