Was: Parallel compilers ... DMS and PARLANSE as parallel compiler

"Ira Baxter" <idbaxter@semdesigns.com>
Sat, 5 Sep 2009 14:14:57 -0500

          From comp.compilers

Related articles
Parallel compilers darkbard@gmail.com (Gabriele Farina) (2009-09-04)
Was: Parallel compilers ... DMS and PARLANSE as parallel compiler idbaxter@semdesigns.com (Ira Baxter) (2009-09-05)
| List of all articles for this month |
From: "Ira Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: Sat, 5 Sep 2009 14:14:57 -0500
Organization: Compilers Central
References: 09-09-032
Keywords: parallel
Posted-Date: 06 Sep 2009 02:12:30 EDT

"Gabriele Farina" <darkbard@gmail.com> wrote in message


> I'd like to know if someone has some information to give me about
> implementing a compiler that's able to split the compilation process
> into parallel steps, to fully take advantage of
> multicore/multiprocessor machines.


> I don't have any knowledge about that and I didn't find any
> interesting reading material, so any advice would be really appreciated.


The DMS Software Reengineering Toolkit is a general-purpose program
transformation system used to analyze and transform programs. The entire
system is coded in a parallel programming language,
PARLANSE. (http://www.semanticdesigns.com/Products/PARLANSE).
PARLANSE compiles partial-order parallelism down to native x86 machine
code.


DMS isn't a compiler by itself, but it could be used to build one.
It has been used to build some pretty interesting translators (JOVIAL
to C for the B-2 bomber, see
http://www.semanticdesigns.com/Products/Services/NorthropGrummanB2.html)
and a wide variety of program analysis and modification tools. (See
the www.semanticdesigns.com website in general; for automated
rearchitecting of a large-scale C++ avionics system, see
http://linkinghub.elsevier.com/retrieve/pii/S0950584906001856.)


DMS was designed to work with big systems of source code (we've
handled single systems of 35 million lines of C code), and processing
something that big is just plain expensive. Thus you want lots of
cycles brought to the problem, and hence the excuse for parallel
foundations.


One obvious application of parallelism is the embarrassingly parallel
initial processing of individual compilation units, which we usually
do. When you have 18,000 compilation units, throwing 8-16 CPUs
against them is a big win. But this isn't particularly hard to
organize. [The moderator has already noted that modern versions of
Make do this]. What is hard is composing the results of all that
initial processing in a useful way.
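

To make that first step concrete, here is a minimal sketch of farming
independent compilation units out to worker threads. It is written in
C++ with std::async rather than PARLANSE, and parseUnit and the AST
type are hypothetical stand-ins, not DMS code:

// Sketch only: farm independent compilation units out to worker threads.
// parseUnit() and AST are invented placeholders.
#include <future>
#include <string>
#include <vector>

struct AST { /* ... */ };

AST parseUnit(const std::string &path) {
    // lex + parse + build the AST for one compilation unit
    return AST{};
}

std::vector<AST> parseAll(const std::vector<std::string> &units) {
    std::vector<std::future<AST>> jobs;
    for (const auto &u : units)
        jobs.push_back(std::async(std::launch::async, parseUnit, u));
    std::vector<AST> results;
    for (auto &j : jobs)
        results.push_back(j.get());  // composing these results is the hard part
    return results;
}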


Much of DMS looks like conventional compiler technology: lexing,
parsing, AST building, symbol table building, control and data flow
analysis. Parts are unconventional: AST to text pretty printing,
attribute-grammar evaluation, source-to-source transformations.


We see all this as "symbolic computation", i.e., we're not doing
arithmetic, but rather manipulating formulas and graph structures,
often very big ones. That suggests there are lots of opportunities
for parallelism in doing local graph-like computations, and PARLANSE
is an attempt to take advantage of that. It does so by offering
computation grains with extremely low context-switching overhead that
can be scheduled for execution as parallel partial orders. The low
overhead makes it more likely that one can find a small bit of code
that is worth scheduling.
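

PARLANSE has its own notation for those grains and their orderings;
as a rough, non-PARLANSE analogue, here is a tiny partial order
expressed in C++ with futures: b and c depend only on a, and d depends
on both b and c, so b and c may run concurrently. The grain bodies
are made up purely for illustration:

// Partial-order sketch: a precedes {b, c}; d follows both.
#include <future>
#include <iostream>

int main() {
    auto a = std::async(std::launch::async, [] { return 1; });
    int av = a.get();                     // a must complete first
    auto b = std::async(std::launch::async, [av] { return av + 1; });
    auto c = std::async(std::launch::async, [av] { return av * 2; });
    int d = b.get() + c.get();            // d waits on both b and c
    std::cout << d << "\n";
    return 0;
}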


One key place we use parallelism is in attribute grammar evaluation.
AGs specify synthesized and inherited computations on a
per-grammar-rule basis. Because AGs are functional (DMS's can be,
but it also allows side effects), doing a data flow analysis over the
AG computations for a rule is technically easy. We use DMS (gee, a
program analysis system!) to do this analysis and synthesize a
parallel partial order for processing an AST node in PARLANSE :-}
When this gets executed later, as children are forked, the partial
orders for the child nodes become available. The amount of apparent
parallelism in doing AG evaluation over a tree is pretty amazing.
This works fairly well; we still have to tune the partial orders to
control some of the overhead. So DMS generates parallel evaluators
for any AG computation. It is interesting to note that building
symbol tables can be cast as AG computations over a parse tree, and
we invariably implement it that way, so symbol table construction is
inherently parallel with DMS.
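

As a hedged illustration of the idea (plain C++ again, with an
invented Node type, not a generated PARLANSE evaluator), a purely
synthesized attribute can be evaluated bottom-up by forking the
children of each node; a real generated evaluator would instead
follow the partial order derived from each rule's attribute
dependences rather than this naive fork-everything scheme:

// Sketch: evaluate a synthesized attribute by forking children in parallel.
#include <future>
#include <memory>
#include <vector>

struct Node {
    std::vector<std::unique_ptr<Node>> children;
    int local = 1;   // stand-in for per-node data
};

int synthesize(const Node &n) {
    std::vector<std::future<int>> kids;
    for (const auto &c : n.children)
        kids.push_back(std::async(std::launch::async, synthesize, std::cref(*c)));
    int acc = n.local;
    for (auto &k : kids)
        acc += k.get();  // combine the children's synthesized values
    return acc;
}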


One of the more interesting applications of PARLANSE is in doing
system-scale parsing with full name resolution of Java applications.
There DMS may be told to analyze a large Java application (e.g., build
a call graph), which is has to do by reading all the Java sources. It
does this by starting with some designated Java files as roots, which
in turn include packages, .... spreading out into a graph of
cross-connected Java source modules. The name resolver for any
particular Java file waits for the parse tree to be constructed
(that's trivial), constructs the symbol spaces recursively for that
file as futures, and then fills in the symbol table entries. This
allows a package reference to start parsing as soon as the package
phrase is seen, and allows name resolution in one file to act in
dynamical parallelism with parsing and name resolution in other files.
[This is all on top of the inherent parallelism in the construction of
individual symbol spaces]. We've seen linear speedups of parsing
large Java systems with this.
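

A rough sketch of the futures idea, again in C++ with invented names
(SymbolSpace, demand, and parseAndResolve are hypothetical stand-ins,
not the DMS API): each source file gets one shared future, launched
the first time anyone asks for it, so name resolution in one file can
overlap parsing and resolution in the files it references:

// Sketch: demand-driven, futures-based parsing and name resolution.
#include <future>
#include <map>
#include <mutex>
#include <string>

struct SymbolSpace { /* ... */ };

std::map<std::string, std::shared_future<SymbolSpace>> inFlight;
std::mutex tableLock;

std::shared_future<SymbolSpace> demand(const std::string &file);

SymbolSpace parseAndResolve(const std::string &file) {
    SymbolSpace symbols;
    // ... parse `file`; when a reference to `other` is seen:
    //       auto dep = demand(other);  // starts its parse if nobody has yet
    //       dep.get();                 // wait only when its names are needed
    return symbols;
}

// Return the future for a file, launching its parse on first request.
std::shared_future<SymbolSpace> demand(const std::string &file) {
    std::lock_guard<std::mutex> g(tableLock);
    auto it = inFlight.find(file);
    if (it != inFlight.end()) return it->second;
    auto fut = std::async(std::launch::async, parseAndResolve, file).share();
    inFlight.emplace(file, fut);
    return fut;
}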


We are presently doing points-to analysis to enhance other custom
analyses of interest to our customers. Doing this serially for 35
million lines of C takes literally days on a modern CPU. We've done
some experimental work on parallelizing that. To date we have not
achieved good results; the fine-grain computations interact too much.
We have some ideas about how to fix this and continue to work on it.
Here's a nice research project for some university.


Once you've done the generic compiler front-end stuff, there's the
opportunity to use parallelism for whatever custom application you
care to do. One of our COTS tools, CloneDR, is a duplicated-code
detector (www.semanticdesigns.com/Products/Clone) that finds clones
by matching syntax trees
(http://www.semanticdesigns.com/Company/Publications/ICSM98.pdf).
CloneDR uses parallelism to match the many sets of possible clones.
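

The ICSM98 paper above describes the real algorithm; purely as a
loose illustration (C++, with Tree, structuralHash, and similarEnough
as invented stand-ins), candidate subtrees can be bucketed by a
structural hash and each bucket compared as an independent parallel
task:

// Sketch: hash subtrees into buckets, compare within each bucket in parallel.
#include <cstddef>
#include <future>
#include <map>
#include <utility>
#include <vector>

struct Tree { int kind = 0; std::vector<Tree> kids; };

std::size_t structuralHash(const Tree &t) {
    std::size_t h = static_cast<std::size_t>(t.kind);
    for (const Tree &k : t.kids) h = h * 31 + structuralHash(k);
    return h;
}

bool similarEnough(const Tree &a, const Tree &b) {
    return structuralHash(a) == structuralHash(b);  // placeholder similarity test
}

using Pair = std::pair<const Tree *, const Tree *>;

std::vector<Pair> findClones(const std::vector<const Tree *> &subtrees) {
    std::map<std::size_t, std::vector<const Tree *>> buckets;
    for (const Tree *t : subtrees) buckets[structuralHash(*t)].push_back(t);

    std::vector<std::future<std::vector<Pair>>> jobs;
    for (const auto &b : buckets)
        jobs.push_back(std::async(std::launch::async, [bucket = b.second] {
            std::vector<Pair> found;
            for (std::size_t i = 0; i < bucket.size(); ++i)
                for (std::size_t j = i + 1; j < bucket.size(); ++j)
                    if (similarEnough(*bucket[i], *bucket[j]))
                        found.emplace_back(bucket[i], bucket[j]);
            return found;
        }));

    std::vector<Pair> clones;
    for (auto &j : jobs)
        for (const Pair &p : j.get()) clones.push_back(p);
    return clones;
}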


A paper describing DMS with some discussion of how PARLANSE fits in
can be found at
http://www.semanticdesigns.com/Company/Publications/DMS-for-ICSE2004-reprint.pdf


More detailed discussions of various applications of PARLANSE in
compiler-like tasks can be found at
http://www.semanticdesigns.com/Company/Publications/parallelism-in-symbolic-computation.pdf


Putting PARLANSE together as a programming language to provide a
robust foundation has been a colossal headache, as you can imagine.
We designed it back in 1995 (with the explicit intention of using it
to code DMS, which was then just a figment of our arguably insane
imagination), and have implemented all of DMS in it: some 500K SLOC
of hand-written code, and mountains (I'd guess 5 MSLOC) of
DMS-generates-itself code, all with that partial-order parallelism.
We've been battling the problem of making DMS useful for the last 10
years [building a full C++-capable parser, let alone a transformer,
is wayyyy tough] and have consequently put less emphasis on
extracting all the possible parallelism with PARLANSE. Now that 8-
and 16-way CPUs are street-available, we'll be expending more energy
on tuning this. We do see some interesting speedups. We still have
a lot to do. But the tools are there, and the DMS system
architecture is wrapped around the notion that parallelism is
fundamentally important.


A place we're trying to go is whole-program compilation, following the
simple idea that you can burn lots of (available!) cycles to do
whole-program analysis and then do good code generation.


A moral from all this: if you want a big, parallel infrastructure
for a compiler, you had better have started a long time ago, because
big infrastructures by themselves are really hard to build, and you
need to have taken parallelism into account from the beginning. It's
very hard to believe you can take an existing big infrastructure
(say, the SGI compiler) and post-parallelize it. There are virtually
no examples of complex systems that were implemented as serial code
and successfully post-parallelized.


Maybe the functional language guys have a chance to build parallel
compilers. But they still have to build all the compiler machinery,
and they still have to deal with real languages (e.g., C++). I don't
think this is likely to happen anytime soon.


--
Ira Baxter, CTO
www.semanticdesigns.com

