Re: [QUERY] Incremental dependencies

"Jan Gray" <jsgray@acm.org>
14 Jan 1997 20:12:13 -0500

          From comp.compilers

Related articles
[QUERY] Incremental dependencies jlilley@empathy.com (1997-01-09)
Re: [QUERY] Incremental dependencies dlmoore@ix.netcom.com (David L Moore) (1997-01-12)
Re: [QUERY] Incremental dependencies jsgray@acm.org (Jan Gray) (1997-01-14)
Re: [QUERY] Incremental dependencies lat@hpmoose6.cern.ch (Lassi Tuura) (1997-01-15)

From: "Jan Gray" <jsgray@acm.org>
Newsgroups: comp.compilers
Date: 14 Jan 1997 20:12:13 -0500
Organization: Netcom
References: 97-01-056
Keywords: parse, performance

John Lilley <jlilley@empathy.com> wrote
> I've been pondering an issue that should be of practical importance to
> compiler writers, but seems to be implemented minimally in
> commercial compilers (C++ compilers are my only data set).
>
> So I wonder: do any compilers/parsers, commercial or otherwise, make
> use of incremental or minimal dependency calculations


Sure, lots, but due to such delights as include-site context sensitive
header files, not to mention language spec churn, the C++ guys still
have a ways to go.


It is not well known, but Microsoft built an experimental incremental
C compiler called C# in 1988. It kept its state in a custom
object-oriented database and did fine grained dependency analysis,
incremental recompilation, and incremental linking. C# was not
shipped but it had a profound influence on subsequent products.


> Given, this is incredibly difficult, especially when scoping,
> overloading, and the preprocessor are thrown in. I have seen
> commercial products that will try one or both of:
> 1) Precompiled headers.
> 2) Skipping recompiles for a file under certain cases (MSVC++).


I designed or codesigned several such VC++ features. These include:
1. precompiled headers (C/C++7.0)
2. program database and incremental debug info update (VC++ 1.0)
3. incremental linking (VC++ 2.0)
4. incremental recompilation (VC++ 4.0)
5. "minimal rebuild" (VC++ 4.0)


Combined, these features often achieve <5s rebuilds on substantial
applications.


1. PCH is indispensable. MFC app sources typically include 100,000 lines
of headers.
2. The program database stores symbols, types, and more, and enables
features 3-5.
3. For our needs, incremental linking implied link-time incremental debug
info update.


4. Incremental compilation is incremental in both the parsing and code
gen phases, at the granularity of a function body. After the initial
compile, if some function is edited, the recompile skips
parsing+semantic analysis of the other, unchanged function bodies.
Some unchanged functions may be reparsed for good reasons. The
persistent intermediate rep is incrementally updated. Functions with
changed IR are recodegen'd. The object module is patched with new
code and symbols, and the program database is updated. This feature
provides a several times speedup on large compilands or when compiling
with optimization enabled.


5. Whereas VC++'s incremental compilation speeds up rebuilds of edited
source files, minimal rebuild speeds up rebuilds of sets of source
files after a header file edit. During a previous build, the compiler
noted just *how* each file depends upon its headers. Then, as the
header is recompiled in the context of one of its source files which
included it, the compiler determines *what* changed. It then skips
recompilation of any other unchanged source file which could not
depend upon the changes. Tricky.
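
The VC++ details were never published, so the following is only a toy
sketch of the idea in item 5, not the actual implementation. It
assumes the previous build recorded, per source file, the set of
header-declared names that file uses (the *how*), and that the header
can be reduced to a table mapping each name to its declaration text
(used to work out *what* changed). The names `changed_names`,
`sources_to_rebuild`, and the sample files are all hypothetical.

```python
def changed_names(old_decls, new_decls):
    """Return names that were added, removed, or whose declaration differs."""
    names = set(old_decls) | set(new_decls)
    return {n for n in names if old_decls.get(n) != new_decls.get(n)}


def sources_to_rebuild(uses, old_decls, new_decls):
    """Return the source files whose recorded uses touch a changed name.

    uses: source file name -> set of header names it depended on
          (recorded during the previous build; an assumption of this sketch).
    """
    changed = changed_names(old_decls, new_decls)
    return sorted(src for src, used in uses.items() if used & changed)


# Hypothetical header contents before and after an edit.
old_header = {
    "Point": "struct Point { int x, y; };",
    "area": "int area(int w, int h);",
}
new_header = {
    "Point": "struct Point { int x, y; };",
    "area": "long area(long w, long h);",
    "perimeter": "long perimeter(long w, long h);",
}

# Hypothetical dependency records from the previous build.
uses = {
    "draw.cpp": {"Point"},
    "layout.cpp": {"Point", "area"},
    "main.cpp": {"area"},
}

print(sources_to_rebuild(uses, old_header, new_header))
```

Here draw.cpp is skipped because it only uses `Point`, which did not
change. A real C++ compiler has to be far more careful than this: a
newly added declaration can change overload resolution or name lookup
in a file that never mentioned it before, which is part of what makes
the real feature tricky.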


Yes, compiling C++ is hard enough without having to figure out how to
skip 90% of the compilation, especially in the presence of myriad
features which have various effects upon both persistent and ephemeral
compiler state. Even with important simplifying assumptions, this was
complex and meticulous work, difficult both to build and to test.


Also, when building incremental *anything*, it is imperative that you
choose the appropriate granularity of change to handle incrementally.
For C/C++, I believe that file granularity is too coarse and that
statement level is too fine grained. Too many systems "in the labs"
did not scale up in the real world because they were too clever and
did too much work keeping too much state, trying too hard to avoid
relatively inexpensive operations such as reparsing a single function
or class declaration.
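
To make the granularity point concrete, here is a minimal sketch of
change detection at function-body granularity. It assumes the source
has already been split into named function bodies, and it keeps only
one fingerprint and one compiled artifact per function. The names
`fingerprint` and `incremental_compile`, and the stand-in `compile_fn`
callback, are hypothetical; this is not how VC++ did it.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of one function body's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_compile(functions, cache, compile_fn):
    """Recompile only the functions whose source text changed.

    functions:  function name -> source text of its body
    cache:      function name -> (fingerprint, artifact); updated in place
    compile_fn: stand-in for parsing + code generation of one body
    Returns the list of function names that were recompiled.
    """
    recompiled = []
    for name, body in functions.items():
        fp = fingerprint(body)
        entry = cache.get(name)
        if entry is None or entry[0] != fp:
            cache[name] = (fp, compile_fn(body))
            recompiled.append(name)
    # Drop cached artifacts for functions that no longer exist.
    for name in list(cache):
        if name not in functions:
            del cache[name]
    return recompiled


cache = {}
source = {"f": "return 1;", "g": "return 2;"}
print(incremental_compile(source, cache, str.upper))  # first build: both

source["g"] = "return 3;"
print(incremental_compile(source, cache, str.upper))  # only g is redone
```

The state kept per function is tiny, and the fallback for anything
that changed is simply to redo that one function from scratch. The
sketch ignores dependencies between functions and on declarations, so
it never reparses an unchanged function, which a real compiler
sometimes must.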


Sorry, none of the details have been published anywhere.


Jan Gray // former VC++ developer, now on "Viper"
[Robust Scaleable Application Servers Made Easy:
www.microsoft.com/transaction]
--

