Related articles |
---|
Control Flow Analysis for Languages with Exceptions (e.g. Java) owong@castortech.com (Oliver Wong) (2006-11-04) |
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-11-05) |
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) ajonospam@andrew.cmu.edu (Arthur J. O'Dwyer) (2006-11-05) |
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) int2k@gmx.net (Wolfram Fenske) (2006-11-05) |
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) usenet@gmx.info (Michael Klemm) (2006-11-15) |
From: | "Wolfram Fenske" <int2k@gmx.net> |
Newsgroups: | comp.compilers |
Date: | 5 Nov 2006 23:36:54 -0500 |
Organization: | Compilers Central |
References: | 06-11-017 |
Keywords: | analysis, errors |
Posted-Date: | 05 Nov 2006 23:36:54 EST |
"Oliver Wong" <owong@castortech.com> writes:
[...]
> It looks like the literature all agrees that the first step to
> CFA, once you have the AST, is to determine what the basic blocks are
> for your code. From what I understand a basic block is a sequence of
> instructions such that if any one of those instructions are executed,
> then all the instructions in that sequence are executed.
Yes.
> In the case of Java (and probably other languages with exception,
> though I don't have experience with any of them), every instruction
> would be its own basic block, because an exception could potentially
> be thrown at any point, thus guaranteeing that there does not exist
> any pair of points such that you could be certain that the second
> instruction would execute, given that the first was executed.
This is where you make a mistake. Exceptions cannot potentially be
thrown at any point. Consider this example:
--8<---------------cut here---------------start------------->8---
c = a + b;
d = a * (a + b);
--8<---------------cut here---------------end--------------->8---
There is no way an exception could be thrown in this block, at least
not in Java. In a dynamically typed language, a type error might be
raised, though.
> In particular, Java has an Throwable java.lang.ThreadDeath which
> is "thrown in the victim thread when the stop method with zero
> arguments in class Thread is called"
> (http://java.sun.com/j2se/1.4.2/docs/api/java/lang/ThreadDeath.html)
>
> It seems pretty pointless to have every instruction be in its own
> basic block, so I was wondering how can the basic-block system be
> reconciled with languages which support Exceptions?
I don't see how the possibility that an exception could be thrown is
such a big problem. Let's assume that a type error was raised in the
first statement in the example above. How does that affect the other
statement? The control flow leaves the block anyway, "d = a * (a +
b);" just never gets computed. I could still optimize this code into
--8<---------------cut here---------------start------------->8---
c = a + b;
d = a * c;
--8<---------------cut here---------------end--------------->8---
Hm, Wikipedia says [1]:
--8<---------------cut here---------------start------------->8---
Instructions that end a basic block include
[...]
* Instructions which may throw an exception
* Function calls can be at the end of a basic block if they may
not return, such as functions which throw exceptions or special
calls like C's longjmp and exit.
Instructions which begin a new basic block include
[...]
* Instructions following ones that throw exceptions
* Exception handlers
--8<---------------cut here---------------end--------------->8---
I guess I need an example to see how exceptions complicate data flow
and
control flow analysis. The only scenarios I can come up with are
these:
--8<---------------cut here---------------start------------->8---
// No exception handler
c = i_might_throw_an_exception();
d = a * c;
e = d + 1; // If an exception occurs in "i_might_throw_an_exception"
// we never get here, anyway. I. e. we get here after
// assigning the value of "a * c" to "d" or not at all.
// Exception handler
try { // B1
c = i_might_throw_an_exception();
d = a * c;
} except (FooError fe) { // B2
d = 1;
}
e = d + 1; // This statement can be reached either from basic block B1
// or B2 (or not at all if another kind of error occured).
// So?
--8<---------------cut here---------------end--------------->8---
I'm somewhat of a compiler newbie myself, so I might be wrong. Maybe
somebody else here can help out.
> As a bonus question: A lot of my interest in this is due to my
> trying to understand how the control flow analysis for the FindBugs
> software works (http://sourceforge.net/projects/findbugs). FindBugs
> defines a class called Location, and puts in the documentation:
>
> <quote>
> Because of JSR subroutines, the same instruction may actually
> be part of multiple basic blocks (with different facts
> true in each, due to calling context)
> </quote>
I guess in order to understand that, one has to know what JSR
subroutines are. I don't, so I have no clue what they mean.
Wolfram
Footnotes:
[1] <http://en.wikipedia.org/wiki/Basic_block>
Return to the
comp.compilers page.
Search the
comp.compilers archives again.