Re: Control Flow Analysis for Languages with Exceptions (e.g. Java)

"Wolfram Fenske" <int2k@gmx.net>
5 Nov 2006 23:36:54 -0500

          From comp.compilers

Related articles
Control Flow Analysis for Languages with Exceptions (e.g. Java) owong@castortech.com (Oliver Wong) (2006-11-04)
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-11-05)
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) ajonospam@andrew.cmu.edu (Arthur J. O'Dwyer) (2006-11-05)
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) int2k@gmx.net (Wolfram Fenske) (2006-11-05)
Re: Control Flow Analysis for Languages with Exceptions (e.g. Java) usenet@gmx.info (Michael Klemm) (2006-11-15)
| List of all articles for this month |

From: "Wolfram Fenske" <int2k@gmx.net>
Newsgroups: comp.compilers
Date: 5 Nov 2006 23:36:54 -0500
Organization: Compilers Central
References: 06-11-017
Keywords: analysis, errors
Posted-Date: 05 Nov 2006 23:36:54 EST

"Oliver Wong" <owong@castortech.com> writes:




[...]


> It looks like the literature all agrees that the first step to
> CFA, once you have the AST, is to determine what the basic blocks are
> for your code. From what I understand a basic block is a sequence of
> instructions such that if any one of those instructions are executed,
> then all the instructions in that sequence are executed.


Yes.


> In the case of Java (and probably other languages with exception,
> though I don't have experience with any of them), every instruction
> would be its own basic block, because an exception could potentially
> be thrown at any point, thus guaranteeing that there does not exist
> any pair of points such that you could be certain that the second
> instruction would execute, given that the first was executed.


This is where you make a mistake. Exceptions cannot potentially be
thrown at any point. Consider this example:


--8<---------------cut here---------------start------------->8---
c = a + b;
d = a * (a + b);
--8<---------------cut here---------------end--------------->8---


There is no way an exception could be thrown in this block, at least
not in Java. In a dynamically typed language, a type error might be
raised, though.


> In particular, Java has an Throwable java.lang.ThreadDeath which
> is "thrown in the victim thread when the stop method with zero
> arguments in class Thread is called"
> (http://java.sun.com/j2se/1.4.2/docs/api/java/lang/ThreadDeath.html)
>
> It seems pretty pointless to have every instruction be in its own
> basic block, so I was wondering how can the basic-block system be
> reconciled with languages which support Exceptions?


I don't see how the possibility that an exception could be thrown is
such a big problem. Let's assume that a type error was raised in the
first statement in the example above. How does that affect the other
statement? The control flow leaves the block anyway, "d = a * (a +
b);" just never gets computed. I could still optimize this code into


--8<---------------cut here---------------start------------->8---
c = a + b;
d = a * c;
--8<---------------cut here---------------end--------------->8---


Hm, Wikipedia says [1]:


--8<---------------cut here---------------start------------->8---
Instructions that end a basic block include


        [...]
        * Instructions which may throw an exception
        * Function calls can be at the end of a basic block if they may
            not return, such as functions which throw exceptions or special
            calls like C's longjmp and exit.


Instructions which begin a new basic block include


        [...]
        * Instructions following ones that throw exceptions
        * Exception handlers
--8<---------------cut here---------------end--------------->8---


I guess I need an example to see how exceptions complicate data flow
and
control flow analysis. The only scenarios I can come up with are
these:


--8<---------------cut here---------------start------------->8---
// No exception handler
c = i_might_throw_an_exception();
d = a * c;
e = d + 1; // If an exception occurs in "i_might_throw_an_exception"
                      // we never get here, anyway. I. e. we get here after
                      // assigning the value of "a * c" to "d" or not at all.


// Exception handler
try { // B1
    c = i_might_throw_an_exception();
    d = a * c;
} except (FooError fe) { // B2
    d = 1;
}
e = d + 1; // This statement can be reached either from basic block B1
                      // or B2 (or not at all if another kind of error occured).
                      // So?
--8<---------------cut here---------------end--------------->8---


I'm somewhat of a compiler newbie myself, so I might be wrong. Maybe
somebody else here can help out.


> As a bonus question: A lot of my interest in this is due to my
> trying to understand how the control flow analysis for the FindBugs
> software works (http://sourceforge.net/projects/findbugs). FindBugs
> defines a class called Location, and puts in the documentation:
>
> <quote>
> Because of JSR subroutines, the same instruction may actually
> be part of multiple basic blocks (with different facts
> true in each, due to calling context)
> </quote>


I guess in order to understand that, one has to know what JSR
subroutines are. I don't, so I have no clue what they mean.




Wolfram


Footnotes:
[1] <http://en.wikipedia.org/wiki/Basic_block>


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.