Re: Automatic Structuring of Unstructured Programs (for C)

adegrey@abcl.co.uk (Aubrey de Grey)
Tue, 5 May 1992 09:46:05 GMT

          From comp.compilers

Related articles
Automatic Structuring of Unstructured Programs (for C) hendren@lucy.cs.mcgill.ca (1992-05-04)
Re: Automatic Structuring of Unstructured Programs (for C) adegrey@abcl.co.uk (1992-05-05)
Re: Automatic Structuring of Unstructured Programs (for C) moss@cs.umass.edu (1992-05-05)
| List of all articles for this month |

Newsgroups: comp.compilers
From: adegrey@abcl.co.uk (Aubrey de Grey)
Keywords: C, optimize
Organization: EO Computer Limited, Cambridge, UK
References: 92-05-023
Date: Tue, 5 May 1992 09:46:05 GMT

Here's a summary of a method I developed for this problem, for use with a
verification system. I think it answers your question.


Consider a single-entry, single-exit code segment (hereafter called a
routine), with jumps. It is properly structured iff no pair of jumps are
linked, ie enclose exactly one of each other's ends. Given a linked pair,
the way to unlink them is just to reverse the order of two ends. Yes
really...


Most cases of linked jumps have the jump sources adjacent -- ie the order
of ends is either S1,S2,D1,D2 or D1,S2,S1,D2 or D1,D2,S1,S2 (D for
destination, S for source). The transformation of these consists of
adding one new jump (unlinked to either other) and reversing the order of
S1 and S2, giving (for the last case) D1,D2,S3,D3,S2,S1. The
transformation is as follows:


statement1; <-------
label1; |
statement2; <---- |
label2; | |
statement3; | |
if condition1 then goto label1; >------
statement4; |
if condition2 then goto label2; >---
statement5;




becomes:




statement1;
label1; <--------------------------
statement2; |
label2; <----------------------- |
statement3; | |
flag1 := condition1; | |
if flag1 then goto label3; >--- | |
statement4; | | |
label3; <--- | |
if (condition2 AND NOT flag1) then goto label2; >-- |
if flag1 then goto label1; >-------------------------
statement5;




Hey presto, unlinked. The crucial point is that it doesn't matter where
D1 and D2 are -- exactly the same transformation works on S1,S2,D1,D2 and
on D1,S2,S1,D2. Hereafter this is called a "source-source exchange". It
can be done whenever the sources of two linked jumps are
"spaghetti-consecutive", ie the code between them doesn't include exactly
one end of any third jump.


The only case for which this doesn't work is S1,D2,D1,S2, ie
multiple-entry loops. Here there's no spag-consecutive pair of sources.
So, we exchange something else. Of the three choices (SD,DD or DS) it
turns out that source- destination exchanges are the most straightforward:


b1; If c1 THEN s1; b2; d2; b3 ("b" stands for block, with no jump ends)
                                                                      ("c" stands for jump condition)
becomes:


b1; f1 := c1;
IF f1 THEN s3;
b2;
d3;
d2;
IF f1 THEN s1;
b3


together with an assignment of f1 to FALSE immediately before s2, wherever
that happens to be.


There are pathological cases in which you have linked jumps but no
instance of either of the above. I think the simplest is
s1,d2,s3,d4,s2,d1,s4,d3. In this case, the trick is to do a SD exchange
on UNLINKED jumps, which unlocks the problem. Never do a linking SS swap.


The proof that this always succeeds is pretty easy. Consider the degree
of spaghettification as two numbers, (a) the sum of the numbers of sources
above each destination (ignoring totally unlinked jumps, of course), and
(b) the number of linked pairs of jumps. The exchanges are guaranteed to
have one of two effects on these numbers: either (a) is reduced, by SD
swap, or (b) is reduced and (a) unchanged, by SS swap which is never
linking. The extra jump that a swap introduces is never linked to any
other, so it doesn't affect the calculation. Thus the two-digit number
"ab" is reduced by each swap, eventually becoming zero == total
despaghettification.


Cheers, Aubrey
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.