Fri, 19 May 2023 03:44:04 -0000 (UTC)

Related articles |
---|

[12 earlier articles] |

Re: binary search debugging of compilers 864-117-4973@kylheku.com (Kaz Kylheku) (2023-05-17) |

Re: binary search debugging of compilers 864-117-4973@kylheku.com (Kaz Kylheku) (2023-05-17) |

Re: binary search debugging of compilers gah4@u.washington.edu (gah4) (2023-05-17) |

Re: binary search debugging of compilers spibou@gmail.com (Spiros Bousbouras) (2023-05-18) |

Re: binary search debugging of compilers rsc@swtch.com (Russ Cox) (2023-05-18) |

Re: binary search debugging of compilers 864-117-4973@kylheku.com (Kaz Kylheku) (2023-05-19) |

Re: binary search debugging of compilers 864-117-4973@kylheku.com (Kaz Kylheku) (2023-05-19) |

binary search debugging of compilers cclick0@gmail.com (Cliff Click) (2023-05-19) |

binary search debugging of compilers tekk.nolagi@gmail.com (Max B) (2023-05-19) |

Re: binary search debugging of compilers tkoenig@netcologne.de (Thomas Koenig) (2023-05-19) |

Re: binary search debugging of compilers mrs@kithrup.com (2023-05-20) |

Re: binary search debugging of compilers gah4@u.washington.edu (gah4) (2023-05-20) |

From: | Kaz Kylheku <864-117-4973@kylheku.com> |

Newsgroups: | comp.compilers |

Date: | Fri, 19 May 2023 03:44:04 -0000 (UTC) |

Organization: | A noiseless patient Spider |

References: | <CADSkJJVN7RGqhgVDZaz_K5be6uEDaMofMu5OF3RR4Y5-fDu00Q@mail.gmail.com> 23-05-013 23-05-018 |

Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="18481"; mail-complaints-to="abuse@iecc.com" |

Keywords: | debug, tools |

Posted-Date: | 19 May 2023 15:51:14 EDT |

On 2023-05-18, Spiros Bousbouras <spibou@gmail.com> wrote:

*> On Wed, 17 May 2023 10:55:27 -0400*

*> Russ Cox <rsc@swtch.com> wrote:*

*>> When BitKeeper came along, we added this:*

*>>*

*>> --longest Restrict the deltas to those on the longest line*

*>> between the two range endpoints. Unlike a range, the*

*>> lower bound is included in the output.*

*>>*

*>> because BitKeeper history is a lattice, not a straight line. So your*

*>> test points became*

*>>*

*>> bk changes -nd:REV: > REVS*

*>>*

*>> and binary search over those. That gets you the longest "straight" line*

*>> in the graph.*

*>*

*> Any partial ordering can be extended to a total ordering. Why can't you*

*> just do that instead of using 'the longest "straight" line' ?*

I think, not without introducing tie breaker rules that don't

necessarily make any sense.

But, maybe that doens't matter as something else. Say we have:

D-------E

/ \

A---B---C I----J

\ /

F---G---H

A is before B and so on. Are F-G-H before D-E or after?

Suppose the bug was introduced by E. We would like to know that.

The longest path between the A-J bisection points goes through F-G-H,

ignoring D-E. So it may wrongly look like I is the culprit, if E is

never tested. I has the bug, H doesn't.

Say we pick a precedence and flatten everything, in these

two ways:

A---B---C---D---E---F---G---H---I----J

or

A---B---C---F---G---H---D---E---I----J

problem is, under the first order, the bug is bouncing into

and out of existence:

A---B---C---D---E---F---G---H---I----J

X X X

E has the bug, and so do I and J. But F-G-H do not. So the bisect

algorithm has a problem; it is predicated on the idea that the bug

exists in the upper endpoint, is absent in the lower and has appeared

once in between.

Now, if we actually cherry pick all those commits into that order

(solving whatever conflicts are required), then we can do a meaningful

bisect. Then the rebased F G H commits do have the bug:

A---B---C---D---E---F'--G'--H'--I'---J

X X X X X X

It's unthinkable to be bisecting some lineage whipped up on-the-fly

that doesn't actually exist in the repo.

The best thing is to just keep the actual history that way.

Or else, do that longest-path things when bisecting. Then

if a merge point is implicated as the bad commit, then go

into the branch. When I is implicated, we note that it

has two parents, E and H. Only the H parentage was tested.

C was good. Thus we bisect C to E.

A smarter bisect algorithm could do this.

To make this sort of topical for comp.compilers, we note that the

revision history is a graph which resembles the control flow

graph of a program (one with no backwards branches).

The graph can be divided into our favorite chunks: basic blocks.

Basic blocks are sequence of commits in which no commit has multiple

parents (other than the first one), or is the parent of multiple children

(other than the last one).

The bisect algorithm should divide the graph into basic blocks,

and recurse into then when necessary.

So in:

D-------E

/ \

A---B---C I----J

\ /

F---G---H

we have these basic blocks:

/

A---B---C

\

D---E

\

/

F---G---H

I---J

First, the bisect could process the longest-path basic blocks: those

headed by A, F and I. The goal is to determine which basic block

introduces the bug.

When a buggy basic block is identified, then bisect actually

does commit-granularity bisection in that block.

If we discover that the bad commit is the head of a block,

and that block has multiple parents, we then recurse over

those parents somehow.

We identify paths (through the basic block graph) we have not taken

and similarly analyze them.

--

TXR Programming Language: http://nongnu.org/txr

Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Mastodon: @Kazinator@mstdn.ca

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.