Paper: On the Feasibility of Deduplicating Compiler Bugs with Bisection

John R Levine <johnl@taugh.com>
Tue, 01 Jul 2025 12:12:29 -0400

          From comp.compilers

From: John R Levine <johnl@taugh.com>
Newsgroups: comp.compilers
Date: Tue, 01 Jul 2025 12:12:29 -0400
Organization: Compilers Central
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="78903"; mail-complaints-to="abuse@iecc.com"
Keywords: errors, debug, paper
Posted-Date: 01 Jul 2025 12:16:42 EDT

Abstract


Random testing has proven to be an effective technique for compiler
validation. However, the debugging of bugs identified through random
testing presents a significant challenge due to the frequent occurrence of
duplicate test programs that expose identical compiler bugs. The process
to identify duplicates is a practical research problem known as bug
deduplication. Prior methodologies for compiler bug deduplication
primarily rely on program analysis to extract bug-related features for
duplicate identification, which can result in substantial computational
overhead and limited generalizability. This paper investigates the
feasibility of employing bisection, a standard debugging procedure largely
overlooked in prior research on compiler bug deduplication, for this
purpose. Our study demonstrates that the utilization of bisection to
locate failure-inducing commits provides a valuable criterion for
deduplication, albeit one that requires supplementary techniques for more
accurate identification. Building on these results, we introduce BugLens,
a novel deduplication method that primarily uses bisection, enhanced by
the identification of bug-triggering optimizations to minimize false
negatives. Empirical evaluations conducted on four real-world datasets
demonstrate that BugLens significantly outperforms the state-of-the-art
analysis-based methodologies Tamer and D3, saving an average of 26.98%
and 9.64% of human effort, respectively, to identify the same number of
distinct bugs. Given
the inherent simplicity and generalizability of bisection, it presents a
highly practical solution for compiler bug deduplication in real-world
applications.
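
To make the core idea concrete, here is a minimal sketch of bisection used
as a deduplication criterion: each failing test program is bisected to the
first commit at which it fails, and programs that share that commit are
grouped as likely duplicates. This is not BugLens itself. The function
names, the commit labels, and the simulated fails_at oracle are all
hypothetical, and the sketch assumes a linear history in which a failure,
once introduced, persists in every later commit.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def first_bad_commit(commits: Sequence[str], fails: Callable[[str], bool]) -> str:
    """Binary-search for the earliest commit at which `fails` returns True.

    Assumptions: commits are ordered oldest to newest, the oldest commit is
    good, the newest is bad, and the failure is monotonic.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


def group_by_culprit(
    tests: Sequence[str],
    commits: Sequence[str],
    fails_at: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """Group test programs by the failure-inducing commit found by bisection."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for test in tests:
        culprit = first_bad_commit(commits, lambda c, t=test: fails_at(t, c))
        groups[culprit].append(test)
    return dict(groups)


# Simulated history (hypothetical data): each test starts failing at the
# commit index listed in `introduced`.
commits = [f"c{i}" for i in range(8)]
introduced = {"t1": 3, "t2": 3, "t3": 6}


def fails_at(test: str, commit: str) -> bool:
    # Stand-in for "build the compiler at `commit` and run `test`".
    return commits.index(commit) >= introduced[test]


groups = group_by_culprit(["t1", "t2", "t3"], commits, fails_at)
print(groups)  # {'c3': ['t1', 't2'], 'c6': ['t3']}
```

In practice the oracle would build the compiler at each commit and run the
test program, as a tool like git bisect does. As the abstract notes, the
failure-inducing commit alone is not a fully accurate criterion, which is
why the paper supplements it with the identification of bug-triggering
optimizations.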


https://arxiv.org/abs/2506.23281


Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly

