Paper: PR2: Peephole Raw Pointer Rewriting with LLMs for Translating C to Safer Rust

John R Levine <johnl@taugh.com>
Fri, 09 May 2025 12:27:11 -0400

          From comp.compilers

From: John R Levine <johnl@taugh.com>
Newsgroups: comp.compilers
Date: Fri, 09 May 2025 12:27:11 -0400
Organization: Compilers Central
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="9767"; mail-complaints-to="abuse@iecc.com"
Keywords: C, Rust, optimize
Posted-Date: 09 May 2025 12:30:28 EDT

Automated tools translate C to Rust but produce lousy Rust code because of
C's loose pointer semantics. The authors use an LLM to improve it somewhat.


Abstract
There has been a growing interest in translating C code to Rust due to
Rust's robust memory and thread safety guarantees. Tools such as C2RUST
enable syntax-guided transpilation from C to semantically equivalent Rust
code. However, the resulting Rust programs often rely heavily on unsafe
constructs--particularly raw pointers--which undermines Rust's safety
guarantees. This paper aims to improve the memory safety of Rust programs
generated by C2RUST by eliminating raw pointers. Specifically, we propose
a peephole raw pointer rewriting technique that lifts raw pointers in
individual functions to appropriate Rust data structures. Technically, PR2
employs decision-tree-based prompting to guide the pointer lifting
process. Additionally, it leverages code change analysis to guide the
repair of errors introduced during rewriting, effectively addressing
errors encountered during compilation and test case execution. We
implement PR2 as a prototype and evaluate it using gpt-4o-mini on 28
real-world C projects. The results show that PR2 successfully eliminates
13.22% of local raw pointers across these projects, significantly
enhancing the safety of the translated Rust code. On average, PR2
completes the transformation of a project in 5.44 hours, at an average
cost of $1.46.
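
To make "lifting raw pointers" concrete, here is a minimal hand-written
sketch of the kind of rewrite the abstract describes. It is not output from
C2RUST or PR2; the function names sum_raw and sum_lifted are hypothetical,
and the choice of a slice as the target data structure is an assumption made
for illustration. The first function has roughly the shape a syntax-directed
translator might give to a C routine like int sum(const int *a, int n); the
second is the same routine with the local raw pointer replaced by a safe
slice.

```rust
// Hypothetical "before": a raw pointer plus a length, as in the C original.
// The caller must guarantee that `a` points to at least `n` readable i32s.
pub unsafe fn sum_raw(a: *const i32, n: i32) -> i32 {
    let mut total: i32 = 0;
    let mut i: i32 = 0;
    while i < n {
        // Unchecked pointer arithmetic and dereference: nothing stops an
        // out-of-bounds read if `n` is wrong.
        total += unsafe { *a.offset(i as isize) };
        i += 1;
    }
    total
}

// Hypothetical "after": the pointer/length pair is lifted to a slice, so the
// bounds travel with the data and no unsafe code is needed.
pub fn sum_lifted(a: &[i32]) -> i32 {
    let mut total: i32 = 0;
    for x in a {
        total += *x;
    }
    total
}
```

A caller holding an array data would invoke the first version as
unsafe { sum_raw(data.as_ptr(), data.len() as i32) } and the second as
sum_lifted(&data). Note that lifting changes the function's signature, so
call sites have to be updated as well, which is one place a rewrite can
introduce errors that then need repair.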


https://arxiv.org/abs/2505.04852


Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly

