Paper: Large Language Model-Powered Agent for C to Rust Code Translation

John R Levine <johnl@taugh.com>
Fri, 23 May 2025 13:12:38 -0400

          From comp.compilers

From: John R Levine <johnl@taugh.com>
Newsgroups: comp.compilers
Date: Fri, 23 May 2025 13:12:38 -0400
Organization: Compilers Central
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="32668"; mail-complaints-to="abuse@iecc.com"
Keywords: C, Rust, translator
Posted-Date: 23 May 2025 13:13:03 EDT

Another paper claims their LLM with feedback works pretty well.


Abstract


The C programming language has been foundational in building system-level
software. However, its manual memory management model frequently leads to
memory safety issues. In response, a modern system programming language,
Rust, has emerged as a memory-safe alternative. Moreover, automated
C-to-Rust translation, powered by rapid advances in the generative
capabilities of LLMs, is attracting growing interest as a way to migrate
large volumes of legacy C code. Despite some success, existing LLM-based
approaches have constrained the role of LLMs to static prompt-response
behavior and have not explored their agentic problem-solving capability.
Applying the LLM agentic capability for the C-to-Rust translation
introduces distinct challenges, as this task differs from the traditional
LLM agent applications, such as math or commonsense QA domains. First, the
scarcity of parallel C-to-Rust datasets hinders the retrieval of suitable
code translation exemplars for in-context learning. Second, unlike math or
commonsense QA, the intermediate steps required for C-to-Rust are not
well-defined. Third, it remains unclear how to organize and cascade these
intermediate steps to construct a correct translation trajectory. To
address these challenges in the C-to-Rust translation, we propose a novel
intermediate step, the Virtual Fuzzing-based equivalence Test (VFT), and
an agentic planning framework, the LLM-powered Agent for C-to-Rust code
translation (LAC2R). The VFT guides LLMs to identify input arguments that
induce divergent behaviors between an original C function and its Rust
counterpart and to generate informative diagnoses to refine the unsafe
Rust code. LAC2R uses Monte Carlo tree search (MCTS) to systematically
organize the LLM-induced intermediate steps for correct translation. We
experimentally demonstrated
that LAC2R effectively conducts C-to-Rust translation on large-scale,
real-world benchmarks.
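
To make the equivalence-test idea concrete, here is a minimal sketch of a
differential check between a C function and a Rust translation. It is not
the paper's VFT: as the abstract describes it, the LLM itself identifies
the diverging inputs, whereas this sketch just runs both versions over a
fixed list of boundary values. Every name here (c_avg_model, rust_avg,
candidate_inputs, find_divergence, diagnose) is hypothetical, and the C
side is modeled in Rust on the assumption that signed overflow wraps,
which the C standard leaves undefined.

```rust
/// Stand-in for the original C function:
///     int avg(int a, int b) { return (a + b) / 2; }
/// ASSUMPTION: signed overflow wraps (two's complement). C leaves it undefined.
fn c_avg_model(a: i32, b: i32) -> i32 {
    a.wrapping_add(b) / 2
}

/// Hypothetical translated Rust version. It widens to i64 so the sum
/// cannot overflow, which changes the result near the i32 limits.
fn rust_avg(a: i32, b: i32) -> i32 {
    ((a as i64 + b as i64) / 2) as i32
}

/// Boundary-heavy candidate inputs, the kind a fuzzer or an LLM might propose.
fn candidate_inputs() -> Vec<(i32, i32)> {
    vec![(0, 0), (1, 2), (-7, 3), (i32::MAX, 1), (i32::MIN, -1)]
}

/// Return the first input pair on which the two versions disagree, if any.
fn find_divergence(inputs: &[(i32, i32)]) -> Option<(i32, i32)> {
    inputs
        .iter()
        .copied()
        .find(|&(a, b)| c_avg_model(a, b) != rust_avg(a, b))
}

/// Produce a short diagnosis that could be fed back to the translator.
fn diagnose(a: i32, b: i32) -> String {
    format!(
        "avg({}, {}): C model = {}, Rust = {}",
        a,
        b,
        c_avg_model(a, b),
        rust_avg(a, b)
    )
}
```

Calling find_divergence(&candidate_inputs()) and passing the resulting
pair to diagnose gives a one-line report of the mismatch. In a feedback
loop, that report would go back to the model as a hint for the next
revision of the Rust code.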


https://arxiv.org/abs/2505.15858


Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly

