From: "Ira Baxter" <email@example.com>
Date: Mon, 1 Nov 2010 12:37:20 -0500
Posted-Date: 02 Nov 2010 17:31:44 EDT
"Johannes Schaub (litb)" <firstname.lastname@example.org> wrote in message
> Hello all.
> I'm using LLVM, and I'm writing a backend for a closed-source compiler of
> a language.
> My problem is now - in the AST, I know what nodes correspond to what source
> lines and even what source columns. But I want to display that information
> (or at least the line-number information) in the diagnostic too.
> What are the usual ways to solve this? I have thought about this, and one
> way could be to pass the line number along to the runtime functions like the
> call void @rtEmitAdd(i32 3, ; appeared in line 3
> %myvalue* %num1,
> %myvalue* %num2,
> %myvalue* %result)
> I wonder now - how is this generally solved for such languages? Thanks in
> [There's no generally satisfactory approach. One thing I've done is
> to embed the line numbers and routine names in no-op instructions
> after each call, so I can get the info via a stack traceback, but not
> affect runtime very much. Searching the debug symbols is not
> it's slow, but it doesn't matter since it only needs to be fast enough to
> display the results to a human user. -John]
A cheap trick like John's is to embed source line number information
directly *before* each function entry point, so it doesn't pollute
the execution stream. Examining a stack backtrace leads to return
addresses, which lead to function call instructions just before the
return address, making the line number information accessible. This
can give you function-level backtraces for a diagnostic.
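To make the chain "return address -> call instruction -> function entry
-> line number" concrete, here is a toy simulation in Python. Everything
in it is an assumption made for illustration: the one-byte CALL opcode
with a 4-byte absolute target, the 4-byte little-endian line number
stored just before each entry point, and the helper names emit_function,
emit_call and callee_info. Real instruction sets and indirect calls need
more work than this.

```python
import struct

CALL = 0xE8      # hypothetical opcode: CALL followed by a 4-byte absolute target
CALL_SIZE = 5    # one opcode byte plus the 4-byte target


def emit_function(mem, line, body):
    """Append a 4-byte line number, then the body; return the entry address."""
    mem.extend(struct.pack("<I", line))   # data word sits just before the entry
    entry = len(mem)
    mem.extend(body)
    return entry


def emit_call(target):
    """Encode a direct call to `target` in the toy instruction format."""
    return bytes([CALL]) + struct.pack("<I", target)


def callee_info(mem, return_address):
    """Map a return address to (callee entry address, callee line number)."""
    call_at = return_address - CALL_SIZE
    if call_at < 0 or mem[call_at] != CALL:
        return None                        # not preceded by a direct call
    (target,) = struct.unpack_from("<I", mem, call_at + 1)
    (line,) = struct.unpack_from("<I", mem, target - 4)
    return target, line


# Toy "program": helper is defined on line 10, caller on line 20.
mem = bytearray()
helper = emit_function(mem, 10, b"\x90")
caller = emit_function(mem, 20, emit_call(helper) + b"\x90")
return_address = caller + CALL_SIZE       # what a backtrace would find on the stack
```

Here callee_info(mem, return_address) recovers the entry address of
helper and the line number 10 stored in front of it, without any extra
instruction in the execution path.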
That's not always enough; sometimes you want the diagnostic to be
line-precise or better.
A more complicated trick is to imagine that each machine instruction
is associated with file/line/column information. When you generate
the final machine code, you now have (abstractly) a map from machine
instruction addresses to source information, and you can easily build
a lookup table from machine instruction address to source information.
A sorted list allows a fast binary lookup to find the source location.
Many adjacent machine instructions share the same source location, and
so you can change the lookup table to use address ranges instead and
significantly reduce the size of the lookup table. This
scheme can also be used to implement exception handlers
with zero overhead if exceptions don't occur; simply associate an
exception handler with each address range. Tracking this information
through the code generator can be tricky if you don't literally
associate a location with each instruction, but it can be done in
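A minimal sketch of the range table in Python, assuming the code
generator can hand over one record per machine instruction. The record
layout (address, size, source location, handler name) and the function
names build_range_table and lookup are hypothetical, chosen only for
this illustration; this is not the PARLANSE implementation.

```python
from bisect import bisect_right


def build_range_table(instructions):
    """Merge per-instruction records into half-open address ranges.

    Each record is (address, size, location, handler); adjacent
    instructions with the same location and handler share one range.
    """
    table = []
    for address, size, location, handler in sorted(instructions):
        if table:
            start, end, last_location, last_handler = table[-1]
            if end == address and (last_location, last_handler) == (location, handler):
                table[-1] = (start, address + size, location, handler)
                continue
        table.append((address, address + size, location, handler))
    return table


def lookup(table, starts, pc):
    """Binary-search the sorted range starts; return (location, handler) or None."""
    i = bisect_right(starts, pc) - 1
    if i >= 0 and table[i][0] <= pc < table[i][1]:
        return table[i][2], table[i][3]
    return None


# Four instructions collapse into two ranges.
instructions = [
    (0x1000, 4, ("demo.src", 3, 5), None),
    (0x1004, 2, ("demo.src", 3, 5), None),
    (0x1006, 3, ("demo.src", 4, 1), "on_error"),
    (0x1009, 5, ("demo.src", 4, 1), "on_error"),
]
table = build_range_table(instructions)
starts = [entry[0] for entry in table]    # precomputed once for bisect
```

With this table, lookup(table, starts, 0x1005) yields the line 3
location with no handler, and any address from 0x1006 up to (but not
including) 0x100E yields the line 4 location together with "on_error".
Nothing is executed on the normal path; the table is consulted only
when a diagnostic or an exception needs it.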
We've used this technique very successfully in our PARLANSE parallel
programming language. It produces nice backtraces (even for
diagnostics from parallel children). The exception handling scheme
produces pretty fast exception handling; we use an unrolled binary
search (it only has 24 iterations maximum!) with CMOVs rather than
I believe the Microsoft 64 bit runtime uses a similar table scheme for
Ira Baxter, CTO