Re: Show line numbers in diagnostics for a scripting language - how can this be done?

"Ira Baxter" <idbaxter@semdesigns.com>
Mon, 1 Nov 2010 12:37:20 -0500

          From comp.compilers


From: "Ira Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: Mon, 1 Nov 2010 12:37:20 -0500
Organization: Compilers Central
References: 10-10-038
Keywords: debug
Posted-Date: 02 Nov 2010 17:31:44 EDT

"Johannes Schaub (litb)" <schaub-johannes@web.de> wrote in message
> Hello all.
>
> I'm using LLVM, and I'm writing a backend for a closed-source compiler of
> a language.
[snip]
>
> My problem is now - in the AST, I know what nodes correspond to what source
> lines and even what source columns. But I want to display that information
> (or at least the line-number information) in the diagnostic too.
>
> What are the usual ways to solve this? I have thought about this, and one
> way could be to pass the line number along to the runtime functions like the
> following
>
> call void @rtEmitAdd(i32 3, ; appeared in line 3
> %myvalue* %num1,
> %myvalue* %num2,
> %myvalue* %result)
>
> I wonder now - how is this generally solved for such languages? Thanks in
> advance!
> [There's no generally satisfactory approach. One thing I've done is
> to embed the line numbers and routine names in no-op instructions
> after each call, so I can get the info via a stack traceback, but not
> affect runtime very much. Searching the debug symbols is not
> unreasonable; it's slow, but it doesn't matter since it only needs to be
> fast enough to display the results to a human user. -John]


A cheap trick like John's is to embed source line number information
directly *before* each function entry point, so it doesn't pollute
the execution stream. Examining a stack backtrace yields return
addresses; each return address leads to the function call instruction
just before it, and the call's target leads to the line number
information stored ahead of that function's entry point. This
can give you function-level backtraces for a diagnostic.


That's not always enough; sometimes you want the diagnostic to be
line-precise or better.


A more complicated trick is to imagine that each machine instruction
is associated with file/line/column information. When you generate
the final machine code, you now have (abstractly) a map from machine
instruction addresses to source information, and you can easily build
a lookup table from machine instruction address to source information.
A sorted list allows a fast binary search to find the source location.
Many adjacent machine instructions share the same source location, so
you can change the lookup table to use address ranges instead and
significantly reduce its size. This scheme can also be used to
implement exception handlers that cost nothing if exceptions don't
occur; simply associate an exception handler with each address range.
Tracking this information through the code generator can be tricky if
you don't literally associate a location with each instruction, but it
can be done in practice.
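A minimal sketch of such a range table in C follows. The struct layout, the
names (SrcRange, src_table, lookup_source), the addresses and the file name
demo.scr are all invented for illustration; a real code generator would emit
the table itself. Each entry records only where a range starts, and the range
ends where the next entry begins, with a sentinel entry marking the end of
the generated code.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per run of adjacent instructions sharing a source location. */
typedef struct {
    uintptr_t   start;   /* address of the first instruction in the range */
    const char *file;
    uint32_t    line;
    uint32_t    column;
} SrcRange;

/* Hypothetical table, sorted by start address. */
static const SrcRange src_table[] = {
    { 0x1000, "demo.scr", 1,  1 },
    { 0x1010, "demo.scr", 3,  5 },
    { 0x1024, "demo.scr", 3, 12 },
    { 0x1030, "demo.scr", 7,  1 },
    { 0x1040, NULL,       0,  0 },   /* sentinel: end of generated code */
};
#define SRC_TABLE_LEN (sizeof src_table / sizeof src_table[0])

/* Return the entry with the largest start <= pc, or NULL if pc lies
   outside the generated code. */
static const SrcRange *lookup_source(uintptr_t pc)
{
    size_t lo = 0, hi = SRC_TABLE_LEN;   /* answer is in [lo, hi) */

    if (pc < src_table[0].start || pc >= src_table[SRC_TABLE_LEN - 1].start)
        return NULL;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (src_table[mid].start <= pc)
            lo = mid;
        else
            hi = mid;
    }
    return &src_table[lo];
}
```

When the address comes from a backtrace it is a return address, which points
just past the call, so looking up the return address minus one is a common
way to land inside the range of the call itself. An exception handler address
could be carried as one more field in each entry.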


We've used this technique very successfully in our PARLANSE parallel
programming language. It produces nice backtraces (even for
diagnostics from parallel children). The exception handling scheme
produces pretty fast exception handling; we use an unrolled binary
search (it only has 24 iterations maximum!) with CMOVs rather than
conditional jumps.
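To illustrate the idea (this is not the PARLANSE code, just a sketch in C
with invented names find_range and demo_starts), here is a binary search
whose loop body has no data-dependent branch. The trip count depends only on
the table size, so for a bounded table the loop can be fully unrolled, and
compilers commonly turn the conditional expression into a CMOV on x86,
though that is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the last element <= key in a sorted array of
   n >= 1 range start addresses. Assumes starts[0] <= key. */
static size_t find_range(const uint32_t *starts, size_t n, uint32_t key)
{
    size_t base = 0;
    size_t len = n;            /* answer is always in [base, base + len) */

    while (len > 1) {
        size_t half = len / 2;
        /* Select, rather than jump: candidate for a conditional move. */
        base = (starts[base + half] <= key) ? base + half : base;
        len -= half;
    }
    return base;
}

/* Hypothetical sorted range starts for a quick check. */
static const uint32_t demo_starts[] = {
    0x1000, 0x1010, 0x1024, 0x1030, 0x1040
};
```

For example, find_range(demo_starts, 5, 0x1012) selects index 1, the range
starting at 0x1010. A table of up to 2^24 entries needs at most 24 passes
through the loop body.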


I believe the Microsoft 64-bit runtime uses a similar table scheme for
exception handling.
--
Ira Baxter, CTO
www.semanticdesigns.com


