Re: How do debuggers work?

bliss@sp64.csrd.uiuc.edu (Brian Bliss)
Thu, 5 Dec 91 20:44:52 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: bliss@sp64.csrd.uiuc.edu (Brian Bliss)
Keywords: debug, syntax
Organization: UIUC Center for Supercomputing Research and Development
References: 91-12-003 91-12-011
Date: Thu, 5 Dec 91 20:44:52 GMT

First, if you are not familiar with how symbol table strings work, read
/usr/include/stab.h and the man page for it.


The following is a grammar for the stab strings generated by a sun-3 cc
compiler. Grammars for other compilers vary in some details. I've had to
modify it somewhat to get it to work on Alliant machines. You can input
it directly into yacc. Note that you can't use a lexical analyzer to
tokenize the input stream. If you have a stab string such as "tag:T13",
for instance, the lexical analyzer cannot know whether "T13" is an
identifier by itself, or whether "T" and "13" should be broken up into
separate tokens.
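
To make the "tag:T13" problem concrete, here is a small sketch in Python
(not part of the yacc grammar; the function names are made up for
illustration, and reading the [-] in INTEGER as an optional sign is an
assumption). A context-free longest-match tokenizer swallows "T13" as one
NAME, while a scanner that knows where it is in the grammar reads a single
class letter followed by an INTEGER.

```python
import re

NAME = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")
INTEGER = re.compile(r"-?[0-9]+")   # assumption: the sign is optional


def greedy_tokens(s):
    """Context-free tokenizer: longest NAME or INTEGER, else one character."""
    out, i = [], 0
    while i < len(s):
        m = NAME.match(s, i) or INTEGER.match(s, i)
        if m:
            out.append(m.group())
            i = m.end()
        else:
            out.append(s[i])
            i += 1
    return out


def split_named_type(s):
    """Grammar-driven scan of  NAME ':' ('t' | 'T') INTEGER  only."""
    m = NAME.match(s)
    name = m.group() if m else ""          # the grammar allows an empty name
    i = len(name)
    if s[i:i + 1] != ":":
        raise ValueError("expected ':' after the name")
    cls = s[i + 1:i + 2]
    if cls not in ("t", "T"):
        raise ValueError("expected class letter 't' or 'T'")
    m = INTEGER.fullmatch(s, i + 2)        # the rest must be the type number
    if m is None:
        raise ValueError("expected a type number")
    return (name, cls, int(m.group()))


print(greedy_tokens("tag:T13"))      # "T13" comes back as a single token
print(split_named_type("tag:T13"))   # class letter and number are separated
```

The second function only works because it is told what to expect next,
which is exactly the information a stand-alone lexer does not have.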


-----------------------------------------------




NAME: [a-zA-Z_][a-zA-Z_0-9]*
INTEGER: [-][0-9][0-9]*
REAL: [+-][0-9]*(\.[0-9][0-9]*|)([eE]([+-]|)[0-9][0-9]*|)
STRING: ``.*''
BSTRING: .*


String:
        NAME `:' Class
        `:' Class


Class:
        `c' `=' Constant `;'
        Variable
        Procedure
        Parameter
        NamedType


Constant:
        `i' INTEGER
        `r' REAL
        `c' OrdValue
        `b' OrdValue
        `s' STRING
        `e' TypeId `,' OrdValue
        `S' TypeId `,' NumElements `,' NumBits `,' BSTRING


OrdValue:
        INTEGER


NumElements:
        INTEGER


NumBits:
        INTEGER


Variable:
        TypeId -- local variable of type TypeId
        `r' TypeId -- register variable of type TypeId
        `S' TypeId -- module variable of type TypeId (static global in C)
        `V' TypeId -- own variable of type TypeId (static local in C)
        `G' TypeId -- global variable of type TypeId


Procedure:
        Proc -- top level procedure
        Proc `,' NAME `,' NAME -- local to first NAME,
                                -- second NAME is corresponding ld symbol


Proc:
        `P' -- global procedure
        `Q' -- local procedure (static in C)
        `I' -- internal procedure (different calling sequence)
        `F' TypeId -- function returning type TypeId
        `f' TypeId -- local function
        `J' TypeId -- internal function


Parameter:
        `p' TypeId -- value parameter of type TypeId
        `v' TypeId -- reference parameter of type TypeId


NamedType:
        `t' TypeId -- type name for type TypeId
        `T' TypeId -- C structure tag name for struct TypeId


TypeId:
        INTEGER -- Unique (per compilation) number of type
        INTEGER `=' TypeDef -- Definition of type number


TypeDef:
        INTEGER
        Subrange
        Array
        Record
        `e' EnumList `;' -- enumeration
        `*' TypeId -- pointer to TypeId
        `S' TypeId -- set of TypeId
        `d' TypeId -- file of TypeId
        ProcedureType
        `i' NAME `:' NAME `;' -- imported type ModuleName:Name
        `o' NAME `;' -- opaque type
        `i' NAME `:' NAME `,' TypeId `;'
        `o' NAME `,' TypeId `;'


Subrange:
        `r' TypeId `;' INTEGER `;' INTEGER


Array:
        `a' TypeId `;' TypeId -- array [TypeId] of TypeId
        `A' TypeId -- open array of TypeId
        `D' INTEGER `,' TypeId -- N-dim. dynamic array
        `E' INTEGER `,' TypeId -- N-dim. subarray


ProcedureType:
        `f' TypeId `;' -- C function type
        `f' TypeId `,' NumParams `;' TParamList `;'
        `p' NumParams `;' TParamList `;'


NumParams:
        INTEGER


Record:
        `s' ByteSize FieldList `;' -- structure/record
        `u' ByteSize FieldList `;' -- C union


ByteSize:
        INTEGER


FieldList :
        Field
        FieldList Field


Field:
        NAME `:' TypeId `,' BitOffset `,' BitSize `;'


BitSize:
        INTEGER


BitOffset:
        INTEGER


EnumList:
        Enum
        EnumList Enum


Enum:
        NAME `:' OrdValue `,'


ParamList:
        Param
        ParamList Param


Param:
        NAME `:' TypeId `,' PassBy `;'


PassBy:
        INTEGER


TParam:
        TypeId `,' PassBy `;'


TParamList :
        TParam
        TParamList TParam


Export:
        INTEGER ExportInfo




----------------------------------------------
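
As an illustration of how the productions above fit together, here is a
hand-written recursive-descent sketch in Python for a small subset of the
grammar (NamedType, TypeId, and the pointer, Subrange and Record cases of
TypeDef). It is not the yacc parser; the class and function names and the
sample stab strings are invented for the example, and the sign on INTEGER is
treated as optional.

```python
class StabParser:
    """Character-level parser for a subset of the stab grammar."""

    def __init__(self, text):
        self.s, self.i = text, 0

    def peek(self):
        return self.s[self.i:self.i + 1]      # "" at end of input

    def expect(self, ch):
        if self.peek() != ch:
            raise ValueError(f"expected {ch!r} at offset {self.i}")
        self.i += 1

    def integer(self):
        start = self.i
        if self.peek() == "-":
            self.i += 1
        digits = self.i
        while self.peek().isdigit():
            self.i += 1
        if self.i == digits:
            raise ValueError(f"expected a digit at offset {self.i}")
        return int(self.s[start:self.i])

    def name(self):
        # Simplified NAME: does not reject a leading digit.
        start = self.i
        while self.peek().isalnum() or self.peek() == "_":
            self.i += 1
        return self.s[start:self.i]

    def type_id(self):
        # TypeId: INTEGER | INTEGER '=' TypeDef
        num = self.integer()
        if self.peek() == "=":
            self.i += 1
            return (num, self.type_def())
        return (num, None)

    def type_def(self):
        c = self.peek()
        if c == "*":                           # '*' TypeId
            self.i += 1
            return ("pointer", self.type_id())
        if c == "r":                           # 'r' TypeId ';' INTEGER ';' INTEGER
            self.i += 1
            base = self.type_id()
            self.expect(";")
            low = self.integer()
            self.expect(";")
            return ("subrange", base, low, self.integer())
        if c in ("s", "u"):                    # ('s' | 'u') ByteSize FieldList ';'
            self.i += 1
            size = self.integer()
            fields = []
            while self.peek() != ";":
                fields.append(self.field())
            self.expect(";")
            return ("struct" if c == "s" else "union", size, fields)
        return ("same_as", self.integer())     # TypeDef: INTEGER

    def field(self):
        # Field: NAME ':' TypeId ',' BitOffset ',' BitSize ';'
        name = self.name()
        self.expect(":")
        tid = self.type_id()
        self.expect(",")
        offset = self.integer()
        self.expect(",")
        size = self.integer()
        self.expect(";")
        return (name, tid, offset, size)


def parse_named_type(text):
    """Parse  NAME ':' ('t' | 'T') TypeId  and return a nested tuple."""
    p = StabParser(text)
    name = p.name()
    p.expect(":")
    cls = p.peek()
    if cls not in ("t", "T"):
        raise ValueError("only NamedType is handled in this sketch")
    p.i += 1
    return (name, cls, p.type_id())


print(parse_named_type("point:T13=s8x:1,0,32;y:1,32,32;;"))
print(parse_named_type("p:t14=*13"))
```

Because each routine reads characters itself, the "T" / "13" split falls
out of the grammar position rather than from a separate tokenizer.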




I wish they had changed the production Variable->TypeId to
Variable->`l' TypeId (or some other unused letter in place of `l'),
to make it easier to change the grammar without introducing conflicts.


On sparcs, you need to add the productions


TypeId->(INTEGER,INTEGER)


and


TypeDef->(INTEGER,INTEGER)


Instead of every type having a single integer identifier (unique to the
compilation unit), each type has a pair of them. The first is the number
of the file within the compilation unit (the source file or one of its
include files), the second is an integer identifier unique to that
file. So if stdio.h is include file number 5, for example, and
struct _iobuf is type number 13 in that file, you would have a stab string
"_iobuf:T(5,13)= ..."
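
A minimal sketch in Python of reading a type number in either form, the
plain INTEGER of the sun-3 grammar or the (INTEGER,INTEGER) pair used on
sparcs. The function name is made up, and returning None for the missing
file number is just a convention chosen for this example.

```python
import re

PAIR = re.compile(r"\(([0-9]+),([0-9]+)\)")
PLAIN = re.compile(r"-?[0-9]+")


def parse_type_number(s, pos=0):
    """Return ((file_no, type_no), next_pos) for the type number at s[pos:]."""
    m = PAIR.match(s, pos)
    if m:
        return (int(m.group(1)), int(m.group(2))), m.end()
    m = PLAIN.match(s, pos)
    if m:
        return (None, int(m.group())), m.end()   # no file number in this form
    raise ValueError(f"no type number at offset {pos}")


stab = "_iobuf:T(5,13)="
start = stab.index(":") + 2          # skip the ':' and the class letter 'T'
print(parse_type_number(stab, start))
print(parse_type_number("tag:T13", 5))
```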


Now if, in a later compilation unit, stdio.h is included, and the type
information is identical to what it is in a previous compilation unit,
then instead of redeclaring _iobuf, an N_EXCL stab appears, and the type
information for the previous (or first; I never figured this one out)
appearance of stdio.h is used. If stdio.h is the 7th included file in
this later compilation unit, struct _iobuf now has the type
numbers (7,13).
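
One way a stab reader might handle this, sketched in Python under several
assumptions that are not confirmed above: the first appearance of a header
is the one reused, headers are matched by name alone, each N_BINCL or N_EXCL
takes the next file number in its compilation unit, and each N_SO starts a
new unit. The record layout (kind, name, types) is invented for the example
and is not the real stab entry format.

```python
def resolve_types(records):
    """Map each unit's file numbers to type tables, sharing excluded headers.

    records: iterable of (kind, name, types) where kind is "SO", "BINCL" or
    "EXCL", and types is a {type_no: description} dict for "BINCL" entries.
    Returns {unit_name: {file_no: type_table}}.
    """
    first_seen = {}   # header name -> type table from its first N_BINCL
    units = {}
    unit = None
    file_no = 0
    for kind, name, types in records:
        if kind == "SO":                     # new compilation unit
            unit = units.setdefault(name, {})
            file_no = 0
        elif kind == "BINCL":                # header with its own type stabs
            file_no += 1
            unit[file_no] = types
            first_seen.setdefault(name, types)
        elif kind == "EXCL":                 # header whose stabs were dropped
            file_no += 1
            unit[file_no] = first_seen[name]
    return units


records = [
    ("SO", "a.c", None),
    ("BINCL", "stdio.h", {13: "struct _iobuf"}),
    ("SO", "b.c", None),
    ("BINCL", "local.h", {}),
    ("EXCL", "stdio.h", None),
]
units = resolve_types(records)
# (1,13) in a.c and (2,13) in b.c name the same definition.
print(units["a.c"][1][13], units["b.c"][2][13])
```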


This optimization by the linker drastically reduces the size of the symbol
table, and the time to parse it all (if you decide not to take the dbx
approach of parsing only the information needed to execute a specific
command). However, if you have struct/union tags which are
incompletely defined in one include file, and then completely defined in
another, reversing the order of #including the two files will cause the
type information to appear different, and the linker will not make the
optimization.


The sparc f77 compiler does not do any of this.


If you want the version of this grammar modified for Alliant machines (the
same grammar will work on sparcs, and almost on gcc -g output):


% ftp sp2.csrd.uiuc.edu
% get pub/ae.tar.Z


Uncompress, untar, and look at the file src/ae_stab_parse.y


  - brian
--

