How can a disassembler tell code from data?

dwex@mtgzfs3.att.com (David E Wexelblat)
Thu, 9 May 91 12:46:10 EDT

          From comp.compilers

Newsgroups: comp.compilers
From: dwex@mtgzfs3.att.com (David E Wexelblat)
Keywords: disassemble, design, question
Organization: AT&T Bell Laboratories
Date: Thu, 9 May 91 12:46:10 EDT

I am working on fixing a rather broken disassembler for the 680x0 series
(which is irrelevant to my general problem, but may help find a specific
answer). My problem is trying to disassemble code compiled with GCC,
which puts constant character strings into the text segment. The program
correctly figures out that this stuff is not executable code by tracing
all of the paths through the code. But it cannot tell the difference
between word and byte data.
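
For readers unfamiliar with the path-tracing step, here is a minimal sketch of the idea. It is not the disassembler in question: the four-opcode instruction set, the one-byte absolute branch targets, and the names TOY_ISA, trace_code and data_ranges are all invented for illustration and have nothing to do with real 680x0 encodings. Whatever the trace never reaches is treated as data, which is exactly where the word-versus-byte question starts.

```python
# Toy instruction set, invented for illustration (NOT 680x0 encodings):
#   opcode -> (length in bytes, has a branch-target operand, falls through)
TOY_ISA = {
    0x01: (1, False, True),   # NOP
    0x02: (2, True, False),   # JMP addr (unconditional)
    0x03: (2, True, True),    # BCC addr (conditional)
    0x04: (1, False, False),  # RTS
}


def trace_code(image, entry=0):
    """Return the set of byte offsets reachable as code from `entry`."""
    code = set()
    worklist = [entry]
    while worklist:
        pc = worklist.pop()
        # Follow one straight-line path until it ends or rejoins known code.
        while 0 <= pc < len(image) and pc not in code:
            op = image[pc]
            if op not in TOY_ISA:
                break  # not a valid opcode: abandon this path
            length, has_target, falls_through = TOY_ISA[op]
            if pc + length > len(image):
                break  # instruction would run off the end of the image
            code.update(range(pc, pc + length))
            if has_target:
                worklist.append(image[pc + 1])  # operand is the target offset
            if not falls_through:
                break
            pc += length
    return code


def data_ranges(image, code):
    """Group offsets the trace never reached into half-open (start, end) runs."""
    runs, start = [], None
    for off in range(len(image) + 1):
        is_data = off < len(image) and off not in code
        if is_data and start is None:
            start = off
        elif not is_data and start is not None:
            runs.append((start, off))
            start = None
    return runs


# NOP; JMP 9; the string "hello\0" sitting in the text segment; RTS
image = bytes([0x01, 0x02, 0x09]) + b"hello\x00" + bytes([0x04])
code = trace_code(image)
print(sorted(code))
print(data_ranges(image, code))
```

The trace says nothing about what the unreached bytes contain, only that no execution path lands on them.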


I think this is a general problem with disassembling any non-split-I/D
program. I was wondering if there are any techniques for determining that
a given piece of data should be interpreted as a character string as
opposed to word data. I would like a general-case answer, but the
following constraints can be applied, if necessary:


1) 680x0 processor
2) C compiler
   - AT&T UNIX-PC v3.51 (which doesn't generally do this)
   - gcc
3) COFF format object files
   - stripped
   - with symbols
   - with relocation
   - with debugging


I had thought about using a 'strings'-type algorithm, but this is prone to
generating garbage, so I'm looking for something better.
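
To make the garbage problem concrete, here is a rough sketch of a 'strings'-type scan applied to a region already known to be data. The function name find_strings, the minimum length of 4, and the requirement of a terminating NUL are assumptions chosen for illustration, not the behavior of any particular 'strings' implementation.

```python
import string

# Byte values treated as "text": printable ASCII, minus vertical tab and form feed.
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\x0b\x0c")


def find_strings(data, min_len=4):
    """Return (offset, bytes) for NUL-terminated printable runs of >= min_len bytes."""
    results = []
    start = None
    for off, b in enumerate(data):
        if b in PRINTABLE:
            if start is None:
                start = off  # a new candidate run begins here
        else:
            # A run only counts if a NUL ends it and it is long enough.
            if start is not None and b == 0 and off - start >= min_len:
                results.append((start, data[start:off]))
            start = None
    return results


# A genuine C string, one pad byte, then a table of 16-bit words
# (0x4142, 0x4344, 0x0000 stored big-endian) that happens to look like text.
region = b"hello\n\x00" + b"\x00" + bytes([0x41, 0x42, 0x43, 0x44, 0x00, 0x00])
for offset, text in find_strings(region):
    print(offset, text)
```

The scan reports both the real string at offset 0 and the word table at offset 8 as "ABCD", since nothing in the bytes themselves distinguishes the two cases.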
--
David Wexelblat | dwex@mtgzz.att.com
AT&T Bell Laboratories | ...!att!mtgzz!dwex
200 Laurel Ave - 4B-421 |
Middletown, NJ 07748 | (201) 957-5871
[In the absence of extensive symbol table info, this sounds like a tough
problem. -John]
--

