Re: Help on disassembler/decompilers

raulmill@usc.edu
Sun, 16 Sep 90 20:28:25 EDT

          From comp.compilers

Newsgroups: comp.compilers
From: raulmill@usc.edu
In-Reply-To: adamsf@turing.cs.rpi.edu's message of 10 Sep 90 22:20:33 GMT
Keywords: disassemble
Organization: Compilers Central
References: <HOW.90Sep5173755@sundrops.ucdavis.edu> <12976@june.cs.washington.edu> <_5A%GS%@rpi.edu>
Date: Sun, 16 Sep 90 20:28:25 EDT

In article <_5A%GS%@rpi.edu> adamsf@turing.cs.rpi.edu (Frank Adams)
writes:
In article <12976@june.cs.washington.edu> pardo@cs.washington.edu
(David Keppel) writes:
>My guess is that decompiling into a language that is, e.g.,
>saccharine-sweetened assembler (C) is `easy', while decompiling, e.g.,
>into APL is hard.


If we assume that the program is to be decompiled into the language in
which it was written, then in general the less the compiler optimizes
the generated code, the easier it is to decompile.


A second problem is type inference. APL, with a fixed set of data types,
is easier in this respect than C. For example, when the code loads a pointer
into a register and indexes off of it, what kind of struct is the pointer
pointing to?
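
To make the struct question concrete, here is a minimal sketch of the first
step a decompiler might take: collect every offset and access width seen off
a given base register. The tuple format and the name infer_layouts are
hypothetical, chosen only for illustration. Note that this recovers field
offsets and sizes, not which struct the programmer declared, which is the
hard part of the question above.

```python
from collections import defaultdict

def infer_layouts(accesses):
    """Guess a field layout per base register.

    accesses: iterable of (base, offset, width_in_bytes) tuples, one per
    memory reference found in the code (a made-up input format).
    Returns {base: [(offset, width), ...]} sorted by offset.
    """
    fields = defaultdict(dict)
    for base, offset, width in accesses:
        # Keep the widest access seen at each offset.
        fields[base][offset] = max(width, fields[base].get(offset, 0))
    return {base: sorted(f.items()) for base, f in fields.items()}

layouts = infer_layouts([("r1", 0, 4), ("r1", 8, 2), ("r1", 0, 4), ("r2", 4, 1)])
```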


[Frank then goes on to state his opinion that C is pretty good for
exact transliteration of machine language.]


If I may point out...


[1] The first commercial use of APL was to describe the IBM 360 architecture.
APL can concisely describe just about any machine architecture.


[2] As far as I know, the language analysis/verification tools available for
APL are pretty good. [Some would say better than those available for any other
language, but without first-hand knowledge I'm not so sure. I do know that 7
or 8 years ago, 3 bugs were found in that 360 description by one of these
verifiers.]


If you want an exact HLL transliteration of raw machine code, or a
translation into an assembler-like language, there is no reason why APL
should be harder than any other language (though I'd recommend using J
instead, because there is an odd sort of problem getting APL to talk in
ASCII, and J is better IMHO :)


To turn back to the original poster's question, the best disassemblers
I have seen do a lot of inference based on system calls whose
arguments are known, on various compiler conventions and, if you are
lucky, on linking/debugging information left in the code by the
developers.
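
As a rough illustration of the system-call idea, the sketch below pushes
known argument types back onto the operands at each call site. The two
signatures follow the POSIX prototypes for write and open; the call-site
format, the operand names, and the function label_operands are hypothetical.

```python
# Argument types of a few calls whose prototypes are known.
SIGNATURES = {
    "write": ["int", "const void *", "size_t"],
    "open": ["const char *", "int"],
}

def label_operands(call_sites):
    """Map each operand to the C type implied by a known call.

    call_sites: iterable of (call_name, [operand, ...]) pairs, a made-up
    format standing in for whatever the disassembler has recovered.
    """
    labels = {}
    for name, operands in call_sites:
        # Unknown calls contribute nothing; the first label seen wins.
        for operand, ctype in zip(operands, SIGNATURES.get(name, [])):
            labels.setdefault(operand, ctype)
    return labels

labels = label_operands([
    ("open", ["0x4000", "r3"]),
    ("write", ["r5", "0x4010", "r6"]),
    ("mystery", ["r7"]),
])
```

Here the address 0x4000 ends up labelled as a string and 0x4010 as a
buffer, while the operand of the unknown call stays unlabelled.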


As far as I've seen, the worst problem in converting from machine language to
other representations is figuring out what to call a specific piece of
memory. (code? text? struct? etc.) A lot of this information can be
inferred by reasoning along the lines of 'well, if this instruction is illegal,
we know everything back to the last branch isn't instructions.'
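
The illegal-instruction rule can be sketched as follows. This is a toy under
loud assumptions: every instruction is one byte long, the opcode values are
made up, and classify is a hypothetical name. A real instruction set needs a
proper decoder with variable lengths and branch targets.

```python
# Toy one-byte instruction set; every opcode value here is invented.
BRANCH = 0x02                  # hypothetical unconditional branch
LEGAL = {0x01, BRANCH, 0x03}   # any other byte is an illegal instruction

def classify(image):
    """Label each byte of image as 'code' or 'data'."""
    labels = ["code"] * len(image)   # tentatively treat everything as code
    run_start = 0                    # first byte after the last branch
    for addr, byte in enumerate(image):
        if byte not in LEGAL:
            # Execution cannot fall through into an illegal instruction,
            # so everything back to the last branch is not instructions.
            for i in range(run_start, addr + 1):
                labels[i] = "data"
            run_start = addr + 1
        elif byte == BRANCH:
            run_start = addr + 1
    return labels

labels = classify(bytes([0x01, 0x02, 0x01, 0x01, 0xFF, 0x01, 0x03]))
```

The two bytes after the branch decode as legal instructions, but they are
still relabelled as data because they run straight into the illegal byte.
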

[It seems to me that [1] is a red herring; the IBM POO describes the 370
in English, but disassembling into English is difficult. On the other hand,
decompiling into scalar APL expressions shouldn't be hard. -John]

