Re: Decompilers or comparison tools

"Gary H. Merrill" <ghm48805@glaxowellcome.com>
24 Sep 1997 22:30:54 -0400

          From comp.compilers

Related articles
Decompilers or comparison tools fd11@dial.pipex.com (Narinder Singh) (1997-09-23)
Re: Decompilers or comparison tools ghm48805@glaxowellcome.com (Gary H. Merrill) (1997-09-24)
Re: Decompilers or comparison tools ast@halcyon.com (1997-09-27)
Re: Decompilers or comparison tools root@kern.marine.su (Alexander S.Klenin) (1997-09-28)
Re: Decompilers or comparison tools cristina@it.uq.edu.au (Cristina Cifuentes) (1998-03-18)
From: "Gary H. Merrill" <ghm48805@glaxowellcome.com>
Newsgroups: comp.compilers
Date: 24 Sep 1997 22:30:54 -0400
Organization: Glaxo Wellcome Inc.
References: 97-09-074
Keywords: tools, disassemble

Narinder Singh wrote:
> I am looking for either a decompiler for MS-DOS EXE's to C or BASIC, or a
> tool that will allow me to compare two object code files reporting the
> similarities or the differences. Does such a tool exist? if so where can I
> find it?
> ---
> Narinder Singh
> [This question shows up every few months. I've never seen one, other than
> an abandoned beta of a program from the Austin Code Works that produced C
> that was just transliterated assembler. If you want to compare objects,
> I'd disassemble them and try diff-ing the result. -John]


I know of at least one binary file comparator, but it is a proprietary
internal product of a previous employer. A true binary comparator
that reports minimal differences (e.g., shortest edit path) is *much*
more challenging than a comparator for text files (at least three of
which I have written). I won't begin to go into the details, but the
general problem is *very* thorny. Nonetheless, it would really be
nice to have such a tool, and you could make quite a name for yourself
by creating one. While from time to time I've wanted one myself, I've
never had the energy or sufficient motivation to pursue this lofty
goal.
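

To make "shortest edit path" concrete, here is a minimal Python sketch
that computes a smallest insert/delete script between two byte strings
using the textbook longest-common-subsequence table. It is not the
proprietary comparator mentioned above, and the function name
shortest_edit_script is purely illustrative. The table is quadratic in
time and memory, which is one reason the approach does not scale to
real binaries.

```python
def shortest_edit_script(a: bytes, b: bytes):
    """Return a minimal list of ('=', byte), ('-', byte), ('+', byte)
    operations that turns a into b (insertions and deletions only)."""
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the top-left corner, always following a path
    # that preserves the longest common subsequence.
    script = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            script.append(("=", a[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            script.append(("-", a[i]))
            i += 1
        else:
            script.append(("+", b[j]))
            j += 1
    script.extend(("-", x) for x in a[i:])  # leftover bytes of a are deleted
    script.extend(("+", x) for x in b[j:])  # leftover bytes of b are inserted
    return script
```

The number of '-' and '+' entries equals len(a) + len(b) minus twice
the length of the longest common subsequence, which is the minimum for
an insert/delete-only script.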


However, for anyone interested, I can at least provide a pointer to
how you may get on this road to madness. It turns out (surprise) that
the problem reduces to that of comparing arbitrary strings. And this
problem is of rather intense interest in molecular biology (all those
hideous protein sequences, you know). So there is quite a bit of
literature on it, and a wealth of algorithms (each of which has its
own strengths and weaknesses). As a starting point, then, take a look
at Dan Gusfield's "Algorithms on Strings, Trees, and Sequences:
computer science and computational biology", Cambridge University
Press, 1997. Note that a lot of the problems facing computational
biologists in this arena can be ignored if the goal is simply (??!!) a
minimal-differencing binary file comparator, since in such an
application you don't need to worry about approximate matching and can
be perfectly happy with exact matching techniques. That makes it much
simpler (uh huh).
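

As a rough illustration of the exact-matching idea, the sketch below
leans on Python's standard difflib module, which works on any sequence
of hashable items, bytes included. This is a swapped-in technique, not
one of the suffix-tree methods from Gusfield's book, and difflib does
not promise a minimal edit sequence. The helper names changed_regions
and apply_regions are hypothetical.

```python
import difflib

def changed_regions(old: bytes, new: bytes):
    """List (tag, old_start, old_end, new_start, new_end) for every
    region where the two byte strings do not match exactly."""
    # autojunk=False disables difflib's "popular element" heuristic,
    # which would otherwise ignore frequently occurring byte values.
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

def apply_regions(old: bytes, new: bytes, regions):
    """Rebuild `new` from `old` plus the changed regions (a sanity check)."""
    out = bytearray()
    pos = 0
    for _tag, i1, i2, j1, j2 in regions:
        out += old[pos:i1]   # unchanged bytes before this region
        out += new[j1:j2]    # replacement or inserted bytes (empty for a delete)
        pos = i2             # skip the bytes that were replaced or deleted
    out += old[pos:]
    return bytes(out)
```

Checking that apply_regions(old, new, changed_regions(old, new))
reproduces new is a cheap way to confirm that the reported regions
really do account for every difference.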


Of course, this leaves aside the issue that when you compare objects
at the bit or byte level, the results may be a bit [no pun intended]
difficult to interpret without some sort of context. If what you
*really* want to do is to compare in some intelligible way the program
content or structure of files you *know* to be EXE files, you might
instead want to use what you know about the structure of such files to
write an intelligent comparison tool that would yield perhaps more
useful results.
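

One possible first step toward such a structure-aware tool, sketched
in Python: parse the 28-byte MZ header found at the start of an MS-DOS
EXE, report which header fields differ, and compare the load images
separately. The field names and the helpers parse_mz_header,
load_image and compare_exe are illustrative choices, and relocation
entries and overlays are ignored.

```python
import struct

# The 28-byte DOS MZ header: a 2-byte signature followed by thirteen
# little-endian 16-bit words. Field names here are descriptive labels.
MZ_HEADER = struct.Struct("<2s13H")
MZ_FIELDS = (
    "signature", "bytes_in_last_page", "pages_in_file", "relocation_count",
    "header_paragraphs", "min_extra_paragraphs", "max_extra_paragraphs",
    "initial_ss", "initial_sp", "checksum", "initial_ip", "initial_cs",
    "relocation_table_offset", "overlay_number",
)

def parse_mz_header(data: bytes) -> dict:
    if len(data) < MZ_HEADER.size:
        raise ValueError("file too short for an MZ header")
    header = dict(zip(MZ_FIELDS, MZ_HEADER.unpack_from(data)))
    if header["signature"] not in (b"MZ", b"ZM"):
        raise ValueError("not an MS-DOS EXE (missing MZ signature)")
    return header

def load_image(data: bytes, header: dict) -> bytes:
    """Return the bytes DOS would load: everything after the header,
    up to the file length recorded in the header."""
    start = header["header_paragraphs"] * 16      # paragraphs are 16 bytes
    end = header["pages_in_file"] * 512           # pages are 512 bytes
    if header["bytes_in_last_page"]:
        end -= 512 - header["bytes_in_last_page"]  # last page is partial
    return data[start:end]

def compare_exe(a: bytes, b: bytes) -> dict:
    """Compare two EXE files field by field, then image against image."""
    ha, hb = parse_mz_header(a), parse_mz_header(b)
    header_diffs = {
        name: (ha[name], hb[name]) for name in MZ_FIELDS if ha[name] != hb[name]
    }
    return {
        "header_diffs": header_diffs,
        "image_differs": load_image(a, ha) != load_image(b, hb),
    }
```

Separating the header from the load image means that a change in, say,
the entry point shows up as a named field rather than as an
unexplained byte difference at some offset.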
--


Gary H. Merrill Principal Consultant
(919) 483-0973 Glaxo Wellcome Inc.
--

