Re: Making C compiler generate obfuscated code

glen herrmannsfeldt <gah@ugcs.caltech.edu>
Tue, 21 Dec 2010 19:20:28 +0000 (UTC)

          From comp.compilers

From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Newsgroups: comp.compilers
Date: Tue, 21 Dec 2010 19:20:28 +0000 (UTC)
Organization: A noiseless patient Spider
References: 10-12-017 10-12-019 10-12-023 10-12-030 10-12-033
Keywords: C, code
Posted-Date: 21 Dec 2010 22:36:37 EST

Torben Ægidius Mogensen <torbenm@diku.dk> wrote:
(snip)


> Using jump tables and the like is, indeed, going to make
> unobfuscation hard. Especially if the tables change dynamically.


Yes for automated systems, but maybe not with a human in the loop.


If, for example, the target code is an interpreter then there is
likely a jump table for processing different statements. Finding that
table, then, tells you more than finding a bunch of conditional
branches.
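
As a minimal sketch of the dispatch structure described above (the token values, handler names, and program format are all hypothetical and not taken from any real interpreter), a table-driven statement loop might look like this:

```python
# Hypothetical statement tokens; not taken from any real BASIC ROM.
TOK_PRINT, TOK_LET, TOK_END = 0, 1, 2


def do_print(state, arg):
    # Print a variable's value if the name is known, else the literal.
    state["out"].append(state["vars"].get(arg, arg))


def do_let(state, arg):
    name, value = arg
    state["vars"][name] = value


def do_end(state, arg):
    state["running"] = False


# The "jump table": one handler per token, indexed by token value.
DISPATCH = [do_print, do_let, do_end]


def run(program):
    """Execute a list of (token, argument) pairs via the dispatch table."""
    state = {"vars": {}, "out": [], "running": True}
    for token, arg in program:
        if not state["running"]:
            break
        # Bounds test on the table index, the kind of check that also
        # reveals the table's length to someone reading the code.
        if not 0 <= token < len(DISPATCH):
            raise ValueError(f"unknown statement token {token}")
        DISPATCH[token](state, arg)
    return state["out"]
```

Once a reader has located `DISPATCH`, every statement handler is enumerated in one place, which is the sense in which the table tells you more than a chain of conditional branches would.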


> You might be able to get around this by symbolic execution: You start
> with a state description which allows arbitrary values of variables.


Well, you do have to find the end of the table, which often isn't hard
when done by hand. (You can see where the entries stop looking like
addresses. Also, there might be a conditional test on the table
offset.)
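
The "stop looking like addresses" heuristic can be sketched roughly as follows. This assumes 16-bit big-endian entries (as on the MC6809) and a known address range for the code; the function name, parameters, and the sample bytes are made up for illustration:

```python
import struct


def scan_jump_table(image, table_offset, code_lo, code_hi):
    """Collect consecutive 16-bit big-endian words starting at
    table_offset, stopping at the first one that does not fall
    inside the assumed code range [code_lo, code_hi)."""
    entries = []
    pos = table_offset
    while pos + 2 <= len(image):
        (word,) = struct.unpack_from(">H", image, pos)
        if not (code_lo <= word < code_hi):
            break  # no longer looks like a code address
        entries.append(word)
        pos += 2
    return entries


# Made-up bytes: three plausible addresses followed by other data.
sample = bytes.fromhex("A010 A100 B234 127E 0000")
table = scan_jump_table(sample, 0, 0xA000, 0xC000)
```

A range check like this can stop too late if the data following the table happens to fall in the code range, which is one reason a human glance, or the bounds test in the dispatching code, is still useful.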


(snip)
> This process is similar to online partial evaluation, which also uses
> state descriptions and generalisation.


> That said, unobfuscation can never be perfect: Equivalence of programs
> is undecidable, so it is in theory possible to make a program so
> obfuscated that no automatic process can recover the original.


Again, it helps to have a human in the loop, especially one who
knows the art of designing similar programs.


One of my first larger disassemblies (for personal use) was the
BASIC interpreter for the TRS-80 Color Computer, MC6809 code.
I worked on it for a while without finding the main loop
that reads statement codes and branches. It turned out that the
loop is loaded into RAM at startup and executes there. When
disassembling ROM, one doesn't always expect execution in RAM.


Anyway, I believe that mostly automated but with a human in the
loop is the best way to do disassembly and deobfuscation.


-- glen

