Re: Making C compiler generate obfuscated code

torbenm@diku.dk (Torben Ægidius Mogensen)
Mon, 20 Dec 2010 12:53:51 +0100

          From comp.compilers


From: torbenm@diku.dk (Torben Ægidius Mogensen)
Newsgroups: comp.compilers
Date: Mon, 20 Dec 2010 12:53:51 +0100
Organization: SunSITE.dk - Supporting Open source
References: 10-12-017 10-12-019 10-12-023 10-12-030
Keywords: C, code
Posted-Date: 21 Dec 2010 09:35:14 EST

Hans-Peter Diettrich <DrDiettrich1@aol.com> writes:




> In practice such interruptions of the control flow make automatic
> disassembling almost impossible. Instead a good *interactive*
> disassembler is required (as I was writing when I came across above
> tricks), and time consuming manual intervention and analysis is
> required with almost every break in the control flow. The mix of data
> and instructions not only makes it impossible to generate an assembler
> listing, but also hides the use of memory locations (variables or
> constants), with pointers embedded in the inlined parameter
> blocks. Now tell me how a decompiler or other analysis tool should
> deal with such constructs, when already the automatic separation of
> code and data is impossible.


Using jump tables and the like will indeed make deobfuscation hard,
especially if the tables change dynamically.


You might be able to get around this by symbolic execution: You start
with a state description which allows arbitrary values of variables.
You then symbolically execute and update the state description as you go
along. At any test, you split the state description into two new state
descriptions and continue symbolic execution on two individual paths.
Whenever symbolic execution reaches a previously-visited program point
with a state description equivalent to what was seen before, you make a
loop. To keep the set of state descriptions finite, you apply
generalisation when a state description gets too complicated.
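
A minimal sketch of the procedure described above, in Python, on a toy
instruction set. The instruction format, the names `step`, `generalise`
and `explore`, the `max_states` threshold and the sample program are all
assumptions made for illustration. A state description here is just a
map from each variable to a known constant or to `TOP` (any value),
which is much cruder than what a real tool would track.

```python
TOP = None  # stands for "any value" in a state description


def freeze(env):
    """Turn a variable->value dict into a hashable state description."""
    return frozenset(env.items())


def step(program, label, state):
    """Symbolically execute one instruction; return (label, state) successors."""
    ins = program[label]
    env = dict(state)
    if ins[0] == "set":                      # var := const
        _, var, const, nxt = ins
        env[var] = const
        return [(nxt, freeze(env))]
    if ins[0] == "add":                      # var := var + const
        _, var, const, nxt = ins
        if env[var] is not TOP:
            env[var] += const
        return [(nxt, freeze(env))]
    if ins[0] == "branch":                   # if var < const goto yes else no
        _, var, const, yes, no = ins
        if env[var] is TOP:                  # unknown value: split into two paths
            return [(yes, state), (no, state)]
        return [(yes if env[var] < const else no, state)]
    return []                                # "halt" has no successors


def generalise(state, others):
    """Replace every variable that disagrees with an earlier state by TOP."""
    env = dict(state)
    for other in others:
        for var, val in other:
            if env[var] != val:
                env[var] = TOP
    return freeze(env)


def explore(program, entry, variables, max_states=3):
    """Return (states seen per label, control-flow edges actually taken)."""
    seen, edges = {}, set()
    work = [(entry, freeze({v: TOP for v in variables}))]
    while work:
        label, state = work.pop()
        states = seen.setdefault(label, set())
        if state in states:
            continue                         # same point, same state: a loop
        if len(states) >= max_states:        # too many variants: generalise
            state = generalise(state, states)
            if state in states:
                continue
        states.add(state)
        for nxt, nstate in step(program, label, state):
            edges.add((label, nxt))
            work.append((nxt, nstate))
    return seen, edges


# Hypothetical obfuscated program: the test at L1 can only go one way.
prog = {
    "L0": ("set", "k", 7, "L1"),
    "L1": ("branch", "k", 10, "L2", "DEAD"),
    "L2": ("branch", "n", 0, "END", "L3"),
    "L3": ("add", "n", -1, "L2"),
    "DEAD": ("set", "n", 99, "L2"),
    "END": ("halt",),
}
seen, edges = explore(prog, "L0", ["k", "n"])
reachable = sorted(seen)
```

Here the test on `k` at `L1` is resolved during symbolic execution, so
the block `DEAD` never shows up in `reachable`. The test on the unknown
`n` at `L2` splits into two paths, and the return from `L3` to `L2` with
an already-seen state closes the loop.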


This process is similar to online partial evaluation, which also uses
state descriptions and generalisation.


That said, deobfuscation can never be perfect: equivalence of programs
is undecidable, so it is in theory possible to make a program so
obfuscated that no automatic process can recover the original.


But what if you know the obfuscation method? Assuming that the
obfuscation method runs in polynomial time, deobfuscation is at worst in
NP (guess a candidate original and check that it obfuscates to the given
program), so it is decidable. But it can be so intractable that it
doesn't matter.
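
One way to see why a known method makes the problem decidable is brute
force: enumerate candidate originals and run the obfuscator on each. The
sketch below does this in Python. The three-symbol source language, the
`obfuscate` function and the length bound `max_len` are hypothetical
stand-ins, not a real obfuscation scheme, and the search only terminates
because the size of the original is assumed to be bounded.

```python
from itertools import product

ALPHABET = "ab+"  # hypothetical tiny source language


def obfuscate(src):
    """Stand-in for a known polynomial-time obfuscator (illustration only)."""
    table = {"a": "01", "b": "10", "+": "11"}
    # Encode each symbol with a fixed code word, then reverse the result.
    return "".join(table[c] for c in src)[::-1]


def deobfuscate(target, max_len):
    """Try every source string up to max_len symbols; return (source, tries)."""
    tried = 0
    for n in range(max_len + 1):
        for cand in product(ALPHABET, repeat=n):
            tried += 1
            src = "".join(cand)
            if obfuscate(src) == target:   # cheap check of one guess
                return src, tried
    return None, tried


recovered, tried = deobfuscate(obfuscate("a+b"), max_len=3)
```

Each individual check is cheap, but the number of candidates grows as
3^n in the length n of the original, which is where the intractability
comes from.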


Torben

