Re: Making a partial C compiler

blackmarlin@asean-mail.com (C)
24 May 2003 20:14:30 -0400

          From comp.compilers


From: blackmarlin@asean-mail.com (C)
Newsgroups: comp.compilers
Date: 24 May 2003 20:14:30 -0400
Organization: http://groups.google.com/
References: 03-05-139
Keywords: tools, C
Posted-Date: 24 May 2003 20:14:30 EDT

"John Eskie" <cyberheg@l115.langkaer.dk> wrote
> I want to make a C/C++ source code obfuscator in C or C++. I don't
> need a full parser since I think about limited flow obfuscation. So
> what I need is to identify all "sub parts" of code like statements,
> if/while/for etc. pieces of code.


Sounds like you want to write a lexer - all you really need is


#1: to read in tokens
#2: to identify where each variable is defined
        if it comes from an #included file, keep the name unchanged;
        otherwise rename it to something short and useless (like a, b, c ...)
#3: rewrite the source to a new file, removing comments
        and unnecessary white space characters.


This should give you a simple obfuscator. A more complex one would
recognise and shorten or rearrange statements, but that latter
version would be a complex piece of kit.
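
As a rough illustration of step #3 in straight C, here is a minimal sketch that drops comments and squeezes white space. The function name strip_source and its buffer-based interface are made up for this example. It keeps newlines (collapsed to one) because preprocessor directives end at the end of a line, and it copies string and character literals untouched so that a "//" inside a string is not mistaken for a comment.

```c
#include <stddef.h>

/* Hypothetical helper: copy src to dst, dropping comments and squeezing
   white space. dst must have room for strlen(src) + 1 bytes. */
void strip_source(const char *src, char *dst)
{
    char *out = dst;
    int pending = 0;   /* white space owed to the output: 0, ' ' or '\n' */

    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {          /* block comment */
            src += 2;
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/'))
                src++;
            if (*src != '\0')
                src += 2;
            if (pending == 0)
                pending = ' ';     /* a comment counts as one space */
        } else if (src[0] == '/' && src[1] == '/') {   /* line comment */
            while (*src != '\0' && *src != '\n')
                src++;
        } else if (*src == '\n') {
            pending = '\n';        /* a newline outranks a plain space */
            src++;
        } else if (*src == ' ' || *src == '\t' || *src == '\r') {
            if (pending == 0)
                pending = ' ';
            src++;
        } else {
            if (pending != 0 && out != dst)
                *out++ = (char)pending;
            pending = 0;
            if (*src == '"' || *src == '\'') {         /* copy literal verbatim */
                char quote = *src;
                *out++ = *src++;
                while (*src != '\0' && *src != quote) {
                    if (*src == '\\' && src[1] != '\0')
                        *out++ = *src++;               /* keep escaped char */
                    *out++ = *src++;
                }
                if (*src != '\0')
                    *out++ = *src++;                   /* closing quote */
            } else {
                *out++ = *src++;
            }
        }
    }
    if (pending == '\n' && out != dst)
        *out++ = '\n';
    *out = '\0';
}
```

This sketch still leaves a space between tokens that do not strictly need one (for example around "="); removing those safely needs real token boundaries rather than character-level scanning.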


For #2 in flex you could code ...
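[ \t\r]+ { printf( " " ); }
\n { printf( "\n" ); /* newlines end preprocessor lines, so keep them */ }
"//"[^\n]* {}
... /* more */
"#define" { printf( "#define " ); expectingNew = TRUE; }
"unsigned" { printf( "unsigned " ); expectingNew = TRUE; }
"int" { printf( "int " ); expectingNew = TRUE; }
... /* more */
[a-zA-Z_][a-zA-Z0-9_]* { if( expectingNew ) {
            /* always pass a format string, never the symbol itself */
            printf( "%s", createObfuscateSymbol( yytext ) );
            expectingNew = FALSE; /* later names are uses, not declarations */
            } else printf( "%s", getObfuscatedSymbol( yytext ) ); }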


Of course this would be slow, but optimisation is possible.
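
The flex rules above call createObfuscateSymbol and getObfuscatedSymbol without defining them. One way they might look is sketched below. The signatures, the fixed-size table, the linear search and the exact a, b, c ... naming scheme are all assumptions made for this example, not something specified in the post. Names that were never registered (such as those declared in an included header) are returned unchanged.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SYMBOLS 1024          /* arbitrary cap for this sketch */

struct symbol {
    char *original;               /* identifier as it appears in the source */
    char  renamed[8];             /* generated short replacement */
};

static struct symbol table[MAX_SYMBOLS];
static size_t symbol_count = 0;   /* entries in use */
static long   next_index = 0;     /* next candidate short name */

/* Build the short name for index: 0 -> "a", 25 -> "z", 26 -> "aa", ... */
static void make_short_name(long index, char *buf)
{
    char tmp[8];
    size_t len = 0, i;

    do {
        tmp[len++] = (char)('a' + index % 26);
        index = index / 26 - 1;
    } while (index >= 0 && len < sizeof tmp - 1);

    for (i = 0; i < len; i++)     /* letters were produced in reverse */
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
}

/* Generated names that would collide with a C keyword. */
static int is_keyword(const char *s)
{
    return strcmp(s, "do") == 0 || strcmp(s, "if") == 0 ||
           strcmp(s, "for") == 0 || strcmp(s, "int") == 0;
}

/* Return the replacement for a registered name, or the name itself if it
   was never registered (e.g. it comes from an included header). */
const char *getObfuscatedSymbol(const char *name)
{
    size_t i;

    for (i = 0; i < symbol_count; i++)
        if (strcmp(table[i].original, name) == 0)
            return table[i].renamed;
    return name;
}

/* Register a newly declared name and return its replacement. */
const char *createObfuscateSymbol(const char *name)
{
    const char *known = getObfuscatedSymbol(name);
    struct symbol *s;

    if (known != name)            /* already registered: reuse it */
        return known;
    if (symbol_count == MAX_SYMBOLS)
        return name;              /* table full: leave the name alone */

    s = &table[symbol_count];
    s->original = malloc(strlen(name) + 1);
    if (s->original == NULL)
        return name;
    strcpy(s->original, name);

    do {
        make_short_name(next_index++, s->renamed);
    } while (is_keyword(s->renamed));

    symbol_count++;
    return s->renamed;
}
```

The linear search is the obvious place to optimise; a hash table keyed on the original name would be the usual replacement. The sketch also does not check whether a generated name clashes with an identifier that is being kept unchanged.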


> Therefore I need some parsing code that could ease my job. I'm sure
> there are several people who wrote parsers and configuration scripts
> for C and C++ parsing because they are rather common languages but in
> my search I didn't know what to look for.


As mentioned above, only commercial quality obfuscation
would require a full compiler; a simple tokeniser would
be enough to obfuscate a C programme.
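
To give an idea of how small such a tokeniser can be, here is a minimal hand-written sketch in C. The names token_kind, struct token and next_token are invented for this example. It only separates identifiers, numbers and single punctuation characters; keywords come back as identifiers, and comments and string literals are not handled, so a real tool would need more cases.

```c
#include <ctype.h>
#include <stddef.h>

enum token_kind { TOK_END, TOK_IDENT, TOK_NUMBER, TOK_OTHER };

struct token {
    enum token_kind kind;
    const char *start;   /* first character of the token inside the input */
    size_t length;       /* number of characters in the token */
};

/* Scan one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor)
{
    const char *p = *cursor;
    struct token t;

    while (isspace((unsigned char)*p))   /* skip leading white space */
        p++;

    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;              /* identifier or keyword */
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;             /* loose match: 42, 0x1F, 1.5 */
        while (isalnum((unsigned char)*p) || *p == '.')
            p++;
    } else {
        t.kind = TOK_OTHER;              /* any other single character */
        p++;
    }
    t.length = (size_t)(p - t.start);
    *cursor = p;
    return t;
}
```

A caller would loop on next_token until it returns TOK_END, passing each TOK_IDENT through the renaming table and copying everything else to the output as is.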


> Maybe some of you can recommend me in what direction I should go or
> point me to articles on the subject. I don't plan to spend several
> months on this subject but the alternative is to write my own parsing
> code for my needs which probably won't be better then what exists
> already.


You should be able to knock together an obfuscator as described
above in only a couple of hours (using flex), or a few days
using straight C (which could be more efficient).


The only problem with the solution outlined above is that the
source will not be checked: a tokeniser does no syntax or semantic
checking. This should not be a problem when your aim is only to
obfuscate known-good code as a final stage before you release it.


> Thanks in advance.
> --John


C 2002/5/20

