|Writing a C Compiler: lvalues firstname.lastname@example.org (André Wagner) (2010-05-08)|
|Re: Writing a C Compiler: lvalues email@example.com (Ben Bacarisse) (2010-05-09)|
|Re: Writing a C Compiler: lvalues firstname.lastname@example.org (bart.c) (2010-05-09)|
|Re: Writing a C Compiler: lvalues email@example.com (Tom St Denis) (2010-05-09)|
|Re: Writing a C Compiler: lvalues firstname.lastname@example.org (Keith Thompson) (2010-05-09)|
|Re: Writing a C Compiler: lvalues email@example.com (Eric Sosman) (2010-05-09)|
|Re: Writing a C Compiler: lvalues firstname.lastname@example.org (Stargazer) (2010-05-10)|
|Re: Writing a C Compiler: lvalues email@example.com (Marc van Lieshout) (2010-05-16)|
|Re: Writing a C Compiler: lvalues firstname.lastname@example.org (Eric Sosman) (2010-05-17)|
|Re: Writing a C Compiler: lvalues email@example.com (Keith Thompson) (2010-05-17)|
|Re: Writing a C Compiler: lvalues firstname.lastname@example.org (Keith Thompson) (2010-05-19)|
|Re: Writing a C Compiler: lvalues email@example.com (bart.c) (2010-05-19)|
|Re: Writing a C Compiler: lvalues firstname.lastname@example.org (2010-05-19)|
|[4 later articles]|
|Date:||Mon, 10 May 2010 00:44:28 -0700 (PDT)|
|Posted-Date:||11 May 2010 03:15:57 EDT|
On May 8, 4:34 pm, André Wagner <andre....@gmail.com> wrote:
> I'm writing a C compiler. It's almost over, except that it is not
> handling lvalues correctly.
It's not "almost over" then :-)
> Let me show an example. The code "x = 5" (let's say 'x' was declared
> before) yields this in pseudo-assembly:
> mov $b, $fp+8 ; $fp+8 is x's address, so I'm storing x's address in $b
> mov $a, 5
> mov [$b], $a ; here I'm putting what's in $a in the address
> pointed to by $b
> Since 'x' is an lvalue in this case, I don't need its value, just the
> address of the variable.
> Now, if I want to access 'x' in the middle of a non-lvalue expression,
> I would do:
> mov $a, $fp+8
> mov $a, [$a]
It looks like real x86 assembly, and it looks like you're jumping into
assembly generation too early.
> Notice how I get the variable address and, from it, the value.
> What I'm trying to say is: the compiler yields different assembly code
> for when 'x' is an lvalue and when 'x' is not an lvalue.
> This gets more confusing when I have expressions such as 'x++'. This
> is simple, since 'x' is obviously an lvalue in this case. In the case
> of the compiler, I can parse 'x' and see that the lookahead points to
> '++', so it's an lvalue.
No, you can't assume that the programmer always writes correct code. A
programmer may make a mistake, as in Eric's example, or may write junk,
and the compiler must be able to determine that an assignment to a
non-lvalue takes place.
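One way to make that determination, sketched here under stated assumptions, is to derive lvalue-ness from the expression tree instead of from the lookahead token. The node kinds and the names `is_lvalue` and `check_postinc` are hypothetical, and the sketch ignores modifiability (const objects, arrays) and identifiers that do not name objects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression-tree node kinds; names are illustrative only. */
enum expr_kind {
    EXPR_CONST, EXPR_IDENT, EXPR_PAREN, EXPR_DEREF, EXPR_ADD, EXPR_POSTINC
};

struct expr {
    enum expr_kind kind;
    const struct expr *left;   /* operand, or NULL for leaves */
    const struct expr *right;  /* second operand of EXPR_ADD, else NULL */
};

/* Decide lvalue-ness from the tree itself, not from the next token. */
bool is_lvalue(const struct expr *e)
{
    switch (e->kind) {
    case EXPR_IDENT: return true;               /* x (assumed to name an object) */
    case EXPR_DEREF: return true;               /* *p */
    case EXPR_PAREN: return is_lvalue(e->left); /* (x) is an lvalue iff x is */
    default:         return false;              /* 5, a + b, x++ */
    }
}

/* Check the operand of ++: false means a diagnostic is required. */
bool check_postinc(const struct expr *operand)
{
    return is_lvalue(operand);
}
```

With this shape, '(x)++' passes the check because the parenthesized node forwards the question to its operand, while '5++' and '(a + b)++' are rejected.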
> But what about '(x)++'? In this case, the compiler evaluates the
> subexpression '(x)', and this expression yields the value of 'x', not
> the address. Now I have a '++' ahead, so how can I know the address of
> 'x' since all that I have is a value?
When I attempted to write a C compiler (I wrote the parser by hand), I
defined a "simpler C" pseudo-code, a subset of C that allowed only
assignments of the form "__temp_NN = &var;", "__temp_NN = *__temp_MM;",
"*__temp_NN = __temp_MM;", "__temp_NN = ~__temp_MM;" (instead of "~"
there could be "!" or "-") and "__temp_NN = var1 + var2;" (instead of
"+" there could be any arithmetic or logical binary operator). Also
allowed were conditional branches of the form "if (__temp_NN != 0) goto
xxx;" and unconditional branches ("goto xxx;"). The "__temp_NN" were
temporary variables of a type suitable for machine registers; if the
compiler ran out of registers, they were added as additional local
variables.
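The statement forms listed above could be encoded roughly as follows. This is only a sketch of one possible representation, not the encoding I actually used: the names `ir_op`, `ir_insn` and `ir_format` and the field layout are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical encoding of the "simpler C" statements. */
enum ir_op {
    IR_ADDR,    /* __temp_d = &var;               */
    IR_LOAD,    /* __temp_d = *__temp_s;          */
    IR_STORE,   /* *__temp_d = __temp_s;          */
    IR_UNARY,   /* __temp_d = <op>__temp_s;       */
    IR_BINARY,  /* __temp_d = var1 <op> var2;     */
    IR_IFNZ,    /* if (__temp_s != 0) goto label; */
    IR_GOTO     /* goto label;                    */
};

struct ir_insn {
    enum ir_op op;
    int dst;          /* number of the destination temporary, if any */
    int src;          /* number of the source temporary, if any */
    const char *sym;  /* operator spelling for IR_UNARY / IR_BINARY */
    const char *a;    /* variable name or label */
    const char *b;    /* second variable name for IR_BINARY */
};

/* Print one instruction as "simpler C" text; returns snprintf's result. */
int ir_format(const struct ir_insn *i, char *buf, size_t n)
{
    switch (i->op) {
    case IR_ADDR:
        return snprintf(buf, n, "__temp_%d = &%s;", i->dst, i->a);
    case IR_LOAD:
        return snprintf(buf, n, "__temp_%d = *__temp_%d;", i->dst, i->src);
    case IR_STORE:
        return snprintf(buf, n, "*__temp_%d = __temp_%d;", i->dst, i->src);
    case IR_UNARY:
        return snprintf(buf, n, "__temp_%d = %s__temp_%d;", i->dst, i->sym, i->src);
    case IR_BINARY:
        return snprintf(buf, n, "__temp_%d = %s %s %s;", i->dst, i->a, i->sym, i->b);
    case IR_IFNZ:
        return snprintf(buf, n, "if (__temp_%d != 0) goto %s;", i->src, i->a);
    case IR_GOTO:
        return snprintf(buf, n, "goto %s;", i->a);
    }
    return -1;
}
```

Because every instruction prints as valid (if ugly) C, the output of the parser can be read, diffed, and even compiled by another compiler while debugging.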
Then "x" and "address of x" would be evaluated separately, something
like "__temp_1 = x;", then at the next sequence point: "__temp_2 = &x;
*__temp_2 = __temp_1;". If "x" is not an lvalue, the compiler will fail
while generating "__temp_2 = &x;" and show a diagnostic.
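As a rough illustration of evaluating the value and the address separately, the sketch below lowers a post-increment into temp-based statements and refuses when the address cannot be taken. The `operand` struct, the function `lower_postinc`, the fixed temporary numbers, and the use of the constant 1 in a binary statement are all assumptions made for this example.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, heavily simplified operand: a named variable or a constant. */
struct operand {
    bool is_lvalue;    /* true for a variable such as x, false for 5 */
    const char *text;  /* source spelling, e.g. "x" or "5" */
};

/*
 * Lower "operand++" into temp-based pseudo-code.
 * Returns the number of characters written, or -1 (with a diagnostic
 * in buf) if the operand is not an lvalue.
 */
int lower_postinc(const struct operand *o, char *buf, size_t n)
{
    if (!o->is_lvalue) {
        /* The address request is the step that fails for a non-lvalue. */
        snprintf(buf, n, "error: operand of ++ is not an lvalue");
        return -1;
    }
    return snprintf(buf, n,
        "__temp_1 = %s;\n"            /* value of the operand: result of x++ */
        "__temp_2 = __temp_1 + 1;\n"  /* incremented value */
        "__temp_3 = &%s;\n"           /* address, requested separately */
        "*__temp_3 = __temp_2;\n",    /* store back through the address */
        o->text, o->text);
}
```

The value path never needs to know whether the operand is an lvalue; only the address path does, so the check lives in exactly one place.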
Pseudo-code is a good thing: it allows easy debugging of the parser
and also easy processing by the optimizer. Pseudo-code should be defined
so that it satisfies the C standard's requirements (consider that if the
standard is a guide for programmers, for a compiler writer it's an
SRS) and so that it includes only operations supported by any sensible
Note that while you don't need to care about anything that is
"undefined behavior" (the generated code need not be meaningful), you
must add special processing rules for the standard's constraints.