Related articles |
---|
Re: slighty off topic -- writing an assembler! kmcguire3413@hotmail.com (Leonard Kevin McGuire Jr.) (2006-07-16) |
slightly off topic -- writing an assembler! SAMIGWE@worldnet.att.net (samuel) (1998-06-24) |
From: | "Leonard Kevin McGuire Jr." <kmcguire3413@hotmail.com> |
Newsgroups: | comp.compilers |
Date: | 16 Jul 2006 10:43:48 -0400 |
Organization: | Compilers Central |
References: | 98-06-126 |
Keywords: | assembler |
Posted-Date: | 16 Jul 2006 10:43:48 EDT |
[Note that this is a followup to a thread from 1998. -John]
>>From: samuel <SAMIGWE@worldnet.att.net>
>>Date: 24 Jun 1998 00:04:30 -0400
>>Good day all:
>>I am currently working on writing an assembler (intel syntax
>>for the x86 microprocessor)for my operating system project. I haven't
>>yet had any formal training on the design of one and havent been able
>>to find any "assembler design" books.
I am currently writing an assembler. Unfortunately, I started writing it
before finding this thread and reading the short description of a macro
assembler. The thread is quite dated, but that does not mean it can never
be useful; I found it useful in 2006.
I ended up creating a table for the MODRM and SIB bytes at compile time,
using macros to generate the rather large table. The table uses this format:
struct tmodrmsib_tbl
{
    dword type;
    sbyte *expression;
    bool hasSIB;
    byte modrm;
    byte sib;
};
I generated every possible addressing mode and its corresponding table
entry. The expression field holds a zero-terminated ASCII string such as
"eax", "eax+ebx", "eax*4", or "eax+ecx+$1".
I used $1, $2, and $3 to represent a dword, word, and byte displacement.
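A minimal sketch of how the SIB rows of such a table could be enumerated, assuming the standard x86 SIB layout (scale in bits 7-6, index in bits 5-3, base in bits 2-0). The names makeSib, sibExpression, SibRow, and generateSibRows are hypothetical and not from the post, which builds its table with macros at compile time; this runtime loop only illustrates the same enumeration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

typedef std::uint8_t byte;  // assumed to match the post's "byte"

// Register names in x86 encoding order (0 = eax ... 7 = edi).
static const char *const kReg32[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};

// Pack the SIB fields: log2(scale) in bits 7-6, index in 5-3, base in 2-0.
byte makeSib(unsigned scaleLog2, unsigned index, unsigned base) {
    assert(scaleLog2 < 4 && index < 8 && base < 8);
    return static_cast<byte>((scaleLog2 << 6) | (index << 3) | base);
}

// Build a lookup string such as "eax+ecx*4"; a scale of 1 is left implicit.
std::string sibExpression(unsigned scaleLog2, unsigned index, unsigned base) {
    std::string s = std::string(kReg32[base]) + "+" + kReg32[index];
    if (scaleLog2 != 0) s += "*" + std::to_string(1u << scaleLog2);
    return s;
}

struct SibRow {  // hypothetical row type, simpler than tmodrmsib_tbl
    std::string expression;
    byte sib;
};

// Enumerate base+index*scale forms with no displacement.
std::vector<SibRow> generateSibRows() {
    std::vector<SibRow> rows;
    for (unsigned scale = 0; scale < 4; ++scale)
        for (unsigned index = 0; index < 8; ++index)
            for (unsigned base = 0; base < 8; ++base) {
                if (index == 4) continue;  // index field 100b means "no index"
                if (base == 5) continue;   // with mod = 00 this means disp32, no base
                rows.push_back({sibExpression(scale, index, base),
                                makeSib(scale, index, base)});
            }
    return rows;
}
```

For example, makeSib(2, 1, 0) gives 0x88 for "eax+ecx*4". The two skipped cases (no index, and the base-less disp32 form) would need their own rows in a complete table.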
My table ended up looking like:
{A_PTR, "eax", false, 0x00, 0x00},
{A_PTR, "ecx", false, 0x01, 0x00},
{A_PTR, "edx", false, 0x02, 0x00},
{A_PTR, "ebx", false, 0x03, 0x00},
MASIB3(A_PTR, , 0x04), // generate all SIB possibilities - no displacement.
{A_PTR, "$3", false, 0x05, 0x00},
{A_PTR, "esi", false, 0x06, 0x00},
{A_PTR, "edi", false, 0x07, 0x00},
That covers the first addressing mode. I used a macro to generate the SIB entries.
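A minimal sketch of how a parsed address expression might be looked up in a table of this shape. The rows are copied from the post; the type aliases, the value of A_PTR, and the helper findAddressing are assumptions for illustration, and the SIB rows produced by the macro are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Type aliases assumed to match the post's dword/byte/sbyte names.
typedef std::uint32_t dword;
typedef std::uint8_t  byte;
typedef char          sbyte;

const dword A_PTR = 0x1;  // hypothetical flag value

struct tmodrmsib_tbl {
    dword        type;
    const sbyte *expression;  // const added so string literals are legal
    bool         hasSIB;
    byte         modrm;
    byte         sib;
};

// Rows from the post, without the macro-generated SIB entries.
static const tmodrmsib_tbl kTable[] = {
    {A_PTR, "eax", false, 0x00, 0x00},
    {A_PTR, "ecx", false, 0x01, 0x00},
    {A_PTR, "edx", false, 0x02, 0x00},
    {A_PTR, "ebx", false, 0x03, 0x00},
    {A_PTR, "$3",  false, 0x05, 0x00},
    {A_PTR, "esi", false, 0x06, 0x00},
    {A_PTR, "edi", false, 0x07, 0x00},
};

// Hypothetical helper: linear search for an already-normalized expression.
// Returns the matching row, or nullptr when the expression is not in the table.
const tmodrmsib_tbl *findAddressing(const char *expr) {
    assert(expr != nullptr);
    for (const tmodrmsib_tbl &row : kTable)
        if (std::strcmp(row.expression, expr) == 0) return &row;
    return nullptr;
}
```

With these rows, findAddressing("ebx") returns the entry whose modrm is 0x03, and findAddressing("esp") returns nullptr because that form needs a SIB row. A linear scan is only the simplest choice here; a large generated table would likely want a hash or a sorted search.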
I store instructions with:
struct tISet
{
    dword memonic;
    dword prefixs;
    word opcode;
    dword operand1;
    dword operand2;
};
So, it looks like this:
tISet ISet[] = {
{0xFFFFFFFF, 0, 0, 0, 0},
{ME_MOV,0, 0x0088, A_RM8, A_R8 | X86_O_R},
{ME_MOV,P_OSO, 0x0089, A_RM16, A_R16 | X86_O_R},
{ME_MOV,0, 0x0089, A_RM32, A_R32 | X86_O_R},
{ME_MOV,0, 0x008A, A_R8 | X86_O_R, A_RM8},
{ME_MOV,P_OSO, 0x008B, A_R16 | X86_O_R, A_RM16},
{ME_MOV,0, 0x008B, A_R32 | X86_O_R, A_RM32},
};
I define my flags so that A_RM32 = A_R32 | A_DWORDPTR, and so on. That way
one table entry can name several operand types, and an operand of any one of
those types matches it when the assembler chooses the correct instruction.
X86_O_R is ignored by the type checking and is handled later by the function
that writes out the arguments for the instruction.
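A minimal sketch of the flag matching described above, under stated assumptions: all numeric flag values are invented, A_BYTEPTR and A_WORDPTR are hypothetical names modeled on A_DWORDPTR, and operandMatches and findInstruction are illustrative helpers rather than the post's functions. The table rows are the ones from the post.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t dword;
typedef std::uint16_t word;

// Hypothetical flag values; only the names A_R*, A_RM*, A_DWORDPTR,
// X86_O_R, ME_MOV and P_OSO appear in the post.
const dword A_R8 = 0x01, A_R16 = 0x02, A_R32 = 0x04;
const dword A_BYTEPTR = 0x08, A_WORDPTR = 0x10, A_DWORDPTR = 0x20;
const dword A_RM8  = A_R8  | A_BYTEPTR;
const dword A_RM16 = A_R16 | A_WORDPTR;
const dword A_RM32 = A_R32 | A_DWORDPTR;
const dword X86_O_R = 0x80000000;  // encoding hint, not an operand type
const dword ME_MOV = 1, P_OSO = 1;

struct tISet {
    dword memonic;
    dword prefixs;
    word  opcode;
    dword operand1;
    dword operand2;
};

static const tISet ISet[] = {
    {0xFFFFFFFF, 0, 0, 0, 0},
    {ME_MOV, 0,     0x0088, A_RM8,           A_R8  | X86_O_R},
    {ME_MOV, P_OSO, 0x0089, A_RM16,          A_R16 | X86_O_R},
    {ME_MOV, 0,     0x0089, A_RM32,          A_R32 | X86_O_R},
    {ME_MOV, 0,     0x008A, A_R8  | X86_O_R, A_RM8},
    {ME_MOV, P_OSO, 0x008B, A_R16 | X86_O_R, A_RM16},
    {ME_MOV, 0,     0x008B, A_R32 | X86_O_R, A_RM32},
};

// True when the single operand type `actual` is one of the types the
// table entry allows; X86_O_R is masked out before the comparison.
bool operandMatches(dword allowed, dword actual) {
    assert((actual & X86_O_R) == 0);
    allowed &= ~X86_O_R;
    return actual != 0 && (allowed & actual) == actual;
}

// Return the first row whose mnemonic and both operands match, else nullptr.
const tISet *findInstruction(dword mnemonic, dword op1, dword op2) {
    for (const tISet &row : ISet)
        if (row.memonic == mnemonic && operandMatches(row.operand1, op1) &&
            operandMatches(row.operand2, op2))
            return &row;
    return nullptr;
}
```

With these values, findInstruction(ME_MOV, A_DWORDPTR, A_R32) selects the 0x0089 row with no prefix, and findInstruction(ME_MOV, A_R16, A_WORDPTR) selects the 0x008B row with P_OSO. Mismatched sizes such as (A_R8, A_R32) find no row.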
I pass this struct around between my functions to keep track of the
instruction-building process:
struct tipi
{
    bool wrotePrefix;
    dword prefix;
    bool wroteOpcode;
    word opcode;
    bool wroteMODRM;
    byte modrm;
    bool wroteSIB;
    byte sib;
    byte wroteDisplacement;
    union{
        dword displacement;
        sdword sdisplacement;
    };
    byte wroteIntermediate;
    dword intermediate;
};
The final step is reading this struct and writing out the bytes for the
instruction. So I do not think I built a macro assembler at all, but rather
something else. So far, this design has worked very well.
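A minimal sketch of what that final step could look like. The post does not say how the fields are interpreted, so everything here is an assumption: prefix is taken to hold a single prefix byte, a nonzero high byte in opcode is taken to be a leading escape byte, and wroteDisplacement and wroteIntermediate are taken to hold a size in bytes (0, 1, 2, or 4). The helpers putLE and emitInstruction are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::uint8_t  byte;
typedef std::uint16_t word;
typedef std::uint32_t dword;
typedef std::int32_t  sdword;

struct tipi {  // field names as in the post
    bool  wrotePrefix;
    dword prefix;
    bool  wroteOpcode;
    word  opcode;
    bool  wroteMODRM;
    byte  modrm;
    bool  wroteSIB;
    byte  sib;
    byte  wroteDisplacement;  // assumed: size in bytes, 0 if absent
    union {
        dword  displacement;
        sdword sdisplacement;
    };
    byte  wroteIntermediate;  // assumed: size in bytes, 0 if absent
    dword intermediate;
};

// Append the low `count` bytes of `value`, least significant byte first.
static void putLE(std::vector<byte> &out, dword value, unsigned count) {
    assert(count <= 4);
    for (unsigned i = 0; i < count; ++i)
        out.push_back(static_cast<byte>(value >> (8 * i)));
}

// Write the instruction in x86 order:
// prefix, opcode, ModRM, SIB, displacement, immediate.
void emitInstruction(const tipi &t, std::vector<byte> &out) {
    assert(t.wroteOpcode);
    if (t.wrotePrefix) out.push_back(static_cast<byte>(t.prefix));
    if (t.opcode > 0xFF) out.push_back(static_cast<byte>(t.opcode >> 8));
    out.push_back(static_cast<byte>(t.opcode));
    if (t.wroteMODRM) out.push_back(t.modrm);
    if (t.wroteSIB) out.push_back(t.sib);
    putLE(out, t.displacement, t.wroteDisplacement);
    putLE(out, t.intermediate, t.wroteIntermediate);
}
```

Under these assumptions, a tipi with opcode 0x0089 and modrm 0x03 and nothing else set emits the two bytes 89 03, which is the encoding of mov [ebx], eax.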
I am planning on packing just the core of generating the x86
instructions into this layer of the assembler, and the rest into a
preprocessor layer for the assembler, I suppose? =)
http://compilers.iecc.com/comparch/article/98-06-126