optimizing an asm-like language

"Remo D." <rdentato@news.tin.it>
Tue, 29 Jan 2008 22:35:15 +0100

          From comp.compilers

Related articles
optimizing an asm-like language rdentato@news.tin.it (Remo D.) (2008-01-29)
Re: optimizing an asm-like language cfc@shell01.TheWorld.com (Chris F Clark) (2008-01-30)
Re: optimizing an asm-like language rdentato@news.tin.it (Remo D.) (2008-02-03)
Re: optimizing an asm-like language walter@bytecraft.com (Walter Banks) (2008-02-04)
| List of all articles for this month |

From: "Remo D." <rdentato@news.tin.it>
Newsgroups: comp.compilers
Date: Tue, 29 Jan 2008 22:35:15 +0100
Organization: TIN.IT (http://www.tin.it)
Keywords: optimize, question
Posted-Date: 30 Jan 2008 00:41:49 EST

Hi there. I'm adding the capability to generate code to my regular
expression based tool (http://yrx.googlecode.com).

I've defined an assembly-like language to use as intermediate
code. The plan is to optimize this asm-like code and then convert it
in C (or possibly other languages).

The main reason to go through this intermediate language is to apply
optimization only once rather that having to devise optimization in
each target language.

The other reason is to be able to write a function to directly execute
this asm-like program.

The first question I have is: Should I use a different approach? I
briefly thought of interpreting the DFA but it didn't seem a good idea
to me. Also, I can foresee an issue on converting the ASM language to
another language that doesn't have goto's.

The second question is: are there any optimization techniques I should
absolutely know for optimizing asm-like code?

To give you an example, the code at the bottom of this post is
generated for the expression "[a-z]?n". The meaning of the
instructions should be clear enough, except, maybe, for Rxx
(conditionl returns; RNE = Return if Not Equal, RLT = Return if Less
Then, etc) and "MTC x" (MaTCh expression, signals that the expression
x has been matched).

The things I've thought so far are:
      - rearrange basic blocks to eliminate absolute jumps (e.g. the block
labeled S2, lines 24-27, could be moved to line 14 removing the "JMP S2")
      - eliminate dead code (line 15, line 21 and line 28 for example)
      - eliminate useless JMP (like the "JMP S3" at line 20)
Should I look for other optimizations?

Any comment will be very welcome.


    1 S1: GET
    2 L2: CMP 'n'
    3 JGT L3
    4 JEQ S4
    5 L1: CMP 'a'
    6 RLT
    7 CMP 'm'
    8 RGT
    9 JMP S2
10 L3: CMP 'o'
11 RLT
12 CMP 'z'
13 RGT
14 JMP S2
15 RET
16 S4: GET
17 MTC 1
18 L4: CMP 'n'
19 RNE
20 JMP S3
21 RET
22 S3: MTC 1
23 RET
24 S2: GET
25 L5: CMP 'n'
26 RNE
27 JMP S3
28 RET

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.