Re: CISC to RISC translator? (Roedy Green)
Thu, 18 Aug 1994 15:13:00 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: (Roedy Green)
Keywords: architecture, translator
Organization: Canadian Mind Products
References: 94-08-099
Date: Thu, 18 Aug 1994 15:13:00 GMT

Difficulties in writing a CISC->RISC translator

1. Loss of information on intent.
When you convert from MASM to machine code, or from a high-level language
to machine code, information about the programmer's intent is lost. For
example, in the Intel 80x86 line:
    MOV BX, 9999
    MOV BX, OFFSET PlaceInRAM

generate identical machine code if PlaceInRAM just happens by accident to
be at address 9999.
In translating such code, you can't tell which was intended. In your
translated version, PlaceInRAM will almost certainly not have the same
accidental address.
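
A minimal sketch of the ambiguity, in Python. It assumes the 16-bit
encoding of MOV BX, imm16 (opcode 0xBB followed by a little-endian
immediate); the helper name and the symbol table are made up for
illustration. Loading the constant 9999 and loading the address of
PlaceInRAM produce the same three bytes, so a translator looking only at
the bytes cannot tell them apart.

```python
import struct

def encode_mov_bx_imm16(value):
    """Encode 16-bit `MOV BX, imm16`: opcode 0xBB, then the immediate, little-endian."""
    return bytes([0xBB]) + struct.pack("<H", value & 0xFFFF)

# Hypothetical symbol table: the assembler happened to place PlaceInRAM at 9999.
symbols = {"PlaceInRAM": 9999}

as_constant = encode_mov_bx_imm16(9999)                  # programmer meant a number
as_address = encode_mov_bx_imm16(symbols["PlaceInRAM"])  # programmer meant an address

print(as_constant.hex(), as_address.hex())
print("identical:", as_constant == as_address)
```

Once the symbol table is gone, nothing in the byte stream records which of
the two the programmer meant.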

2. Determining linear code fragments:
If you could break the code up into fragments, and know that no one could
jump into the middle of a fragment, you would be fairly free to optimise
to your heart's content. All you need do is reasonably simulate the state
of the original machine at both ends of the code fragment. In machine
code, there is nothing to specially mark such entry points. The only way
you could find them is to watch the program run for several hours, then add
to that by tracing all possible flow paths to pick up error/failure paths
that did not occur. Some paths can't be found even in theory, for example
through a jump table, since you don't know ahead of time the actual range
of index values that might come down the pipe. Further, to make it really
intractable, some jump tables are dynamically constructed by any
conceivable algorithm.
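
As a rough illustration of the static-tracing half of this, here is a
Python sketch over a toy instruction set. The opcodes "op", "jcc", "jmp",
"jmpi" and "ret" and the program itself are invented for the example; no
real 80x86 decoding is attempted. It follows every flow path it can see,
records where fragments must begin, and gives up at the indirect jump.

```python
def find_entry_points(program, start):
    """Trace statically visible flow paths in a toy program.

    Returns (leaders, unresolved, seen): addresses where a fragment must
    begin, indirect jumps whose targets are unknown, and every address
    that was reached.
    """
    addrs = sorted(program)
    next_addr = dict(zip(addrs, addrs[1:]))  # fall-through successor
    leaders, unresolved, seen = {start}, set(), set()
    work = [start]
    while work:
        pc = work.pop()
        while pc is not None and pc not in seen:
            seen.add(pc)
            op, target = program[pc]
            if op == "jmp":        # unconditional: only the target follows
                leaders.add(target)
                work.append(target)
                pc = None
            elif op == "jcc":      # conditional: target and fall-through
                leaders.add(target)
                work.append(target)
                pc = next_addr.get(pc)
                if pc is not None:
                    leaders.add(pc)
            elif op == "jmpi":     # jump through a table: target unknown
                unresolved.add(pc)
                pc = None
            elif op == "ret":
                pc = None
            else:                  # ordinary instruction
                pc = next_addr.get(pc)
    return leaders, unresolved, seen

# Toy program: address -> (opcode, target). Addresses 6 and 7 are reachable
# only through the jump table used by the indirect jump at address 3.
program = {
    0: ("op", None), 1: ("jcc", 4), 2: ("op", None), 3: ("jmpi", None),
    4: ("op", None), 5: ("ret", None), 6: ("op", None), 7: ("ret", None),
}

leaders, unresolved, seen = find_entry_points(program, 0)
never_reached = sorted(set(program) - seen)
print("entry points:", sorted(leaders))
print("unresolved indirect jumps:", sorted(unresolved))
print("never reached statically:", never_reached)
```

The code at 6 and 7 is exactly the kind of entry point that only a run of
the program, or knowledge of the table contents, would reveal.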

3. Too much state information:
CISC machines tend to set many condition code bits on every instruction,
whether the results will ever be used or not. RISC machines generally do
not. A translator has to decide how much of this bubblegum to bother
simulating. 99% of it is not really necessary, but how can it be sure?
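
One way to be sure within a single linear fragment is a backward liveness
pass over the flags. The Python sketch below is an assumption-laden
illustration, not a full solution: the flag set is cut down to four, the
fragment is hand-written, and unless told otherwise it assumes every flag
may still be wanted at the end of the fragment.

```python
ALL_FLAGS = frozenset({"CF", "ZF", "SF", "OF"})  # simplified flag set

def flags_to_simulate(fragment, live_out=ALL_FLAGS):
    """For each instruction, return the flags whose values can still be observed.

    `fragment` is a list of (text, flags_written, flags_read) tuples.
    `live_out` is the set of flags assumed to be wanted after the fragment.
    """
    live = set(live_out)
    needed = []
    for text, writes, reads in reversed(fragment):
        needed.append((text, live & writes))  # only these results matter
        live = (live - writes) | reads        # overwritten flags die, read flags live
    needed.reverse()
    return needed

# Hypothetical fragment: both arithmetic instructions write all four flags,
# but only the second one's ZF is ever tested.
fragment = [
    ("ADD AX, BX", set(ALL_FLAGS), set()),
    ("SUB CX, 1", set(ALL_FLAGS), set()),
    ("JZ done", set(), {"ZF"}),
]

conservative = flags_to_simulate(fragment)
optimistic = flags_to_simulate(fragment, live_out=frozenset())
for text, flags in conservative:
    print(f"{text:12} must simulate: {sorted(flags)}")
```

Even the conservative pass shows that the ADD needs no flag simulation at
all, because the SUB overwrites everything before anyone looks. The hard
part remains the fragment boundaries: if someone can jump into the middle,
the analysis no longer holds.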

A more promising approach is to use a P-CODE, designed to be easy to
translate into either native CISC or native RISC code. It would be
like a super CISC instruction set, with extra baggage to aid in the
translation. It could then be translated/compiled as needed at load time.
This would give the advantage of compact, fast-loading executables that
could run on many platforms. Making the P-CODE set very CISCy gives
plenty of room for RISC to optimise WITHIN each instruction.
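
A toy Python sketch of the idea, with everything in it invented for
illustration: a single P-CODE operation ADDM (add one memory variable to
another), an 80x86-flavoured expansion, and a generic load/store RISC
expansion with made-up mnemonics. The point is only that one high-level
P-CODE instruction leaves each target free to pick its own sequence.

```python
def translate(pcode, target):
    """Expand hypothetical P-CODE into a list of target assembly lines."""
    out = []
    for op, dst, src in pcode:
        if op != "ADDM":  # the only operation this sketch knows
            raise ValueError(f"unknown P-CODE op: {op}")
        if target == "cisc":
            # memory operands allow a short sequence
            out += [f"MOV AX, [{src}]", f"ADD [{dst}], AX"]
        elif target == "risc":
            # load/store machine: everything goes through registers
            out += [f"LW r1, {dst}", f"LW r2, {src}",
                    "ADD r1, r1, r2", f"SW r1, {dst}"]
        else:
            raise ValueError(f"unknown target: {target}")
    return out

# total += delta, expressed once in P-CODE
pcode = [("ADDM", "total", "delta")]

cisc_code = translate(pcode, "cisc")
risc_code = translate(pcode, "risc")
print("\n".join(cisc_code))
print("\n".join(risc_code))
```

Because the translator sees the whole ADDM at once, it knows the
intermediate registers are scratch and that no flags need to survive,
which is the information a machine-code translator has to guess at.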
