Re: x86 Instruction Scheduling

Vladimir Makarov <vmakarov@redhat.com>
5 Jun 2003 23:15:59 -0400

          From comp.compilers

Related articles
x86 Instruction Scheduling nicolas_capens@hotmail.com (2003-06-03)
Re: x86 Instruction Scheduling vmakarov@redhat.com (Vladimir Makarov) (2003-06-05)
Re: x86 Instruction Scheduling qed@pobox.com (2003-06-20)
| List of all articles for this month |
From: Vladimir Makarov <vmakarov@redhat.com>
Newsgroups: comp.compilers
Date: 5 Jun 2003 23:15:59 -0400
Organization: Red Hat, Inc.
References: 03-06-020
Keywords: 386, optimize
Posted-Date: 05 Jun 2003 23:15:59 EDT

c0d1f1ed wrote:
> I have a project swShader (http://sw-shader.sourceforge.net) which
> could benefit a lot from instruction scheduling. Basically it's a
> compiler for DirectX 9 pixel shader instructions. Every shader
> instruction gets substituted by MMX/SSE instructions, and register
> allocation is handled automatically. Because of this substitution,
> dependent instructions are close together, causing inefficiencies. The
> ps 2.0 specifications do not have jumps or calls, so I often have more
> than a hundred instructions that could be scheduled as one block.
>
> So what kind of fast heuristics should 'always work'? I'm also looking
> for more than one heuristic, or a very configurable one so the optimal
> can be manually selected after benchmarking so it's more or less
> system independent. Because of the out-of-order execution and many
> unknown parameters I think it's hard to solve this perfectly, but what
> is considered a good method for MMX/SSE instruction sheduling?


It is still important to have insn scheduling for out-of-order
speculative execution processors for big basic blocks and loops
because of constraints of processor register renaming buffer and
because constraints of number of branches which the processor goes
trough for speculative execution.


Optimal solution of insn scheduling is NP-complete problem. Therefore
heuristic algorithms are used. So usually two major used heuristics
are critical path length or issuing as many insns as possible on a
processor cycle (see Steven Muchnick's book). There are approaches
combining the two major heuristics (see the first cycle multipass insn
scheduling in article "The finite state automaton based pipeline
hazard recognizer and instruction scheduler in Gcc" in
http://www.linux.org.uk/~ajh/gcc/gccsummit-2003-proceedings.pdf ).


In your case, the best heuristic is critical path length because P4
can issue only 1 MMX/SSE insn per cycle and such insns have long
latency time. So I think a list basic block insn scheduling based on
critical path length heuristic is the most reasonable approach for
you.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.