Transform vectorized instruction with "swizzle" and "writemask" to SSA forms

tzuchien.chiu@gmail.com (Tzu-Chien Chiu)
26 Apr 2005 20:41:35 -0400

          From comp.compilers

From: tzuchien.chiu@gmail.com (Tzu-Chien Chiu)
Newsgroups: comp.compilers
Date: 26 Apr 2005 20:41:35 -0400
Organization: Compilers Central
Keywords: question, code
Posted-Date: 26 Apr 2005 20:41:35 EDT

Hello, everyone:


I am writing a compiler for programmable graphics hardware. Each
register of the hardware has four channels, namely 'r', 'g', 'b',
'a' (written 'x', 'y', 'z', 'w' in the examples below), and each
channel is a 32-bit floating-point value. This is similar to how the
high and low 8 bits of the x86 16-bit general-purpose register "AX"
can be individually referenced as "AH" and "AL". What's different is
that the hardware further supports "source register swizzle" and
"writemask". For example:


        # The following two instructions are equivalent.
        # They take the same number of instruction slots and have
        # the same execution time. Four channels are added in parallel.
        add r0, r1, r2
        add r0.xyzw, r1.xyzw, r2.xyzw


        # equivalent to:
        # r0.x = r1.y + r2.w
        # r0.z = r1.y + r2.x
        # r0.y and r0.w remain unchanged
        add r0.xz, r1.y, r2.wx


Note that the channel y of r1 is replicated in the third instruction.
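
The semantics above can be sketched in Python. This is only an
illustration, not the hardware definition: the helper names (`expand`,
`add`) are hypothetical, and it assumes that the i-th written channel
reads the i-th entry of each source swizzle, with a short swizzle
repeating its last channel. That is how the comments above read; real
hardware may define swizzles differently.

```python
CHANNELS = "xyzw"

def expand(swz, n):
    """Repeat the last channel of a short swizzle until it has n entries (assumed rule)."""
    return swz + swz[-1] * (n - len(swz))

def add(dst, mask, a, a_swz, b, b_swz):
    """Model of `add dst.mask, a.a_swz, b.b_swz` on 4-element lists."""
    a_swz = expand(a_swz, len(mask))
    b_swz = expand(b_swz, len(mask))
    out = list(dst)  # channels outside the writemask keep their old value
    for i, c in enumerate(mask):
        lhs = a[CHANNELS.index(a_swz[i])]
        rhs = b[CHANNELS.index(b_swz[i])]
        out[CHANNELS.index(c)] = lhs + rhs
    return out

r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [10.0, 20.0, 30.0, 40.0]

full = add(r0, "xyzw", r1, "xyzw", r2, "xyzw")  # add r0, r1, r2
masked = add(r0, "xz", r1, "y", r2, "wx")       # add r0.xz, r1.y, r2.wx
```

With these inputs, `masked` leaves the y and w channels of r0 at their
old values and writes only x and z.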


Detailed documentation:
<http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/directx/graphics/reference/AssemblyLanguageShaders/PixelShaders/Registers/Modifiers/SourceRegisterModifiers/PS_Swizzling.asp>


I want to transform the code to Static Single Assignment (SSA) form.


(1) Treat each channel of a register as an individual SSA variable.


This could generate inefficient machine code.


For example, the instruction:


    add r0.xz, r1.y, r2.wx


is translated to two SSA assignments:


    r0_x = r1_y + r2_w
    r0_z = r1_y + r2_x


Subsequent optimization passes could insert other instructions between
these two add instructions (for example, in an instruction scheduling
pass). I don't know how they could easily be merged back into one
instruction. The resulting machine code would be correct but could be
inefficient.
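
A minimal Python sketch of this per-channel translation, under the same
reading of swizzles as the example above (the i-th written channel
reads the i-th swizzle entry, a short swizzle repeats its last
channel). The function name `scalarize` and the tuple layout are
hypothetical, and SSA version numbering is omitted for brevity.

```python
def scalarize(op, dst, mask, srcs):
    """Split one vector instruction into per-channel scalar assignments.

    srcs is a list of (register, swizzle) pairs. Each result is a tuple
    (destination variable, opcode, operand variables).
    """
    scalars = []
    for i, c in enumerate(mask):
        operands = []
        for reg, swz in srcs:
            # Assumed rule: a short swizzle repeats its last channel.
            padded = swz + swz[-1] * (len(mask) - len(swz))
            operands.append(f"{reg}_{padded[i]}")
        scalars.append((f"{dst}_{c}", op, tuple(operands)))
    return scalars

# add r0.xz, r1.y, r2.wx
scalars = scalarize("add", "r0", "xz", [("r1", "y"), ("r2", "wx")])
for dest, op, operands in scalars:
    print(dest, "=", op, *operands)
```

Note that nothing in the resulting tuples records that the two
assignments came from the same vector instruction, which is why merging
them back later is not straightforward.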


(2) Create new instructions, "swizzle" and "merge".


    # A swizzle instruction acts like a channel "selector",
    # selecting channels from the temporary registers r1 and r2.
    temp_0 = swizzle.yy( r1 )
    temp_1 = swizzle.wx( r2 )


    temp_3 = temp_0 + temp_1


    temp_4 = merge.xz( temp_3 )
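
To make the intended meaning of these two pseudo-instructions concrete,
here is a small Python model. It is a sketch under assumptions: the
function names are hypothetical, `swizzle` is taken to gather channels
into a packed vector, and `merge` is given an extra operand holding the
previous value of the destination register so that the unwritten
channels can be carried over. The `merge.xz( temp_3 )` above has no
such operand; adding it is my assumption about how the old y and w
values would stay visible in SSA form.

```python
CHANNELS = "xyzw"

def swizzle(pattern, reg):
    """Gather the named channels of reg into a packed vector."""
    return [reg[CHANNELS.index(c)] for c in pattern]

def vadd(a, b):
    """Element-wise add of two packed vectors of equal length."""
    return [x + y for x, y in zip(a, b)]

def merge(mask, new, old):
    """Scatter packed `new` into the masked channels of a copy of `old`."""
    out = list(old)  # `old` itself is never modified (single assignment)
    for value, c in zip(new, mask):
        out[CHANNELS.index(c)] = value
    return out

r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [10.0, 20.0, 30.0, 40.0]

temp_0 = swizzle("yy", r1)
temp_1 = swizzle("wx", r2)
temp_3 = vadd(temp_0, temp_1)
temp_4 = merge("xz", temp_3, r0)  # new SSA value standing for r0 after the add
```

Each temporary is assigned exactly once, and the vector add stays a
single operation on whole values.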




Does anyone know of other possible, easier alternatives?


