[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

Fri Jul 29 00:48:49 PDT 2005

Actually, the problems that Tzu-Chien Chiu is encountering are similar to what needs to be done to generate SSE code in 
the X86 backend, and for other SIMD instruction sets as well. I think LLVM needs to add instructions for permuting 
components and for extracting and inserting elements in packed types. If the architecture has instructions that can 
apply a permutation as part of each operation (for example, 'add' with permutation), it should be the role of the 
pattern instruction selector to recognise the shuffle+add combination and emit a single instruction.
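
To make the shuffle+add idea concrete, here is a minimal sketch of such a fold. It is written in Python purely for brevity and is not LLVM code: the node classes `Reg`, `Shuffle`, `Add` and the helpers `fold_operand` and `select_add` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass

# Hypothetical expression nodes for illustration; these are not LLVM classes.
@dataclass(frozen=True)
class Reg:
    name: str

@dataclass(frozen=True)
class Shuffle:
    src: object   # operand being permuted
    mask: str     # four selectors, e.g. "xxyy"

@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object

IDENTITY = "xyzw"  # a swizzle that leaves the operand unchanged

def fold_operand(node):
    """Return (register, swizzle), absorbing one shuffle if present."""
    if isinstance(node, Shuffle) and isinstance(node.src, Reg):
        return node.src, node.mask
    if isinstance(node, Reg):
        return node, IDENTITY
    raise ValueError("operand must be selected separately")

def select_add(node, dest):
    """Emit a single 'add' whose operands carry their own swizzles."""
    def fmt(reg, swz):
        return reg.name if swz == IDENTITY else f"{reg.name}.{swz}"
    lhs = fold_operand(node.lhs)
    rhs = fold_operand(node.rhs)
    return f"add {dest}, {fmt(*lhs)}, {fmt(*rhs)}"
```

With this, `select_add(Add(Shuffle(Reg("r1"), "xxyy"), Shuffle(Reg("r3"), "zzzz")), "r0")` yields one instruction with both permutations attached, rather than two shuffles followed by an add.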

m.

Tzu-Chien Chiu wrote:
> Each register is a 4-component (namely, r, g, b, a) vector register. 
> They are actually defined as llvm packed [4xfloat].
> 
> The instruction:
> 
>   add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
> 
> Explanation:
> 
> '.a' is a writemask. only the specified component will be updated
> 
> '.xxyy' and '.zzzz' are swizzle masks. they specify the component
> permutation, similar to the Intel SSE permutation instruction SHUFPD
> 
> '_bias' and '_x2' are modifiers. they modify the value of source
> operands and send the modified values to the adder. '_bias' = source -
> 0.5, '_x2' = source * 2
> 
> '_sat' is an instruction modifier. when specified, it saturates (or
> clamps) the instruction result to the range [0, 1] before writing to
> the destination register.
> 
> All of these 'writemask', 'swizzle', 'source modifier', and
> 'instruction modifiers' are optionally specified.
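
As a reading aid, the semantics described above can be written out as a small reference model. This is only a sketch in Python following the description in the quoted text (bias = source - 0.5, x2 = source * 2, saturate clamps to [0, 1]); the function names are made up for the illustration, and registers are modelled as plain 4-element lists.

```python
# Component names: both the xyzw and rgba spellings map to indices 0..3.
INDEX = {"x": 0, "y": 1, "z": 2, "w": 3, "r": 0, "g": 1, "b": 2, "a": 3}

def swizzle(v, mask):
    """Permute or replicate components, e.g. 'xxyy'."""
    return [v[INDEX[c]] for c in mask]

def bias(v):
    """Source modifier '_bias': source - 0.5."""
    return [c - 0.5 for c in v]

def x2(v):
    """Source modifier '_x2': source * 2."""
    return [c * 2.0 for c in v]

def saturate(v):
    """Instruction modifier '_sat': clamp each component to [0, 1]."""
    return [min(max(c, 0.0), 1.0) for c in v]

def write_masked(dest, value, mask):
    """Update only the destination components named in the writemask."""
    out = list(dest)
    for c in mask:
        out[INDEX[c]] = value[INDEX[c]]
    return out

def add_sat_example(r0, r1, r3):
    """Model of: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz"""
    a = swizzle(bias(r1), "xxyy")
    b = swizzle(x2(r3), "zzzz")
    total = saturate([p + q for p, q in zip(a, b)])
    return write_masked(r0, total, "a")
```

Because the source modifiers act on each component independently, applying them before or after the swizzle gives the same result; the model applies the modifier first.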
> 
> How should I define the instruction in a TableGen .td file?
> 
> I have two alternatives:
> 
> 1. 
>   class WriteMask : Operand<i8> {}
>   def WM : WriteMask;
> 
>   class Swizzle : Operand<i8> {}
>   def SW : Swizzle;
> 
>   class InstructionModifier : Operand<i8> {}
>   def IM : InstructionModifier;
> 
>   class SourceModifier : Operand<i8> {}
>   def SM : SourceModifier;
> 
>   def ADD<0x01, (ops
>     GPR:$dest, WM:$wm, IM:$im,
>     GPR:$src0, SW:$sw0, SM:$sm0,
>     GPR:$src1, SW:$sw1, SM:$sm1 ), ... >
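
For what it is worth, an 8-bit operand is enough for each of these: a swizzle has four selectors of 2 bits each, and a writemask needs one bit per component. The sketch below shows one possible packing. The bit layout (first selector in the low bits) is an assumption chosen for the example, not something taken from the Direct3D encoding, and the function names are hypothetical.

```python
# Both the xyzw and rgba spellings map to component indices 0..3.
INDEX = {"x": 0, "y": 1, "z": 2, "w": 3, "r": 0, "g": 1, "b": 2, "a": 3}
NAMES = "xyzw"

def encode_swizzle(mask):
    """Pack four selectors, 2 bits each; first selector in the low bits."""
    if len(mask) != 4:
        raise ValueError("a swizzle needs exactly four selectors")
    value = 0
    for i, c in enumerate(mask):
        value |= INDEX[c] << (2 * i)
    return value

def decode_swizzle(value):
    """Inverse of encode_swizzle, using the xyzw spelling."""
    return "".join(NAMES[(value >> (2 * i)) & 3] for i in range(4))

def encode_writemask(mask):
    """One bit per enabled destination component."""
    value = 0
    for c in mask:
        value |= 1 << INDEX[c]
    return value
```

Under this layout 'xxyy' encodes as 0x50, 'zzzz' as 0xAA, and the writemask '.a' as 0x08, so each fits in the i8 operands defined above.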
> 
> 2. add llvm intrinsics:
> 
>   ; add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
>   r1_1 = llvm.bias( r1_0 )
>   r1_2 = llvm.shuffle( r1_1, xxyy )
>   r3_1 = llvm.x2( r3_0 )
>   r3_2 = llvm.shuffle( r3_1, zzzz )
>   r0_0 = add r1_2, r3_2
>   r0_1 = llvm.saturate( r0_0 )
>   r0_2 = llvm.select( r0_1, a )
> 
> but it makes implementing the instruction selector very difficult.
> in this example, llvm.select() and llvm.saturate() are encountered first
> (bottom-up), but they must be 'remembered' and the instruction cannot
> be generated (BuildMI) until the opcode is known.
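
The 'remembering' need not be elaborate. If matching starts at the outermost node of the expression and works inward, the writemask and saturation wrappers can be recorded in local variables and the instruction emitted once, when the opcode node is reached. The following is a rough Python sketch of that control flow only: the tuple-based tree and the names `select_instruction` and `operand` are hypothetical, and a real selector would build a machine instruction instead of a string.

```python
def operand(node):
    """Peel an optional shuffle and an optional source modifier off a register."""
    swz = mod = ""
    if node[0] == "shuffle":             # ("shuffle", mask, value)
        _, mask, node = node
        swz = "." + mask
    if node[0] in ("bias", "x2"):        # ("bias", value) or ("x2", value)
        mod, node = "_" + node[0], node[1]
    return node[1] + mod + swz           # ("reg", name)

def select_instruction(node, dest):
    """Walk inward from the root, recording wrappers until the opcode is known."""
    writemask, sat = "", False
    if node[0] == "select":              # ("select", mask, value)
        _, writemask, node = node
    if node[0] == "saturate":            # ("saturate", value)
        sat, node = True, node[1]
    opcode, lhs, rhs = node              # e.g. ("add", lhs, rhs)
    name = opcode + ("_sat" if sat else "")
    target = dest + ("." + writemask if writemask else "")
    # Everything is known at this point, so emission happens exactly once.
    return f"{name} {target}, {operand(lhs)}, {operand(rhs)}"
```

Feeding it the tree for the example above, with the select at the root and the registers at the leaves, reproduces `add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz` as a single instruction.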
> 
> Which one should I do?
>