[LLVMdev] [Mesa3d-dev] Folding vector instructions

Tue Dec 30 06:39:56 PST 2008

Alex wrote:
> Hello.
> 
> Sorry, I am not sure whether this question should go to the llvm or the
> mesa3d-dev mailing list, so I am posting it to both.
> 
> I am writing an LLVM backend for a modern graphics processor which has an
> ISA very similar to that of Direct3D.
> 
> I am reading the code in Gallium-3D driver in a mesa3d branch, which
> converts the shader programs (TGSI tokens) to LLVM IR.
> 
> For shader instructions that also exist in LLVM IR, the conversion is
> trivial:
> 
> <code>
> llvm::Value * Instructions::mul(llvm::Value *in1, llvm::Value *in2) {
>    // m_builder is a llvm::IRBuilder
>    return m_builder.CreateMul(in1, in2, name("mul"));
> }
> </code>
> 
> However, the special instructions, like "min", cannot be mapped directly to
> LLVM IR. The conversion involves extracting the vector elements, creating a
> less-than compare and a 'select' instruction for each component, and then
> creating 'insert-element' instructions to rebuild the vector.
> 
> <code>
> llvm::Value * Instructions::min(llvm::Value *in1, llvm::Value *in2)
> {
>    // generate LLVM extract element
>    std::vector<llvm::Value*> vec1 = extractVector(in1);
>    std::vector<llvm::Value*> vec2 = extractVector(in2);
> 
>    Value *xcmp  = m_builder.CreateFCmpOLT(vec1[0], vec2[0], name("xcmp"));
>    Value *selx = m_builder.CreateSelect(xcmp, vec1[0], vec2[0],
>                                         name("selx"));
> 
>    Value *ycmp  = m_builder.CreateFCmpOLT(vec1[1], vec2[1], name("ycmp"));
>    Value *sely = m_builder.CreateSelect(ycmp, vec1[1], vec2[1],
>                                         name("sely"));
> 
>    Value *zcmp  = m_builder.CreateFCmpOLT(vec1[2], vec2[2], name("zcmp"));
>    Value *selz = m_builder.CreateSelect(zcmp, vec1[2], vec2[2],
>                                         name("selz"));
> 
>    Value *wcmp  = m_builder.CreateFCmpOLT(vec1[3], vec2[3], name("wcmp"));
>    Value *selw = m_builder.CreateSelect(wcmp, vec1[3], vec2[3],
>                                         name("selw"));
>    // generate LLVM 'insert-element'
>    return vectorFromVals(selx, sely, selz, selw);
> }
> </code>
> 
> Eventually all of these should be folded into a single 'min' instruction in
> the codegen, so I wonder whether having the conversion generate just a simple
> 'call' instruction to a 'min' function would make instruction selection
> easier (no folding and no complicated pattern-matching in the instruction
> selection DAG).
> 
> I don't have experience with the new vector instructions in LLVM, and perhaps
> that is why folding the swizzle and writemask seems complicated to me.
> 
> Thanks.

I hope marcheu sees this too.

Um, I was thinking that we should eventually create intrinsic functions
for some of the commands, like LIT, that might not be
single-instruction, but that can be lowered eventually, and for commands
like LG2, that might be single-instruction for shaders, but probably not
for non-shader chipsets.

Unfortunately, I'm still learning LLVM, so I might be completely and
totally off-base here.

Out of curiosity, which chipset are you working on? R600? NV50?
Something else?

~ C.