[LLVMdev] Folding vector instructions

Mon Dec 29 18:29:19 PST 2008

Hello.

Sorry, I am not sure whether this question should go to the llvm or the
mesa3d-dev mailing list, so I am posting it to both.

I am writing an LLVM backend for a modern graphics processor which has an ISA
very similar to that of Direct3D.

I am reading the code of the Gallium3D driver in a mesa3d branch, which
converts the shader programs (TGSI tokens) to LLVM IR.

For shader instructions that also exist in LLVM IR, the conversion is trivial:

<code>
llvm::Value * Instructions::mul(llvm::Value *in1, llvm::Value *in2) {
   // m_builder is an llvm::IRBuilder
   return m_builder.CreateMul(in1, in2, name("mul"));
}
</code>

However, special instructions such as "min" cannot be mapped directly to
LLVM IR. The conversion has to extract the elements of each vector, create a
less-than compare and a 'select' instruction for each element, and then
create 'insert-element' instructions to rebuild the result vector.

<code>
llvm::Value * Instructions::min(llvm::Value *in1, llvm::Value *in2)
{
   // generate LLVM 'extract-element' instructions
   std::vector<llvm::Value*> vec1 = extractVector(in1);
   std::vector<llvm::Value*> vec2 = extractVector(in2);

   Value *xcmp  = m_builder.CreateFCmpOLT(vec1[0], vec2[0], name("xcmp"));
   Value *selx = m_builder.CreateSelect(xcmp, vec1[0], vec2[0],
                                        name("selx"));

   Value *ycmp  = m_builder.CreateFCmpOLT(vec1[1], vec2[1], name("ycmp"));
   Value *sely = m_builder.CreateSelect(ycmp, vec1[1], vec2[1],
                                        name("sely"));

   Value *zcmp  = m_builder.CreateFCmpOLT(vec1[2], vec2[2], name("zcmp"));
   Value *selz = m_builder.CreateSelect(zcmp, vec1[2], vec2[2],
                                        name("selz"));

   Value *wcmp  = m_builder.CreateFCmpOLT(vec1[3], vec2[3], name("wcmp"));
   Value *selw = m_builder.CreateSelect(wcmp, vec1[3], vec2[3],
                                        name("selw"));
   // generate LLVM 'insert-element' instructions
   return vectorFromVals(selx, sely, selz, selw);
}
</code>
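
To make clear what that sequence computes per component, here is a minimal
sketch in plain C++ with no LLVM involved. The names Vec4 and
componentwiseMin are made up for illustration and are not part of the
Gallium code. Note that with an ordered less-than compare, a NaN in either
operand makes the compare false, so the second operand is selected.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4-component shader register (x, y, z, w).
using Vec4 = std::array<float, 4>;

// Mirrors the per-lane sequence above: extract each element, do an ordered
// less-than compare, select one of the two inputs, and rebuild the vector.
Vec4 componentwiseMin(const Vec4 &a, const Vec4 &b) {
    Vec4 result{};
    for (std::size_t i = 0; i < 4; ++i) {
        const bool lessThan = a[i] < b[i];   // plays the role of CreateFCmpOLT
        result[i] = lessThan ? a[i] : b[i];  // plays the role of CreateSelect
    }
    return result;
}
```

This is the behaviour a single hardware 'min' would have to reproduce if the
four compare/select pairs are folded into it.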

Eventually all of these should be folded back into a single 'min' instruction
in the codegen, so I wonder whether having the conversion emit just a simple
'call' instruction to a 'min' function would make instruction selection
easier (no folding and no complicated pattern-matching in the instruction
selection DAG).

I don't have experience with the new vector instructions in LLVM, and perhaps
that is why folding the swizzle and writemask seems complicated to me.
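
For reference, this is what I mean by swizzle and writemask, as a minimal
sketch in plain C++. The names Vec4, swizzle and applyWriteMask, and the
choice of bit i of the mask for component i, are assumptions made for
illustration only. As far as I can tell, the swizzle corresponds naturally to
a 'shufflevector' with a constant mask in LLVM IR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 4-component shader register (x, y, z, w).
using Vec4 = std::array<float, 4>;

// Swizzle: component i of the result is read from source component sel[i],
// so sel = {3, 3, 0, 1} means ".wwxy".
Vec4 swizzle(const Vec4 &src, const std::array<unsigned, 4> &sel) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        assert(sel[i] < 4 && "swizzle index must name x, y, z or w");
        out[i] = src[sel[i]];
    }
    return out;
}

// Writemask: if bit i of mask is set, component i is overwritten with
// value[i]; otherwise the old destination component is kept.
Vec4 applyWriteMask(const Vec4 &dst, const Vec4 &value, unsigned mask) {
    Vec4 out = dst;
    for (std::size_t i = 0; i < 4; ++i) {
        if (mask & (1u << i)) {
            out[i] = value[i];
        }
    }
    return out;
}
```

A shader instruction with a swizzled source and a masked destination is then
applyWriteMask(dst, op(swizzle(src, sel)), mask), and that whole expression
is what I would like the instruction selector to fold into one instruction.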

Thanks.