[llvm-commits] [llvm] r76843 - in /llvm/trunk: lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/vec_insertps-1.ll

Wed Jul 22 23:07:29 PDT 2009

Right. Also, please stick with the usual naming convention

+    def match_rr : SS4AIi8<opc, MRMSrcReg, (outs VR128:$dst),

This will add a INSERTPSmatch_rr instruction, which is rather strange.

In this case, your INSERTPSrr is correct, the previous one, which  
specifies a FR32 as source 2, is wrong.  The register form should take  
a 128-bit vector value. Please keep your INSERTPSrr instruction and  
pattern.

Next step is to fix X86TargetLowering::LowerINSERT_VECTOR_ELT_SSE4 in  
X86ISelLowering.cpp

This part is wrong since INSERTPSrr expects a vector source operand.

   } else if (EVT == MVT::f32 && isa<ConstantSDNode>(N2)) {
     // Bits [7:6] of the constant are the source select.  This will  
always be
     //  zero here.  The DAG Combiner may combine an extract_elt index  
into these
     //  bits.  For example (insert (extract, 3), 2) could be matched  
by putting
     //  the '3' into bits [7:6] of X86ISD::INSERTPS.
     // Bits [5:4] of the constant are the destination select.  This  
is the
     //  value of the incoming immediate.
     // Bits [3:0] of the constant are the zero mask.  The DAG  
Combiner may
     //   combine either bitwise AND or insert of float 0.0 to set  
these bits.
     N2 = DAG.getIntPtrConstant(cast<ConstantSDNode>(N2)->getZExtValue 
() << 4);
     return DAG.getNode(X86ISD::INSERTPS, dl, VT, N0, N1, N2);

I think you need something like "N1 = DAG.getNode 
(ISD::SCALAR_TO_VECTOR, VT, N1);" to force it into a vector value.

This part looks wrong too:

   } else if (EVT == MVT::i32) {
     // InsertPS works with constant index.
     if (isa<ConstantSDNode>(N2))
       return Op;
   }

I don't really understand this. I think it can be combined into the  
last else clause. That is, the last else should be just
} else if (isa<ConstantSDNode>(N2)) {
...
}

You should then add a def : Pat pattern to match to the INSERTPSrr  
instruction.

Of course, the above change means the pattern for the memory form of  
the instruction is also wrong:

     def match_rm : SS4AIi8<opc, MRMSrcMem, (outs VR128:$dst),
                    (ins VR128:$src1, f32mem:$src2, i32i8imm:$src3),
                    !strconcat(OpcodeStr,
                     "\t{$src3, $src2, $dst|$dst, $src2, $src3}"),
                    [(set VR128:$dst,
                      (X86insrtps VR128:$src1, (loadf32 addr:$src2),
                                  imm:$src3))]>, OpSize;

The pattern probably should be something like:
[(set VR128:$dst, (X86insrtps VR128:$src1, (v4f32 (scalar_to_vector  
(loadf32 addr:$src2)), imm:$src3))]

We probably need to add more def : Pat patterns to match the v4i32 as  
well. But let's handle that as second part.

Thanks,

Evan

On Jul 22, 2009, at 7:32 PM, Eli Friedman wrote:

> On Wed, Jul 22, 2009 at 7:23 PM, Eric  
> Christopher<echristo at apple.com> wrote:
>> +let Constraints = "$src1 = $dst" in {
>> +  def INSERTPSrr : SS4AIi8<0x21, MRMSrcReg, (outs VR128:$dst),
>> +                          (ins VR128:$src1, VR128:$src2, i32i8imm: 
>> $src3),
>> +                          "insertps\t{$src3, $src2, $dst|$dst,  
>> $src2, $src3}",
>> +                    [(set VR128:$dst, (int_x86_sse41_insertps  
>> VR128:$src1,
>> +                                                    VR128:$src2,  
>> imm:$src3))]>;
>> +}
>> +
>
> Don't duplicate instructions; you can use patterns to map multiple
> ways of expressing the same construct.
>
> -Eli
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits