[llvm-commits] [llvm] r76843 - in /llvm/trunk: lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/vec_insertps-1.ll
Evan Cheng
evan.cheng at apple.com
Wed Jul 22 23:07:29 PDT 2009
Right. Also, please stick with the usual naming convention
+ def match_rr : SS4AIi8<opc, MRMSrcReg, (outs VR128:$dst),
This will add a INSERTPSmatch_rr instruction, which is rather strange.
In this case, your INSERTPSrr is correct, the previous one, which
specifies a FR32 as source 2, is wrong. The register form should take
a 128-bit vector value. Please keep your INSERTPSrr instruction and
pattern.
Next step is to fix X86TargetLowering::LowerINSERT_VECTOR_ELT_SSE4 in
X86ISelLowering.cpp
This part is wrong since INSERTPSrr expects a vector source operand.
} else if (EVT == MVT::f32 && isa<ConstantSDNode>(N2)) {
// Bits [7:6] of the constant are the source select. This will
always be
// zero here. The DAG Combiner may combine an extract_elt index
into these
// bits. For example (insert (extract, 3), 2) could be matched
by putting
// the '3' into bits [7:6] of X86ISD::INSERTPS.
// Bits [5:4] of the constant are the destination select. This
is the
// value of the incoming immediate.
// Bits [3:0] of the constant are the zero mask. The DAG
Combiner may
// combine either bitwise AND or insert of float 0.0 to set
these bits.
N2 = DAG.getIntPtrConstant(cast<ConstantSDNode>(N2)->getZExtValue
() << 4);
return DAG.getNode(X86ISD::INSERTPS, dl, VT, N0, N1, N2);
I think you need something like "N1 = DAG.getNode
(ISD::SCALAR_TO_VECTOR, VT, N1);" to force it into a vector value.
This part looks wrong too:
} else if (EVT == MVT::i32) {
// InsertPS works with constant index.
if (isa<ConstantSDNode>(N2))
return Op;
}
I don't really understand this. I think it can be combined into the
last else clause. That is, the last else should be just
} else if (isa<ConstantSDNode>(N2)) {
...
}
You should then add a def : Pat pattern to match to the INSERTPSrr
instruction.
Of course, the above change means the pattern for the memory form of
the instruction is also wrong:
def match_rm : SS4AIi8<opc, MRMSrcMem, (outs VR128:$dst),
(ins VR128:$src1, f32mem:$src2, i32i8imm:$src3),
!strconcat(OpcodeStr,
"\t{$src3, $src2, $dst|$dst, $src2, $src3}"),
[(set VR128:$dst,
(X86insrtps VR128:$src1, (loadf32 addr:$src2),
imm:$src3))]>, OpSize;
The pattern probably should be something like:
[(set VR128:$dst, (X86insrtps VR128:$src1, (v4f32 (scalar_to_vector
(loadf32 addr:$src2)), imm:$src3))]
We probably need to add more def : Pat patterns to match the v4i32 as
well. But let's handle that as second part.
Thanks,
Evan
On Jul 22, 2009, at 7:32 PM, Eli Friedman wrote:
> On Wed, Jul 22, 2009 at 7:23 PM, Eric
> Christopher<echristo at apple.com> wrote:
>> +let Constraints = "$src1 = $dst" in {
>> + def INSERTPSrr : SS4AIi8<0x21, MRMSrcReg, (outs VR128:$dst),
>> + (ins VR128:$src1, VR128:$src2, i32i8imm:
>> $src3),
>> + "insertps\t{$src3, $src2, $dst|$dst,
>> $src2, $src3}",
>> + [(set VR128:$dst, (int_x86_sse41_insertps
>> VR128:$src1,
>> + VR128:$src2,
>> imm:$src3))]>;
>> +}
>> +
>
> Don't duplicate instructions; you can use patterns to map multiple
> ways of expressing the same construct.
>
> -Eli
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list