[PATCH] D32620: [DAGCombiner] shrink/widen a vselect to match its condition operand size (PR14657)
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 28 09:38:22 PDT 2017
spatel added inline comments.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6945
+ // FP_ROUND (fptrunc) has an extra operand.
+ SDValue CastOp1 = DAG.getNode(CastOpcode, DL, VT, SelOp1, N->getOperand(1));
+ SDValue CastOp2 = DAG.getNode(CastOpcode, DL, VT, SelOp2, N->getOperand(1));
----------------
nadav wrote:
> I think that we should check that a and b have one use.
I could not find a reason for this. Apologies for moving the goal posts, but I renamed the variables in the updated version of the patch in an attempt to make the code clearer.
Your reference to 'a and b' was to the setcc operands. These are not in play with this transform. Ie, the setcc remains as-is regardless of what happens to the vselect.
'c and d' were the vselect operands (these are now 'A and B'), but I don't see a benefit to checking the number of uses on those either.
Here are a couple of hacked up tests to show these cases. I can add these to the patch if that would help.
define <4 x double> @fpext2(<4 x double> %x, <4 x double> %y, <4 x float> %a, <4 x float> %b) {
%cmp = fcmp olt <4 x double> %x, %y
%sel = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
%ext = fpext <4 x float> %sel to <4 x double>
%add = fadd <4 x double> %x, %y <--- extra uses of fcmp operands
%div = fdiv <4 x double> %ext, %add
ret <4 x double> %div
}
With this patch (no one-use check on the compare operands):
vcmpltpd %ymm1, %ymm0, %ymm4
vcvtps2pd %xmm2, %ymm2
vcvtps2pd %xmm3, %ymm3
vblendvpd %ymm4, %ymm2, %ymm3, %ymm2
vaddpd %ymm1, %ymm0, %ymm0
vdivpd %ymm0, %ymm2, %ymm0
If we bail out on hasOneUse():
vcmpltpd %ymm1, %ymm0, %ymm4
vextractf128 $1, %ymm4, %xmm5
vpacksswb %xmm5, %xmm4, %xmm4
vblendvps %xmm4, %xmm2, %xmm3, %xmm2
vcvtps2pd %xmm2, %ymm2
vaddpd %ymm1, %ymm0, %ymm0
vdivpd %ymm0, %ymm2, %ymm0
And the case where the select ops have >1 use:
define <4 x double> @fpext3(<4 x double> %x, <4 x double> %y, <4 x float> %a, <4 x float> %b) {
%cmp = fcmp olt <4 x double> %x, %y
%sel = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
%ext = fpext <4 x float> %sel to <4 x double>
%add = fadd <4 x float> %a, %b <--- extra uses of select ops
%ext2 = fpext <4 x float> %add to <4 x double>
%div = fdiv <4 x double> %ext, %ext2
ret <4 x double> %div
}
Transformed:
vcmpltpd %ymm1, %ymm0, %ymm0
vcvtps2pd %xmm2, %ymm1
vcvtps2pd %xmm3, %ymm4
vblendvpd %ymm0, %ymm1, %ymm4, %ymm0
vaddps %xmm3, %xmm2, %xmm1
vcvtps2pd %xmm1, %ymm1
vdivpd %ymm1, %ymm0, %ymm0
Or bail out:
vcmpltpd %ymm1, %ymm0, %ymm0
vextractf128 $1, %ymm0, %xmm1
vpacksswb %xmm1, %xmm0, %xmm0
vblendvps %xmm0, %xmm2, %xmm3, %xmm0
vcvtps2pd %xmm0, %ymm0
vaddps %xmm3, %xmm2, %xmm1
vcvtps2pd %xmm1, %ymm1
vdivpd %ymm1, %ymm0, %ymm0
https://reviews.llvm.org/D32620
More information about the llvm-commits
mailing list