[PATCH] D17490: [InstCombine][SSE] Demanded vector elements for scalar intrinsics
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 19 06:37:59 PDT 2016
RKSimon added inline comments.
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:1374-1389
@@ -1339,1 +1373,18 @@
+ case Intrinsic::x86_sse41_round_ss:
+ case Intrinsic::x86_sse41_round_sd: {
+ // These intrinsics demand the upper elements of the first input vector and
+ // the lowest element of the second input vector.
+ bool MadeChange = false;
+ Value *Arg0 = II->getArgOperand(0);
+ Value *Arg1 = II->getArgOperand(1);
+ unsigned VWidth = Arg0->getType()->getVectorNumElements();
+ if (Value *V = SimplifyDemandedVectorEltsHigh(Arg0, VWidth, VWidth - 1)) {
+ II->setArgOperand(0, V);
+ MadeChange = true;
+ }
+ if (Value *V = SimplifyDemandedVectorEltsLow(Arg1, VWidth, 1)) {
+ II->setArgOperand(1, V);
+ MadeChange = true;
+ }
+ if (MadeChange)
----------------
For the binary scalar intrinsics we need all the elements of the first input: the lowest element is used in the operation and the remaining elements are all 'passed through' to the result.
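To make the distinction concrete, here is a rough scalar model of the two cases (a sketch only, not code from the patch). The names `V4`, `modelAddSS` and `modelRoundSS` are made up for illustration, ADDSS stands in for "a binary scalar intrinsic", and the rounding mode is assumed to be fixed to floor rather than taken from the immediate operand:

```cpp
#include <array>
#include <cmath>

// Hypothetical 4 x float vector type used only for this illustration.
using V4 = std::array<float, 4>;

// Model of a binary scalar op such as ADDSS: lane 0 is computed from lane 0
// of BOTH inputs, lanes 1..3 are passed through from the first input.
// Every element of 'a' is therefore demanded; only b[0] is demanded from 'b'.
V4 modelAddSS(const V4 &a, const V4 &b) {
  return {a[0] + b[0], a[1], a[2], a[3]};
}

// Model of ROUNDSS (rounding mode assumed to be floor): lane 0 comes from
// b[0] only, lanes 1..3 are passed through from 'a'. a[0] is never read,
// so only the upper elements of 'a' are demanded.
V4 modelRoundSS(const V4 &a, const V4 &b) {
  return {std::floor(b[0]), a[1], a[2], a[3]};
}
```

In this model, changing `a[0]` alters the result of `modelAddSS` but not of `modelRoundSS`, which is the difference between demanding all elements of the first input and demanding only its upper elements.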
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:1406-1409
@@ -1350,6 +1405,6 @@
case Intrinsic::x86_avx2_psrli_d:
case Intrinsic::x86_avx2_psrli_q:
case Intrinsic::x86_avx2_psrli_w:
case Intrinsic::x86_sse2_pslli_d:
case Intrinsic::x86_sse2_pslli_q:
case Intrinsic::x86_sse2_pslli_w:
----------------
I've included the 'MadeChange' bool so we can simplify both operands in one pass. I've also done this in the existing similar COMI/UCOMI combine above.
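The shape of that pattern, stripped of any LLVM types, is roughly as follows. This is a toy sketch: `FakeCall`, `simplifyTo` and `simplifyBothOperands` are hypothetical names, and each operand is reduced to a plain integer standing for how many elements are still demanded:

```cpp
#include <optional>

// Hypothetical stand-in for an intrinsic call with two operands.
struct FakeCall {
  int Arg0;
  int Arg1;
};

// Hypothetical simplifier: returns a replacement value if the operand can be
// reduced to Limit, or std::nullopt if nothing changes.
std::optional<int> simplifyTo(int Operand, int Limit) {
  if (Operand > Limit)
    return Limit;
  return std::nullopt;
}

// Try to simplify both operands before reporting, instead of returning as
// soon as the first operand changes.
bool simplifyBothOperands(FakeCall &C) {
  bool MadeChange = false;
  if (auto V = simplifyTo(C.Arg0, 3)) {
    C.Arg0 = *V;
    MadeChange = true;
  }
  if (auto V = simplifyTo(C.Arg1, 1)) {
    C.Arg1 = *V;
    MadeChange = true;
  }
  return MadeChange;
}
```

Returning early after the first operand would still be correct, but it would need another visit to pick up the second operand; tracking a single flag handles both in one visit.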
Repository:
rL LLVM
http://reviews.llvm.org/D17490