[PATCH] D12680: [InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 15 02:54:20 PDT 2015
RKSimon added a comment.
I'll get this updated. Thanks for looking at this; I know the SSE4A instructions aren't the most exciting to deal with!
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:934-935
@@ +933,4 @@
+ // operands and the lowest 16-bits of the second.
+ auto Op0 = II->getArgOperand(0);
+ auto Op1 = II->getArgOperand(1);
+ unsigned VWidth0 = cast<VectorType>(Op0->getType())->getNumElements();
----------------
spatel wrote:
> Use 'auto *' for pointers? Same comment for the other autos below.
No problem.
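For reference, a minimal standalone sketch of what the 'auto *' suggestion amounts to. It does not use LLVM; getOperand() is a hypothetical stand-in for a call such as II->getArgOperand(0) that returns a pointer. Both spellings deduce the same type, so the change only affects readability:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a call that returns a pointer,
// e.g. II->getArgOperand(0) in the patch. Not an LLVM API.
inline int *getOperand() {
  static int Storage = 42;
  return &Storage;
}

inline int readThroughBoth() {
  auto Op0 = getOperand();  // deduces int*, but the pointer-ness is hidden
  auto *Op1 = getOperand(); // deduces the same int*, visible at the declaration
  static_assert(std::is_same<decltype(Op0), decltype(Op1)>::value,
                "both declarations deduce the same pointer type");
  assert(Op0 == Op1 && "both names refer to the same object");
  return *Op0 + *Op1;
}
```

A side benefit of 'auto *' is that the declaration stops compiling if the initializer ever stops being a pointer.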
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:972-986
@@ +971,17 @@
+
+ case Intrinsic::x86_sse4a_insertq: {
+ // INSERTQ uses only the lowest 64-bits of the first 128-bit vector
+ // operand.
+ auto Op = II->getArgOperand(0);
+ unsigned VWidth = cast<VectorType>(Op->getType())->getNumElements();
+ assert(VWidth == 2 && "Unexpected operand size");
+
+ APInt DemandedElts = APInt::getLowBitsSet(VWidth, 1);
+ APInt UndefElts(VWidth, 0);
+ if (Value *V = SimplifyDemandedVectorElts(Op, DemandedElts, UndefElts)) {
+ II->setArgOperand(0, V);
+ return II;
+ }
+ break;
+ }
+
----------------
spatel wrote:
> Isn't this identical to the extrqi case? If yes, combine cases and eliminate the duplicated code. Could also just add a 3-4 line helper function that can be used for all of the cases?
I'll look into creating a helper (this is one of the most common patterns in this entire file). I intend to add support for decoding EXTRQI to shuffles soon, so I'd like to keep the actual cases separate.
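As a rough illustration of the shape such a helper could take, here is a self-contained toy model, not the actual patch. ToyCall, ToySimplifier, lowEltsMask and simplifyLowElts are all hypothetical names; the callback stands in for SimplifyDemandedVectorElts, and plain ints stand in for operand values:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyCall {
  std::vector<int> Args; // each int stands in for an operand Value*
};

// Stand-in for SimplifyDemandedVectorElts: given an operand and a mask of
// demanded elements, writes a replacement to Out and returns true if the
// operand could be simplified.
using ToySimplifier =
    std::function<bool(int Op, uint64_t DemandedElts, int &Out)>;

// Mask with the low NumElts bits set (NumElts must be < 64 in this toy).
inline uint64_t lowEltsMask(unsigned NumElts) {
  assert(NumElts < 64 && "toy mask only handles small vectors");
  return (uint64_t(1) << NumElts) - 1;
}

// Hypothetical helper: simplify operand OpIdx given that only its lowest
// NumDemanded elements are used. Returns true if the call was changed.
inline bool simplifyLowElts(ToyCall &Call, unsigned OpIdx,
                            unsigned NumDemanded,
                            const ToySimplifier &Simplify) {
  int Replacement = 0;
  if (!Simplify(Call.Args[OpIdx], lowEltsMask(NumDemanded), Replacement))
    return false;
  Call.Args[OpIdx] = Replacement;
  return true;
}
```

With something along these lines, each intrinsic case would reduce to one call per operand followed by "return II" when the helper reports a change, and the cases themselves could stay separate.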
Repository:
rL LLVM
http://reviews.llvm.org/D12680