[llvm] 7c9a89f - [X86] Teach combineVectorShiftImm to constant fold undef elements to 0 not undef.
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 5 11:30:03 PDT 2020
Author: Craig Topper
Date: 2020-06-05T11:29:55-07:00
New Revision: 7c9a89fed8f5d53d61fe3a61a2581a7c28b1b6d2
URL: https://github.com/llvm/llvm-project/commit/7c9a89fed8f5d53d61fe3a61a2581a7c28b1b6d2
DIFF: https://github.com/llvm/llvm-project/commit/7c9a89fed8f5d53d61fe3a61a2581a7c28b1b6d2.diff
LOG: [X86] Teach combineVectorShiftImm to constant fold undef elements to 0 not undef.
Shifts are supposed to always shift in zeros or sign bits regardless of their inputs. It's possible the input value may have been replaced with undef by SimplifyDemandedBits, but the shifted-in zeros are still demanded.
This issue was reported to me by ispc, who hit it with LLVM 10.0. Unfortunately, their failing test does not fail on trunk. This seems to be because the shl is now optimized out earlier and never becomes VSHLI.
ispc bug https://github.com/ispc/ispc/issues/1771
Differential Revision: https://reviews.llvm.org/D81212
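
For readers without the LLVM tree at hand, the folding rule described above can be sketched in standalone C++. This is only an illustration under stated assumptions, not the code from the patch: plain 32-bit lanes stand in for APInt, a std::vector<bool> stands in for the UndefElts mask, and the names ShiftKind and foldShiftImm are hypothetical. It also assumes the shift amount is in range (less than 32).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for X86ISD::VSHLI, X86ISD::VSRAI and X86ISD::VSRLI.
enum class ShiftKind { Shl, Sra, Srl };

// Constant folds a shift-by-immediate over 32-bit lanes.
// undef[i] == true marks a lane whose input was replaced by undef.
// Assumption: amount < 32, so every shift below is well defined.
std::vector<uint32_t> foldShiftImm(const std::vector<uint32_t> &lanes,
                                   std::vector<bool> &undef,
                                   ShiftKind kind, unsigned amount) {
  assert(lanes.size() == undef.size() && "lane/undef size mismatch");
  assert(amount < 32 && "sketch only handles in-range shift amounts");

  std::vector<uint32_t> out(lanes.size());
  for (std::size_t i = 0; i != lanes.size(); ++i) {
    const uint32_t v = lanes[i];
    if (undef[i]) {
      // Fold to 0 rather than propagating undef: users may still rely on
      // the bits that the shift is supposed to fill in.
      out[i] = 0;
    } else if (kind == ShiftKind::Shl) {
      out[i] = v << amount;
    } else if (kind == ShiftKind::Sra) {
      // Portable arithmetic shift: logical shift, then fill the vacated
      // high bits with ones when the sign bit was set.
      uint32_t shifted = v >> amount;
      if ((v & 0x80000000u) != 0 && amount != 0)
        shifted |= ~(0xFFFFFFFFu >> amount);
      out[i] = shifted;
    } else {
      out[i] = v >> amount;
    }
  }

  // Every lane now holds a concrete value, so no lane is undef any more.
  undef.assign(undef.size(), false);
  return out;
}
```

In this sketch an undef lane comes out as 0 and its undef flag is cleared, which mirrors the test change in this commit where <u,u,3,0> becomes [0,0,3,0].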
Added:
Modified:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vec_shift5.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 0157b7db8a3f..306b90671d51 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -41439,14 +41439,22 @@ static SDValue combineVectorShiftImm(SDNode *N, SelectionDAG &DAG,
getTargetConstantBitsFromNode(N0, NumBitsPerElt, UndefElts, EltBits)) {
assert(EltBits.size() == VT.getVectorNumElements() &&
"Unexpected shift value type");
- for (APInt &Elt : EltBits) {
- if (X86ISD::VSHLI == Opcode)
+ // Undef elements need to fold to 0. It's possible SimplifyDemandedBits
+ // created an undef input due to no input bits being demanded, but user
+ // still expects 0 in other bits.
+ for (unsigned i = 0, e = EltBits.size(); i != e; ++i) {
+ APInt &Elt = EltBits[i];
+ if (UndefElts[i])
+ Elt = 0;
+ else if (X86ISD::VSHLI == Opcode)
Elt <<= ShiftVal;
else if (X86ISD::VSRAI == Opcode)
Elt.ashrInPlace(ShiftVal);
else
Elt.lshrInPlace(ShiftVal);
}
+ // Reset undef elements since they were zeroed above.
+ UndefElts = 0;
return getConstVector(EltBits, UndefElts, VT.getSimpleVT(), DAG, SDLoc(N));
}
diff --git a/llvm/test/CodeGen/X86/vec_shift5.ll b/llvm/test/CodeGen/X86/vec_shift5.ll
index 873de4b08349..5c84d7c748f0 100644
--- a/llvm/test/CodeGen/X86/vec_shift5.ll
+++ b/llvm/test/CodeGen/X86/vec_shift5.ll
@@ -149,7 +149,7 @@ define <4 x i32> @test10() {
define <2 x i64> @test11() {
; X32-LABEL: test11:
; X32: # %bb.0:
-; X32-NEXT: movaps {{.*#+}} xmm0 = <u,u,3,0>
+; X32-NEXT: movaps {{.*#+}} xmm0 = [0,0,3,0]
; X32-NEXT: retl
;
; X64-LABEL: test11:
@@ -219,7 +219,7 @@ define <4 x i32> @test15() {
define <2 x i64> @test16() {
; X32-LABEL: test16:
; X32: # %bb.0:
-; X32-NEXT: movaps {{.*#+}} xmm0 = <u,u,248,0>
+; X32-NEXT: movaps {{.*#+}} xmm0 = [0,0,248,0]
; X32-NEXT: retl
;
; X64-LABEL: test16: