[llvm] 642eed3 - [x86] fix miscompile in buildvector v16i8 lowering
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 7 10:02:40 PDT 2020
Author: Sanjay Patel
Date: 2020-07-07T13:02:31-04:00
New Revision: 642eed37134db4aca953704d1e4ae856af675f51
URL: https://github.com/llvm/llvm-project/commit/642eed37134db4aca953704d1e4ae856af675f51
DIFF: https://github.com/llvm/llvm-project/commit/642eed37134db4aca953704d1e4ae856af675f51.diff
LOG: [x86] fix miscompile in buildvector v16i8 lowering
In the test based on PR46586:
https://bugs.llvm.org/show_bug.cgi?id=46586
...we are inserting 16 bits into the high element of the vector, shuffling it
to element 0, and extracting 32 bits. But xmm1 was never initialized, so the
top 16 bits of the extract are undef without this patch.
(It seems like we could do better than this by recognizing that we only demand
a subsection of the build vector, but I want to make sure we fix the
miscompile first.)
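As an illustration only, the SSE2 sequence from the test can be modelled in
plain C++ to show where the stale bits come from. This is a sketch, not LLVM
code: the names Xmm, pinsrw, pshufd, movd and lowerSequence are hypothetical
helpers that mimic the instructions of the same name on eight 16-bit lanes,
and the zeroFirst flag stands in for the pxor that this patch adds.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit XMM register as eight 16-bit lanes
// (lane 0 is the lowest). Not part of LLVM.
using Xmm = std::array<uint16_t, 8>;

// Models pinsrw: overwrite one 16-bit lane, leave the others untouched.
Xmm pinsrw(Xmm v, uint16_t x, int lane) {
  v[lane] = x;
  return v;
}

// Models pshufd: result dword d is taken from source dword idx[d].
Xmm pshufd(const Xmm &v, std::array<int, 4> idx) {
  Xmm r{};
  for (int d = 0; d < 4; ++d) {
    r[2 * d] = v[2 * idx[d]];
    r[2 * d + 1] = v[2 * idx[d] + 1];
  }
  return r;
}

// Models movd: extract the low 32 bits (lanes 0 and 1).
uint32_t movd(const Xmm &v) {
  return uint32_t(v[0]) | (uint32_t(v[1]) << 16);
}

// Models the sequence in the PR46586 test. 'xmm1' is whatever the register
// held on entry; 'zeroFirst' selects whether it is cleared first (pxor).
uint32_t lowerSequence(Xmm xmm1, uint8_t byte, bool zeroFirst) {
  if (zeroFirst)
    xmm1 = Xmm{};                    // pxor %xmm1, %xmm1
  xmm1 = pinsrw(xmm1, byte, 6);      // pinsrw $6, %eax, %xmm1
  xmm1 = pshufd(xmm1, {3, 1, 2, 3}); // xmm1 = xmm1[3,1,2,3]
  return movd(xmm1);                 // movd %xmm1, %eax
}
```

In this model, lane 7 is never written by pinsrw, so without the zeroing step
it flows through the shuffle into the upper half of the extracted value.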
This path is only used for pre-SSE4.1, and simpler patterns get squashed
somewhere along the way, so the test still includes a 'urem' as it did in the
original test from the bug report.
Differential Revision: https://reviews.llvm.org/D83319
Added:
Modified:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/buildvec-insertvec.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 4821dd44e01f..575f358361b1 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -8002,10 +8002,11 @@ static SDValue LowerBuildVectorv16i8(SDValue Op, unsigned NonZeros,
         Elt = NextElt;
       }
 
-      // If our first insertion is not the first index then insert into zero
-      // vector to break any register dependency else use SCALAR_TO_VECTOR.
+      // If our first insertion is not the first index or zeros are needed, then
+      // insert into zero vector. Otherwise, use SCALAR_TO_VECTOR (leaves high
+      // elements undefined).
       if (!V) {
-        if (i != 0)
+        if (i != 0 || NumZero)
           V = getZeroVector(MVT::v8i16, Subtarget, DAG, dl);
         else {
           V = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v4i32, Elt);
diff --git a/llvm/test/CodeGen/X86/buildvec-insertvec.ll b/llvm/test/CodeGen/X86/buildvec-insertvec.ll
index 3922450b0f21..e428ae8d5919 100644
--- a/llvm/test/CodeGen/X86/buildvec-insertvec.ll
+++ b/llvm/test/CodeGen/X86/buildvec-insertvec.ll
@@ -784,12 +784,13 @@ define <4 x i32> @ossfuzz5688(i32 %a0) {
   ret <4 x i32> %5
 }
 
-; FIXME: If we do not define all bytes that are extracted, this is a miscompile.
+; If we do not define all bytes that are extracted, this is a miscompile.
 define i32 @PR46586(i8* %p, <4 x i32> %v) {
 ; SSE2-LABEL: PR46586:
 ; SSE2:       # %bb.0:
 ; SSE2-NEXT:    movzbl 3(%rdi), %eax
+; SSE2-NEXT:    pxor %xmm1, %xmm1
 ; SSE2-NEXT:    pinsrw $6, %eax, %xmm1
 ; SSE2-NEXT:    pshufd {{.*#+}} xmm1 = xmm1[3,1,2,3]
 ; SSE2-NEXT:    movd %xmm1, %eax