[llvm] r315307 - [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors
David Stuttard via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 10 05:45:46 PDT 2017
Author: dstuttard
Date: Tue Oct 10 05:45:45 2017
New Revision: 315307
URL: http://llvm.org/viewvc/llvm-project?rev=315307&view=rev
Log:
[DAGCombine] Fix for shuffle to vector extend for non power 2 vectors
Summary:
See https://llvm.org/PR33743 for more details
It seems that for non-power-of-2 vector sizes, the algorithm can produce
mismatched sizes for the input and the result, causing an assert.
This usually isn't a problem, as the isAnyExtend check will weed these out, but
in some cases (most often when many of the mask indices are undefined) a
non-power-of-2 vector can pass this check.
This change adds an extra check that ensures the bit sizes of the result and
the input match, as required.
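The failure mode described above can be illustrated with a small standalone
sketch. It is not the DAGCombiner code: isAnyExtendMask, extendedResultBits and
scaleDividesNumElts are hypothetical helper names, the mask check is only
modelled on the isAnyExtend lambda, and the sketch assumes the result type is
built from NumElts / Scale elements that are each Scale times the input element
width.

```cpp
#include <vector>

// Hypothetical stand-in for the isAnyExtend check. A negative mask index
// represents an undef lane, which matches anything.
bool isAnyExtendMask(const std::vector<int> &Mask, unsigned Scale) {
  for (unsigned i = 0; i != Mask.size(); ++i) {
    if (Mask[i] < 0)
      continue; // undef lane
    if (i % Scale == 0 && Mask[i] == static_cast<int>(i / Scale))
      continue; // lane i takes source element i / Scale
    return false;
  }
  return true;
}

// Assumed result size: NumElts / Scale elements (integer division), each
// EltBits * Scale bits wide.
unsigned extendedResultBits(unsigned NumElts, unsigned EltBits,
                            unsigned Scale) {
  return (NumElts / Scale) * (EltBits * Scale);
}

// The guard added by this commit: only scales that divide NumElts exactly
// can give a result with the same total bit size as the input.
bool scaleDividesNumElts(unsigned NumElts, unsigned Scale) {
  return NumElts % Scale == 0;
}
```

For a <3 x i32> shuffle with mask <undef, undef, 1>, isAnyExtendMask accepts
Scale = 2, yet extendedResultBits(3, 32, 2) is 64 while the input is 96 bits
wide. scaleDividesNumElts(3, 2) is false, so the new check skips that scale
before the mask is examined.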
Subscribers: nhaehnle
Differential Revision: https://reviews.llvm.org/D35241
Added:
llvm/trunk/test/CodeGen/AMDGPU/dagcomb-shuffle-vecextend-non2.ll
Modified:
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=315307&r1=315306&r2=315307&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Tue Oct 10 05:45:45 2017
@@ -15566,6 +15566,9 @@ static SDValue combineShuffleToVectorExt
// Attempt to match a '*_extend_vector_inreg' shuffle, we just search for
// power-of-2 extensions as they are the most likely.
for (unsigned Scale = 2; Scale < NumElts; Scale *= 2) {
+ // Check for non power of 2 vector sizes
+ if (NumElts % Scale != 0)
+ continue;
if (!isAnyExtend(Scale))
continue;
Added: llvm/trunk/test/CodeGen/AMDGPU/dagcomb-shuffle-vecextend-non2.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/dagcomb-shuffle-vecextend-non2.ll?rev=315307&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/dagcomb-shuffle-vecextend-non2.ll (added)
+++ llvm/trunk/test/CodeGen/AMDGPU/dagcomb-shuffle-vecextend-non2.ll Tue Oct 10 05:45:45 2017
@@ -0,0 +1,32 @@
+; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
+
+; We are only checking that instruction selection can succeed in this case. This
+; cut down test results in no instructions, but that's fine.
+;
+; See https://llvm.org/PR33743 for details of the bug being addressed
+;
+; Checking that shufflevector with 3-vec mask is handled in
+; combineShuffleToVectorExtend
+;
+; GCN: s_endpgm
+
+define amdgpu_ps void @main(i32 %in1) local_unnamed_addr {
+.entry:
+ br i1 undef, label %bb12, label %bb
+
+bb:
+ %__llpc_global_proxy_r5.12.vec.insert = insertelement <4 x i32> undef, i32 %in1, i32 3
+ %tmp3 = shufflevector <4 x i32> %__llpc_global_proxy_r5.12.vec.insert, <4 x i32> undef, <3 x i32> <i32 undef, i32 undef, i32 1>
+ %tmp4 = bitcast <3 x i32> %tmp3 to <3 x float>
+ %a2.i123 = extractelement <3 x float> %tmp4, i32 2
+ %tmp5 = bitcast float %a2.i123 to i32
+ %__llpc_global_proxy_r2.0.vec.insert196 = insertelement <4 x i32> undef, i32 %tmp5, i32 0
+ br label %bb12
+
+bb12:
+ %__llpc_global_proxy_r2.0 = phi <4 x i32> [ %__llpc_global_proxy_r2.0.vec.insert196, %bb ], [ undef, %.entry ]
+ %tmp6 = shufflevector <4 x i32> %__llpc_global_proxy_r2.0, <4 x i32> undef, <3 x i32> <i32 1, i32 2, i32 3>
+ %tmp7 = bitcast <3 x i32> %tmp6 to <3 x float>
+ %a0.i = extractelement <3 x float> %tmp7, i32 0
+ ret void
+}