[llvm] r259796 - [X86][SSE] Add general 32-bit LOAD + VZEXT_MOVL support to EltsFromConsecutiveLoads

Fri Feb 5 02:10:54 PST 2016

Hi Simon,

On 05.02.2016 01:12, Simon Pilgrim via llvm-commits wrote:
> Author: rksimon
> Date: Thu Feb  4 10:12:56 2016
> New Revision: 259796
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=259796&view=rev
> Log:
> [X86][SSE] Add general 32-bit LOAD + VZEXT_MOVL support to EltsFromConsecutiveLoads
> 
> This patch adds support for consecutive (load/undef elements) 32-bit loads, followed by trailing undef/zero elements to be combined to a single MOVD load.
> 
> Differential Revision: http://reviews.llvm.org/D16729

This change introduced an assertion failure in the Mesa llvmpipe
driver unit test lp_test_format. The CPU information, the IR, the
assertion message and the backtrace follow.

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 48
model name	: AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
stepping	: 1
microcode	: 0x6003106
cpu MHz		: 4100.000
cache size	: 2048 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 16
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall fsgsbase bmi1 xsaveopt
bugs		: fxsave_leak sysret_ss_attrs
bogomips	: 8200.55
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]

define void @fetch_r32_unorm_float(<4 x float>*, i8*, i32, i32, { [2048 x i32], [128 x i64] }*) {
entry:
  %5 = getelementptr i8, i8* %1, i32 0
  %6 = bitcast i8* %5 to i32*
  %7 = load i32, i32* %6
  %8 = insertelement <4 x i32> undef, i32 %7, i32 0
  %9 = shufflevector <4 x i32> %8, <4 x i32> undef, <4 x i32> zeroinitializer
  %10 = lshr <4 x i32> %9, <i32 0, i32 undef, i32 undef, i32 undef>
  %11 = and <4 x i32> %10, <i32 -1, i32 0, i32 0, i32 0>
  %12 = uitofp <4 x i32> %11 to <4 x float>
  %13 = fmul <4 x float> %12, <float 0x3DF0000000000000, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>
  %14 = shufflevector <4 x float> %13, <4 x float> <float 0.000000e+00, float 1.000000e+00, float undef, float undef>, <4 x i32> <i32 0, i32 4, i32 4, i32 5>
  store <4 x float> %14, <4 x float>* %0
  ret void
}
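
For readers who do not want to decode the IR by hand, here is a minimal
scalar sketch of what the function above computes, written in Python.
It is only an illustration, not Mesa or LLVM code: the helper to_f32
and the little-endian load are assumptions made for this sketch, and
single-precision rounding is emulated through the struct module.

```python
import struct

# The fmul constant 0x3DF0000000000000 is the double 2**-32.
SCALE = 2.0 ** -32


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fetch_r32_unorm_float(src: bytes) -> tuple:
    """Scalar model of the IR: one 32-bit unorm channel expanded to 4 floats."""
    # Assumption: little-endian 32-bit load, as on x86.
    (raw,) = struct.unpack_from("<I", src, 0)
    # In lane 0, lshr by 0 and "and" with -1 leave the value unchanged.
    # uitofp rounds to single precision, then the fmul scales by 2**-32.
    r = to_f32(to_f32(float(raw)) * SCALE)
    # The final shufflevector mask <0, 4, 4, 5> picks lane 0 of the product
    # and then 0.0, 0.0, 1.0 from the constant vector.
    return (r, 0.0, 0.0, 1.0)
```

For example, the input bytes 00 00 00 80 (0x80000000) give
(0.5, 0.0, 0.0, 1.0). Only lane 0 of the loaded vector reaches the
store, which is why a single 32-bit load is enough here.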

lp_test_format: ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5776: llvm::SDNode* llvm::SelectionDAG::UpdateNodeOperands(llvm::SDNode*, llvm::SDValue, llvm::SDValue): Assertion `N->getNumOperands() == 2 && "Update with wrong number of operands"' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff45a7507 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
55	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff45a7507 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007ffff45a88da in __GI_abort () at abort.c:89
#2  0x00007ffff45a059d in __assert_fail_base (fmt=0x7ffff46dd6b8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7ffff6c99f70 "N->getNumOperands() == 2 && \"Update with wrong number of operands\"", 
    file=file@entry=0x7ffff6c97f40 "../lib/CodeGen/SelectionDAG/SelectionDAG.cpp", line=line@entry=5776, 
    function=function@entry=0x7ffff6ca3bc0 <llvm::SelectionDAG::UpdateNodeOperands(llvm::SDNode*, llvm::SDValue, llvm::SDValue)::__PRETTY_FUNCTION__> "llvm::SDNode* llvm::SelectionDAG::UpdateNodeOperands(llvm::SDNode*, llvm::SDValue, llvm::SDValue)")
    at assert.c:92
#3  0x00007ffff45a0652 in __GI___assert_fail (assertion=assertion@entry=0x7ffff6c99f70 "N->getNumOperands() == 2 && \"Update with wrong number of operands\"", file=file@entry=0x7ffff6c97f40 "../lib/CodeGen/SelectionDAG/SelectionDAG.cpp", line=line@entry=5776, 
    function=function@entry=0x7ffff6ca3bc0 <llvm::SelectionDAG::UpdateNodeOperands(llvm::SDNode*, llvm::SDValue, llvm::SDValue)::__PRETTY_FUNCTION__> "llvm::SDNode* llvm::SelectionDAG::UpdateNodeOperands(llvm::SDNode*, llvm::SDValue, llvm::SDValue)")
    at assert.c:101
#4  0x00007ffff5f87451 in llvm::SelectionDAG::UpdateNodeOperands (this=<optimized out>, N=N@entry=0x8af230, Op1=..., Op2=...) at ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5776
#5  0x00007ffff679f276 in <lambda(llvm::EVT, llvm::LoadSDNode*)>::operator()(llvm::EVT, llvm::LoadSDNode *) const (__closure=__closure@entry=0x7fffffffc020, VT=..., LDBase=LDBase@entry=0x8af230) at ../lib/Target/X86/X86ISelLowering.cpp:5616
#6  0x00007ffff67be902 in EltsFromConsecutiveLoads (VT=..., Elts=..., DL=..., DAG=..., isAfterLegalize=true) at ../lib/Target/X86/X86ISelLowering.cpp:5679
#7  0x00007ffff67fc454 in PerformShuffleCombine (N=N@entry=0x861370, DAG=..., DCI=..., Subtarget=...) at ../lib/Target/X86/X86ISelLowering.cpp:24321
#8  0x00007ffff681292d in llvm::X86TargetLowering::PerformDAGCombine (this=<optimized out>, N=0x861370, DCI=...) at ../lib/Target/X86/X86ISelLowering.cpp:28481
#9  0x00007ffff5e70a48 in (anonymous namespace)::DAGCombiner::combine (this=this@entry=0x7fffffffc950, N=N@entry=0x861370) at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1461
#10 0x00007ffff5e72a7e in (anonymous namespace)::DAGCombiner::Run (AtLevel=llvm::AfterLegalizeVectorOps, this=0x7fffffffc950) at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1301
#11 llvm::SelectionDAG::Combine (this=<optimized out>, Level=Level@entry=llvm::AfterLegalizeVectorOps, AA=..., OptLevel=<optimized out>) at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14801
#12 0x00007ffff5fa9555 in llvm::SelectionDAGISel::CodeGenAndEmitDAG (this=this@entry=0x8b4f90) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:801
#13 0x00007ffff5fa9a85 in llvm::SelectionDAGISel::SelectBasicBlock (this=this@entry=0x8b4f90, Begin=..., Begin@entry=..., End=..., End@entry=..., HadTailCall=@0x7fffffffcec8: false) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:669
#14 0x00007ffff5fb246b in llvm::SelectionDAGISel::SelectAllBasicBlocks (this=this@entry=0x8b4f90, Fn=...) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1361
#15 0x00007ffff5fb3def in llvm::SelectionDAGISel::runOnMachineFunction (this=0x8b4f90, mf=...) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:503
#16 0x00007ffff679a674 in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction (this=<optimized out>, MF=...) at ../lib/Target/X86/X86ISelDAGToDAG.cpp:171
#17 0x00007ffff5b43869 in llvm::FPPassManager::runOnFunction (this=0x88b9a0, F=...) at ../lib/IR/LegacyPassManager.cpp:1550
#18 0x00007ffff5b43c1b in llvm::FPPassManager::runOnModule (this=0x88b9a0, M=...) at ../lib/IR/LegacyPassManager.cpp:1571
#19 0x00007ffff5b434a4 in (anonymous namespace)::MPPassManager::runOnModule (M=..., this=<optimized out>) at ../lib/IR/LegacyPassManager.cpp:1627
#20 llvm::legacy::PassManagerImpl::run (this=0x81d7e0, M=...) at ../lib/IR/LegacyPassManager.cpp:1730
#21 0x00007ffff5b4367e in llvm::legacy::PassManager::run (this=this@entry=0x7fffffffd230, M=...) at ../lib/IR/LegacyPassManager.cpp:1761
#22 0x00007ffff66f8958 in llvm::MCJIT::emitObject (this=this@entry=0x86e550, M=M@entry=0x8532d0) at ../lib/ExecutionEngine/MCJIT/MCJIT.cpp:160
#23 0x00007ffff66f90a7 in llvm::MCJIT::generateCodeForModule (this=0x86e550, M=0x8532d0) at ../lib/ExecutionEngine/MCJIT/MCJIT.cpp:203
#24 0x00007ffff66f56c0 in llvm::MCJIT::finalizeObject (this=0x86e550) at ../lib/ExecutionEngine/MCJIT/MCJIT.cpp:251
#25 0x00007ffff66d6a16 in LLVMGetPointerToGlobal (EE=0x86e550, Global=0x837798) at ../lib/ExecutionEngine/ExecutionEngineBindings.cpp:295
#26 0x00000000004e0089 in gallivm_jit_function (gallivm=gallivm@entry=0x80d730, func=func@entry=0x837798) at ../../../../src/gallium/auxiliary/gallivm/lp_bld_init.c:640
#27 0x0000000000405ebe in test_format_float (verbose=0, desc=0x79b480 <util_format_r32_unorm_description>, fp=0x0) at ../../../../../src/gallium/drivers/llvmpipe/lp_test_format.c:157
#28 test_one (verbose=0, format_desc=0x79b480 <util_format_r32_unorm_description>, fp=0x0) at ../../../../../src/gallium/drivers/llvmpipe/lp_test_format.c:336
#29 test_all (verbose=verbose@entry=0, fp=fp@entry=0x0) at ../../../../../src/gallium/drivers/llvmpipe/lp_test_format.c:397
#30 0x0000000000406c72 in test_some (verbose=verbose@entry=0, fp=fp@entry=0x0, n=n@entry=1000) at ../../../../../src/gallium/drivers/llvmpipe/lp_test_format.c:413
#31 0x00000000004059f3 in main (argc=1, argv=0x7fffffffe708) at ../../../../../src/gallium/drivers/llvmpipe/lp_test_main.c:410

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer