[llvm] r297695 - In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Vedant Kumar via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 27 16:29:43 PDT 2017
Hi Nirav,
I saw a crasher on bugzilla and narrowed down the issue to this commit. Please see:
https://bugs.llvm.org/show_bug.cgi?id=32610
The reduced test case compiles with r297692 (the previous commit), but crashes once r297695 (this commit) is included. The original C file and a reduced IR file are attached to the bug report.
Could you take a look?
thanks,
vedant
> On Mar 13, 2017, at 5:34 PM, Nirav Dave via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>
> Author: niravd
> Date: Mon Mar 13 19:34:14 2017
> New Revision: 297695
>
> URL: http://llvm.org/viewvc/llvm-project?rev=297695&view=rev
> Log:
> In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
>
> Recommitting with compile time improvements
>
> Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.
>
> * Simplify Consecutive Merge Store Candidate Search
>
> Now that address aliasing is much less conservative, push through
> simplified store merging search and chain alias analysis which only
> checks for parallel stores through the chain subgraph. This is cleaner,
> as it separates non-interfering loads/stores from the
> store-merging logic.
>
> When merging stores, search up the chain through a single load, and
> find all possible stores by looking down through a load and a
> TokenFactor to all stores visited.
>
> This improves the quality of the output SelectionDAG and the output
> Codegen (save perhaps for some ARM cases where we correctly construct
> wider loads, but then promote them to float operations which appear
> but require more expensive constant generation).
>
> Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
>
> Additional Minor Changes:
>
> 1. Finishes removing unused AliasLoad code
>
> 2. Unifies the chain aggregation in the merged stores across code
> paths
>
> 3. Re-add the Store node to the worklist after calling
> SimplifyDemandedBits.
>
> 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
> arbitrary, but seems sufficient to not cause regressions in
> tests.
>
> 5. Remove Chain dependencies of Memory operations on CopyfromReg
> nodes as these are captured by data dependence
>
> 6. Forward load-store values through TokenFactors containing
> {CopyToReg,CopyFromReg} Values.
>
> 7. Peephole to convert buildvector of extract_vector_elt to
> extract_subvector if possible (see
> CodeGen/AArch64/store-merge.ll)
>
> 8. Store merging for the ARM target is restricted to 32-bit, as
> in some contexts invalid 64-bit operations are being
> generated. This can be removed once appropriate checks are
> added.
>
> This finishes the change Matt Arsenault started in r246307 and
> jyknight's original patch.
>
> Many tests required some changes as memory operations are now
> reorderable, improving load-store forwarding. One test in
> particular is worth noting:
>
> CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
> forwarding converts a load-store pair into a parallel store and
> a memory-realized bitcast of the same value. However, because we
> lose the sharing of the explicit and implicit store values we
> must create another local store. A similar transformation
> happens before SelectionDAG as well.
>
> Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
>
> Added:
> llvm/trunk/test/CodeGen/X86/pr32108.ll
> Removed:
> llvm/trunk/test/CodeGen/X86/combiner-aa-0.ll
> llvm/trunk/test/CodeGen/X86/combiner-aa-1.ll
> llvm/trunk/test/CodeGen/X86/pr18023.ll
> Modified:
> llvm/trunk/include/llvm/Target/TargetLowering.h
> llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp
> llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
> llvm/trunk/lib/Target/ARM/ARMISelLowering.h
> llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll
> llvm/trunk/test/CodeGen/AArch64/arm64-abi.ll
> llvm/trunk/test/CodeGen/AArch64/arm64-memset-inline.ll
> llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll
> llvm/trunk/test/CodeGen/AArch64/merge-store.ll
> llvm/trunk/test/CodeGen/AArch64/vector_merge_dep_check.ll
> llvm/trunk/test/CodeGen/AMDGPU/debugger-insert-nops.ll
> llvm/trunk/test/CodeGen/AMDGPU/insert_vector_elt.ll
> llvm/trunk/test/CodeGen/AMDGPU/merge-stores.ll
> llvm/trunk/test/CodeGen/AMDGPU/private-element-size.ll
> llvm/trunk/test/CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll
> llvm/trunk/test/CodeGen/ARM/2012-10-04-AAPCS-byval-align8.ll
> llvm/trunk/test/CodeGen/ARM/alloc-no-stack-realign.ll
> llvm/trunk/test/CodeGen/ARM/gpr-paired-spill.ll
> llvm/trunk/test/CodeGen/ARM/ifcvt10.ll
> llvm/trunk/test/CodeGen/ARM/illegal-bitfield-loadstore.ll
> llvm/trunk/test/CodeGen/ARM/static-addr-hoisting.ll
> llvm/trunk/test/CodeGen/BPF/undef.ll
> llvm/trunk/test/CodeGen/MSP430/Inst16mm.ll
> llvm/trunk/test/CodeGen/Mips/cconv/arguments-float.ll
> llvm/trunk/test/CodeGen/Mips/cconv/arguments-varargs.ll
> llvm/trunk/test/CodeGen/Mips/fastcc.ll
> llvm/trunk/test/CodeGen/Mips/load-store-left-right.ll
> llvm/trunk/test/CodeGen/Mips/micromips-li.ll
> llvm/trunk/test/CodeGen/Mips/mips64-f128-call.ll
> llvm/trunk/test/CodeGen/Mips/mips64-f128.ll
> llvm/trunk/test/CodeGen/Mips/mno-ldc1-sdc1.ll
> llvm/trunk/test/CodeGen/Mips/msa/f16-llvm-ir.ll
> llvm/trunk/test/CodeGen/Mips/msa/i5_ld_st.ll
> llvm/trunk/test/CodeGen/Mips/o32_cc_byval.ll
> llvm/trunk/test/CodeGen/Mips/o32_cc_vararg.ll
> llvm/trunk/test/CodeGen/PowerPC/anon_aggr.ll
> llvm/trunk/test/CodeGen/PowerPC/complex-return.ll
> llvm/trunk/test/CodeGen/PowerPC/jaggedstructs.ll
> llvm/trunk/test/CodeGen/PowerPC/ppc64-align-long-double.ll
> llvm/trunk/test/CodeGen/PowerPC/structsinmem.ll
> llvm/trunk/test/CodeGen/PowerPC/structsinregs.ll
> llvm/trunk/test/CodeGen/SystemZ/unaligned-01.ll
> llvm/trunk/test/CodeGen/Thumb/2010-07-15-debugOrdering.ll
> llvm/trunk/test/CodeGen/Thumb/stack-access.ll
> llvm/trunk/test/CodeGen/X86/2010-09-17-SideEffectsInChain.ll
> llvm/trunk/test/CodeGen/X86/2012-11-28-merge-store-alias.ll
> llvm/trunk/test/CodeGen/X86/MergeConsecutiveStores.ll
> llvm/trunk/test/CodeGen/X86/avx-vbroadcast.ll
> llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> llvm/trunk/test/CodeGen/X86/chain_order.ll
> llvm/trunk/test/CodeGen/X86/clear_upper_vector_element_bits.ll
> llvm/trunk/test/CodeGen/X86/copy-eflags.ll
> llvm/trunk/test/CodeGen/X86/dag-merge-fast-accesses.ll
> llvm/trunk/test/CodeGen/X86/dont-trunc-store-double-to-float.ll
> llvm/trunk/test/CodeGen/X86/extractelement-legalization-store-ordering.ll
> llvm/trunk/test/CodeGen/X86/i256-add.ll
> llvm/trunk/test/CodeGen/X86/i386-shrink-wrapping.ll
> llvm/trunk/test/CodeGen/X86/live-range-nosubreg.ll
> llvm/trunk/test/CodeGen/X86/longlong-deadload.ll
> llvm/trunk/test/CodeGen/X86/merge-consecutive-loads-128.ll
> llvm/trunk/test/CodeGen/X86/merge-consecutive-loads-256.ll
> llvm/trunk/test/CodeGen/X86/merge-store-partially-alias-loads.ll
> llvm/trunk/test/CodeGen/X86/split-store.ll
> llvm/trunk/test/CodeGen/X86/stores-merging.ll
> llvm/trunk/test/CodeGen/X86/vector-compare-results.ll
> llvm/trunk/test/CodeGen/X86/vector-shuffle-variable-128.ll
> llvm/trunk/test/CodeGen/X86/vector-shuffle-variable-256.ll
> llvm/trunk/test/CodeGen/X86/vectorcall.ll
> llvm/trunk/test/CodeGen/X86/win32-eh.ll
> llvm/trunk/test/CodeGen/XCore/varargs.ll