[PATCH] D20505: Codegen: Outline for chains of tail-duplicable blocks.

Kyle Butt via llvm-commits llvm-commits at lists.llvm.org
Fri May 20 18:33:07 PDT 2016


iteratee created this revision.
iteratee added a reviewer: haicheng.
iteratee added subscribers: llvm-commits, chandlerc, echristo.
iteratee set the repository for this revision to rL LLVM.
Herald added subscribers: dsanders, jyknight, jfb.

This change finds optional blocks within a function or within a loop
and outlines them later in the loop or the function. It does this only
as long as there is a tail-duplicable successor. These blocks are then
held back as long as there are additional tail-duplicable blocks.

Consider the following CFG:

    B   D   F   H
   / \ / \ / \ / \
  A---C---E---G---Ret

Here A, C, E, and G are all small (currently 2 instructions).
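
To make the shape concrete, here is a minimal sketch of one possible
reading of "optional block with a tail-duplicable successor" on a toy
CFG. It is not the code in the patch: ToyBlock, ToyCFG,
findOptionalBlocks, makeExampleCFG and SmallBlockLimit are hypothetical
names, the triangle test is an assumption about what "optional" means,
and the sizes of the body blocks and of Ret are invented.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy block; this is not LLVM's MachineBasicBlock.
struct ToyBlock {
  unsigned NumInstrs;              // instruction count (invented for the demo)
  std::vector<std::string> Succs;  // names of successor blocks
};

using ToyCFG = std::map<std::string, ToyBlock>;

// Assumed size limit, taken from "currently 2 instructions" above.
const unsigned SmallBlockLimit = 2;

// Returns every block X for which some block P branches to either X or J,
// X's only successor is J, and J is small enough to be copied into X.
std::vector<std::string> findOptionalBlocks(const ToyCFG &CFG) {
  std::vector<std::string> Result;
  for (const auto &Entry : CFG) {
    const std::vector<std::string> &Succs = Entry.second.Succs;
    if (Succs.size() != 2)
      continue;
    // Try each successor as the optional block, the other as the join.
    for (std::size_t I = 0; I < 2; ++I) {
      auto Cand = CFG.find(Succs[I]);
      auto Join = CFG.find(Succs[1 - I]);
      if (Cand == CFG.end() || Join == CFG.end())
        continue;
      const std::vector<std::string> &CandSuccs = Cand->second.Succs;
      if (CandSuccs.size() == 1 && CandSuccs[0] == Join->first &&
          Join->second.NumInstrs <= SmallBlockLimit)
        Result.push_back(Cand->first);
    }
  }
  return Result;
}

// The CFG from the diagram. The body sizes (10) and the size of Ret (5)
// are made up; only the 2-instruction test blocks come from the text.
ToyCFG makeExampleCFG() {
  ToyCFG CFG;
  CFG["A"] = {2, {"B", "C"}};
  CFG["B"] = {10, {"C"}};
  CFG["C"] = {2, {"D", "E"}};
  CFG["D"] = {10, {"E"}};
  CFG["E"] = {2, {"F", "G"}};
  CFG["F"] = {10, {"G"}};
  CFG["G"] = {2, {"H", "Ret"}};
  CFG["H"] = {10, {"Ret"}};
  CFG["Ret"] = {5, {}};
  return CFG;
}
```

With these invented sizes, findOptionalBlocks(makeExampleCFG()) reports
B, D and F. H is not reported only because Ret was given 5 instructions
here; a smaller Ret would make H qualify as well.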

The CFG-preserving layout is then A,B,C,D,E,F,G,H,Ret.
If we look for opportunities to tail-duplicate, then we can copy C into
B, E into D, and G into F, and produce the following CFG-preserving
layout: A,C,B,D,E,G,F,H.
This layout produces pairs of small tests followed by pairs of
test bodies.

This is where the bolder strategy of this patch comes in. We allow E
to be placed, even though its predecessor B (after copying C) is
unplaced, because it is part of the current chain of tail-duplications.
This then produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have
back edges, which is a negative, but it has a larger compensating
benefit: it handles long strings of skipped blocks much better than the
original layout. Both layouts handle runs of executed blocks equally
well. Branch prediction also improves if there is any correlation
between subsequent optional blocks.
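
As a rough illustration of the trade-off, the sketch below counts how
many transitions along an execution path are not fallthroughs in a
given layout. This is a crude stand-in for layout quality, not the cost
model used by MachineBlockPlacement, and countTakenBranches is a
hypothetical helper. Blocks are named by their layout position, so
after duplication B, D and F stand for the bodies with the following
test copied into them.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Counts the transitions along Path where the next executed block is not
// the block placed immediately after the current one in Layout.
// Every block named in Path must also appear in Layout.
int countTakenBranches(const std::vector<std::string> &Layout,
                       const std::vector<std::string> &Path) {
  // Map each block name to its index in the layout.
  std::unordered_map<std::string, std::size_t> Position;
  for (std::size_t I = 0; I < Layout.size(); ++I)
    Position[Layout[I]] = I;

  int Taken = 0;
  for (std::size_t I = 0; I + 1 < Path.size(); ++I)
    if (Position.at(Path[I]) + 1 != Position.at(Path[I + 1]))
      ++Taken;
  return Taken;
}
```

For the path that skips every optional block (A,C,E,G,Ret), the
original layout A,B,C,D,E,F,G,H,Ret gives 4 taken branches and the
proposed layout A,C,E,G,B,D,F,H,Ret gives 1. For the path that executes
every optional block, the original layout gives 0 (path
A,B,C,D,E,F,G,H,Ret) and the proposed layout gives 1 (path
A,B,D,F,H,Ret, since C, E and G were copied into B, D and F): the only
taken branch is the jump from A to B that enters the run.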

Benchmark results show excellent improvements on several benchmarks in
the test suite, including 17% on ackerman and 7% on sieve.
Multisource/Applications/lambda shows a slowdown, but it
appears to be DSB/MITE-related, similar to Ptrdist/ks.test. There weren't
any other obvious regressions in my testing. Over the whole test suite,
there was a 1.5% size increase due to increased tail-duplication
opportunities.
Two internal microbenchmarks at Google show 75% and 35%
improvements in protocol buffer and uncompression, respectively.

Repository:
  rL LLVM

http://reviews.llvm.org/D20505

Files:
  include/llvm/CodeGen/TailDuplicator.h
  lib/CodeGen/MachineBlockPlacement.cpp
  lib/CodeGen/TailDuplicator.cpp
  test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll
  test/CodeGen/AArch64/arm64-atomic.ll
  test/CodeGen/AArch64/arm64-ccmp.ll
  test/CodeGen/AArch64/arm64-extload-knownzero.ll
  test/CodeGen/AArch64/arm64-shrink-wrapping.ll
  test/CodeGen/AArch64/fcmp.ll
  test/CodeGen/AArch64/rm_redundant_cmp.ll
  test/CodeGen/AArch64/tbz-tbnz.ll
  test/CodeGen/ARM/2013-05-05-IfConvertBug.ll
  test/CodeGen/ARM/arm-shrink-wrapping.ll
  test/CodeGen/ARM/atomic-cmpxchg.ll
  test/CodeGen/ARM/atomic-op.ll
  test/CodeGen/ARM/atomic-ops-v8.ll
  test/CodeGen/ARM/fold-stack-adjust.ll
  test/CodeGen/ARM/machine-cse-cmp.ll
  test/CodeGen/Mips/llvm-ir/ashr.ll
  test/CodeGen/Mips/llvm-ir/lshr.ll
  test/CodeGen/Mips/llvm-ir/shl.ll
  test/CodeGen/Mips/longbranch.ll
  test/CodeGen/PowerPC/bdzlr.ll
  test/CodeGen/PowerPC/branch-opt.ll
  test/CodeGen/PowerPC/sjlj.ll
  test/CodeGen/PowerPC/tail-dup-layout.ll
  test/CodeGen/SPARC/sjlj.ll
  test/CodeGen/Thumb/thumb-shrink-wrapping.ll
  test/CodeGen/Thumb2/cbnz.ll
  test/CodeGen/Thumb2/ifcvt-compare.ll
  test/CodeGen/WebAssembly/mem-intrinsics.ll
  test/CodeGen/X86/2012-08-17-legalizer-crash.ll
  test/CodeGen/X86/atom-bypass-slow-division.ll
  test/CodeGen/X86/avx-splat.ll
  test/CodeGen/X86/avx512-cmp.ll
  test/CodeGen/X86/block-placement.ll
  test/CodeGen/X86/cmovcmov.ll
  test/CodeGen/X86/critical-edge-split-2.ll
  test/CodeGen/X86/fp-une-cmp.ll
  test/CodeGen/X86/ragreedy-bug.ll
  test/CodeGen/X86/shrink-wrap-chkstk.ll
  test/CodeGen/X86/statepoint-invoke.ll
  test/CodeGen/X86/twoaddr-coalesce-3.ll
  test/CodeGen/X86/x86-shrink-wrap-unwind.ll
  test/CodeGen/X86/x86-shrink-wrapping.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D20505.58025.patch
Type: text/x-patch
Size: 73795 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160521/f9fb5ddd/attachment-0001.bin>

