[llvm-dev] Pass ordering - GVN vs. loop optimizations

Thu Dec 21 12:24:09 PST 2017

Hi,

This is Ariel from the Rust team again.

I am having another pass ordering issue. Looking at the pass manager at
https://github.com/llvm-mirror/llvm/blob/7034870f30320d6fbc74effff539d946018cd00a/lib/Transforms/IPO/PassManagerBuilder.cpp
(the early SimplifyCfg now doesn't sink stores anymore! I can't wait until
I can get to use that in rustc!) I find that the loop optimization group
does not run after GVN:

  // Rotate Loop - disable header duplication at -Oz
  MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));
  MPM.add(createLICMPass());                  // Hoist loop invariants
  if (EnableSimpleLoopUnswitch)
    MPM.add(createSimpleLoopUnswitchLegacyPass());
  else
    MPM.add(createLoopUnswitchPass(SizeLevel || OptLevel < 3,
DivergentTarget));
  MPM.add(createCFGSimplificationPass());
  addInstructionCombiningPass(MPM);
  MPM.add(createIndVarSimplifyPass());        // Canonicalize indvars
  // <I probably want to add some SimplifyCfg pass here, but
  // that's a separate issue>
  MPM.add(createLoopIdiomPass());             // Recognize idioms like
memset.
  addExtensionsToPM(EP_LateLoopOptimizations, MPM);
  MPM.add(createLoopDeletionPass());          // Delete dead loops
  if (EnableLoopInterchange) {
    MPM.add(createLoopInterchangePass()); // Interchange loops
    MPM.add(createCFGSimplificationPass());
  }
  if (!DisableUnrollLoops)
    MPM.add(createSimpleLoopUnrollPass());    // Unroll small loops
  addExtensionsToPM(EP_LoopOptimizerEnd, MPM);

  // <GVN is now immediately after loop optimizatons

  if (OptLevel > 1) {
    MPM.add(createMergedLoadStoreMotionPass()); // Merge ld/st in diamonds
    MPM.add(NewGVN ? createNewGVNPass()
                   : createGVNPass(DisableGVNLoadPRE)); // Remove
redundancies
  }

This causes a problem, because GVN appears to be the only pass that can
merge loads across basic blocks. This means that if a loop index only
appears behind a pointer, LLVM will not be able to optimize out bounds
checks depending on it.

The "canonical" example is this Rust function (or the similar C code):

#![crate_type="rlib"]
pub fn f(length: &usize) -> u64 {
  let mut sum = 0;
  let len_1 = *length;
  let mut i = 0;
  while i < len_1 {
    let len_2 = *length;
    assert!(i < len_2);
    i += 1;
  }
  sum
}

One would expect the assertion (which in a real example is a bounds check)
to be optimized out. However, IndVarSimplify is the optimization that is
supposed to do that, and it only runs *before* GVN, so it "sees" 2 separate
loads of the length and can't do anything.

Changing the pass ordering to put GVN before the loop optimizations
(reversing the 2 blocks in my excerpt above) fixes this example, so I'm
trying to figure out whether that destroys something important. I'll note
that rustc has another GVN pass "downstream"  (before the
DeadStoreElimination pass) which might help alleviate some worries.

I could not find any good documentation or tests for why the passes are in
the order they are - the order seems to have been preserved from clang in
2009. If you have some good testing infrastructure, it would be nice if
there was a way we could use it to check whether the pass reordering would
hurt anything important.

And in any case, I keep seeing weird optimization misses caused by pass
ordering. It would be nice if there was some better knowledge of this
situation.

Regards,
  - Ariel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171221/a0e78ab3/attachment.html>