[all-commits] [llvm/llvm-project] 6efcc2: [Test] Add negative tests where usub optimization ...

max-azul via All-commits all-commits at lists.llvm.org
Wed Feb 10 21:00:17 PST 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 6efcc2fd3f138160a710f3c152ee1c54c2e50420
      https://github.com/llvm/llvm-project/commit/6efcc2fd3f138160a710f3c152ee1c54c2e50420
  Author: Max Kazantsev <mkazantsev at azul.com>
  Date:   2021-02-11 (Thu, 11 Feb 2021)

  Changed paths:
    M llvm/test/CodeGen/X86/usub_inc_iv.ll

  Log Message:
  -----------
  [Test] Add negative tests where usub optimization should not apply


  Commit: 3d15b7e7dfc3e2cefc47791d1e8d95909e937842
      https://github.com/llvm/llvm-project/commit/3d15b7e7dfc3e2cefc47791d1e8d95909e937842
  Author: Max Kazantsev <mkazantsev at azul.com>
  Date:   2021-02-11 (Thu, 11 Feb 2021)

  Changed paths:
    M llvm/lib/CodeGen/CodeGenPrepare.cpp
    M llvm/test/CodeGen/X86/2020_12_02_decrementing_loop.ll
    M llvm/test/CodeGen/X86/lsr-loop-exit-cond.ll
    M llvm/test/CodeGen/X86/usub_inc_iv.ll

  Log Message:
  -----------
  [Codegenprepare][X86] Use usub with overflow opt for IV increment

The function `replaceMathCmpWithIntrinsic` artificially limits the scope
of the optimization by requiring that the two instructions be in
the same block, for two reasons:
- using the dominator tree (DT) for a more general check is costly in terms of compile time;
- there is a risk of creating a new value that lives across multiple blocks.

Because of this, two semantically equivalent tests may or may not be
subject to this optimization, depending on where the binary operation is located.
See `test/CodeGen/X86/usub_inc_iv.ll` for motivation.
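
For readers unfamiliar with the math-plus-compare pattern, here is a minimal
source-level sketch of the equivalence involved. It is not taken from the patch
or from the test file. The function names are hypothetical, and
`__builtin_sub_overflow` (a GCC/Clang builtin) stands in for the
unsigned-subtract-with-overflow intrinsic that the commit title refers to.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Form 1: the subtraction and the comparison are two separate operations.
std::pair<std::uint32_t, bool> sub_then_cmp(std::uint32_t x, std::uint32_t y) {
    std::uint32_t diff = x - y;  // unsigned arithmetic wraps modulo 2^32
    bool borrow = x < y;         // true exactly when the subtraction wraps
    return {diff, borrow};
}

// Form 2: one operation yields both the difference and the overflow flag.
std::pair<std::uint32_t, bool> sub_with_overflow(std::uint32_t x, std::uint32_t y) {
    std::uint32_t diff;
    bool borrow = __builtin_sub_overflow(x, y, &diff);
    return {diff, borrow};
}

// Hypothetical helper: checks that both forms agree on a few sample inputs.
void self_check() {
    assert(sub_then_cmp(5, 3) == sub_with_overflow(5, 3));
    assert(sub_then_cmp(3, 5) == sub_with_overflow(3, 5));
    assert(sub_then_cmp(0, 0) == sub_with_overflow(0, 0));
}
```

Both forms compute the same pair of results. The commit message describes
whether the first form is rewritten into the second as depending on where the
subtraction and the comparison are placed.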

There is one important special case where this limitation is too strict:
when the binary operation is the increment of the induction variable (IV).
As a result, the application of this optimization becomes fragile and highly dependent on
where other passes decide to place the IV increment. In most cases, they place
it at the end of the latch block, killing the optimization opportunity (when in fact it
does not matter where the actual instruction is inserted).

This patch handles this special case separately:
- the detector does not use the dominator tree and has constant cost;
- the value of IV or IV.next is live throughout the whole loop in any case, so this should not
  create a new, unexpectedly long-lived value.
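
As an illustration only, the induction-variable case can be pictured at the
source level with a counting-down loop. The functions below are hypothetical
and are not taken from the patch. The sketch assumes the exit check is an
equality test of the IV against zero and the step is a decrement by one, with
`__builtin_sub_overflow` (a GCC/Clang builtin) again standing in for the
unsigned-subtract-with-overflow intrinsic.

```cpp
#include <cstdint>
#include <vector>

// Separate form: the exit check is at the top of the loop, while the
// decrement of the induction variable sits at the very end of the body.
std::uint64_t sum_separate(const std::vector<std::uint32_t>& data) {
    std::uint64_t sum = 0;
    std::uint32_t iv = static_cast<std::uint32_t>(data.size());
    while (true) {
        if (iv == 0) break;   // compare
        sum += data[iv - 1];
        iv = iv - 1;          // math, placed far away from the compare
    }
    return sum;
}

// Fused form: one subtract-with-overflow yields both the next IV value
// and the exit condition (a borrow occurs exactly when iv == 0).
std::uint64_t sum_fused(const std::vector<std::uint32_t>& data) {
    std::uint64_t sum = 0;
    std::uint32_t iv = static_cast<std::uint32_t>(data.size());
    while (true) {
        std::uint32_t next;
        if (__builtin_sub_overflow(iv, 1u, &next)) break;
        sum += data[next];
        iv = next;
    }
    return sum;
}
```

In both loops, either `iv` or its decremented value is needed on every
iteration, which mirrors the observation above that fusing the two operations
should not introduce a new long-lived value.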

As a result, the transform becomes more robust. It also seems to lead to
better code generation in some cases (see `test/CodeGen/X86/lsr-loop-exit-cond.ll`).

Differential Revision: https://reviews.llvm.org/D96119
Reviewed By: spatel, reames


Compare: https://github.com/llvm/llvm-project/compare/984cfdc6ee8b...3d15b7e7dfc3

