[all-commits] [llvm/llvm-project] afdedd: [AArch64] Try to re-use extended operand for SETCC...

Thu Jul 7 16:51:22 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: afdedd405e493dc80bd5ceb9133b9d3a8bc69f2c
      https://github.com/llvm/llvm-project/commit/afdedd405e493dc80bd5ceb9133b9d3a8bc69f2c
  Author: Florian Hahn <flo at fhahn.com>
  Date:   2022-07-07 (Thu, 07 Jul 2022)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/test/CodeGen/AArch64/vselect-ext.ll

  Log Message:
  -----------
  [AArch64] Try to re-use extended operand for SETCC with vector ops.

Try to re-use an already extended operand for a SETCC with vector operands
feeding an extended select. Doing so avoids requiring another full
extension of the SETCC result when lowering the select.

This improves lowering for certain extend/cmp/select patterns operating on
vectors. For example, with v16i8 this replaces the 6 instructions for the
extra extension with 4 separate selects.
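
As a rough illustration of the idea, the two shapes can be modeled lane by
lane in scalar C. This is only a sketch under assumptions: it is not the DAG
combine itself and says nothing about the instructions actually emitted. The
names select_narrow_cmp, select_wide_cmp, lanes_agree and self_check are
hypothetical, and the i8-to-i32 widening, the unsigned less-than predicate
and the 256 - v false operand are picked only to mirror the loop example in
this message.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16

/* Shape before the change (scalar model): compare in the narrow i8 type,
   then widen the all-ones/all-zeros lane mask to the select's i32 width. */
static void select_narrow_cmp(const uint8_t *x, uint8_t c,
                              const uint32_t *other, uint32_t *out) {
  for (int i = 0; i < LANES; i++) {
    uint8_t mask8 = (x[i] < c) ? 0xFF : 0x00;  /* narrow compare */
    uint32_t mask32 = mask8 ? UINT32_MAX : 0;  /* extra extension of the mask */
    uint32_t wide = x[i];                      /* zext(x), needed by the select */
    out[i] = (wide & mask32) | (other[i] & ~mask32);
  }
}

/* Shape after the change (scalar model): compare the already extended
   operand, so the mask has the select's width from the start. */
static void select_wide_cmp(const uint8_t *x, uint8_t c,
                            const uint32_t *other, uint32_t *out) {
  for (int i = 0; i < LANES; i++) {
    uint32_t wide = x[i];                      /* zext(x), shared by both uses */
    uint32_t mask32 = (wide < (uint32_t)c) ? UINT32_MAX : 0;
    out[i] = (wide & mask32) | (other[i] & ~mask32);
  }
}

/* Returns 1 if both shapes agree on every possible i8 lane value for the
   given constant c, and 0 otherwise. */
int lanes_agree(uint8_t c) {
  uint8_t x[LANES];
  uint32_t other[LANES], a[LANES], b[LANES];
  for (int base = 0; base < 256; base += LANES) {
    for (int i = 0; i < LANES; i++) {
      x[i] = (uint8_t)(base + i);
      other[i] = 256u - x[i];                  /* false operand, as in the loop */
    }
    select_narrow_cmp(x, c, other, a);
    select_wide_cmp(x, c, other, b);
    for (int i = 0; i < LANES; i++)
      if (a[i] != b[i])
        return 0;
  }
  return 1;
}

/* Example use: check a few constants, including the one from the loop. */
void self_check(void) {
  assert(lanes_agree(0));
  assert(lanes_agree(127));
  assert(lanes_agree(255));
}
```

In this model the only difference between the two functions is where the
mask gets its width, which is the extension the change aims to avoid.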

This improves the generated code for loops like the one below in
combination with D96522.

    #include <stdint.h>

    int foo(uint8_t *p, int N) {
      unsigned long long sum = 0;
      for (int i = 0; i < N ; i++, p++) {
        unsigned int v = *p;
        sum += (v < 127) ? v : 256 - v;
      }
      return sum;
    }

https://clang.godbolt.org/z/Wco866MjY

On the AArch64 cores I have access to, the patch improves performance of
the vector loop by ~10%.

This could be generalized in follow-ups, but the initial version
targets one of the more important cases in combination with D96522.

Alive2 modeling:
* sext EQ https://alive2.llvm.org/ce/z/5upBvb
* sext NE https://alive2.llvm.org/ce/z/zbEcJp
* zext EQ https://alive2.llvm.org/ce/z/_xMwof
* zext NE https://alive2.llvm.org/ce/z/5FwKfc
* zext unsigned predicate: https://alive2.llvm.org/ce/z/iEwLU3
* sext signed predicate: https://alive2.llvm.org/ce/z/aMBega
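
The pairings listed above can also be sanity-checked by brute force on
scalars. The sketch below is not a substitute for the Alive2 proofs. It
assumes i8 lanes widened to i32, uses less-than as a stand-in for the
unsigned and signed predicates, and all helper names (ult8, slt8, zext32,
sext32, count_mismatches, self_check and so on) are hypothetical. Lanes are
kept as raw 8-bit patterns so that the signed or unsigned interpretation is
explicit rather than hidden in C's integer promotions.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow (i8) predicates on raw bit patterns. */
static int ult8(uint8_t a, uint8_t b) { return a < b; }
static int slt8(uint8_t a, uint8_t b) {
  /* Flipping the sign bit maps signed order onto unsigned order. */
  return (uint8_t)(a ^ 0x80) < (uint8_t)(b ^ 0x80);
}
static int eq8(uint8_t a, uint8_t b) { return a == b; }

/* Extensions from i8 to i32. */
static int32_t zext32(uint8_t v) { return (int32_t)v; }
static int32_t sext32(uint8_t v) {
  return (v & 0x80) ? (int32_t)v - 256 : (int32_t)v;
}

/* Wide (i32) predicates. */
static int ult32(int32_t a, int32_t b) { return (uint32_t)a < (uint32_t)b; }
static int slt32(int32_t a, int32_t b) { return a < b; }
static int eq32(int32_t a, int32_t b) { return a == b; }

typedef int (*cmp8_fn)(uint8_t, uint8_t);
typedef int (*cmp32_fn)(int32_t, int32_t);
typedef int32_t (*ext_fn)(uint8_t);

/* Counts the (a, b) pairs of i8 patterns for which the narrow compare and
   the compare on the extended values give different answers. */
int count_mismatches(cmp8_fn narrow, ext_fn ext, cmp32_fn wide) {
  int bad = 0;
  for (int a = 0; a < 256; a++)
    for (int b = 0; b < 256; b++)
      if (narrow((uint8_t)a, (uint8_t)b) !=
          wide(ext((uint8_t)a), ext((uint8_t)b)))
        bad++;
  return bad;
}

void self_check(void) {
  /* zext with an unsigned predicate, sext with a signed predicate. */
  assert(count_mismatches(ult8, zext32, ult32) == 0);
  assert(count_mismatches(slt8, sext32, slt32) == 0);
  /* EQ holds under either extension; NE is its negation. */
  assert(count_mismatches(eq8, zext32, eq32) == 0);
  assert(count_mismatches(eq8, sext32, eq32) == 0);
  /* A mismatched pairing: signed predicate on zero-extended values. */
  assert(count_mismatches(slt8, zext32, slt32) > 0);
}
```

The last assertion illustrates why the extension kind and the predicate
signedness need to match: with zext and a signed predicate, the two compares
disagree whenever exactly one operand has its top bit set.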

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120481