[PATCH] D87936: [GISel] Add new combines for G_ADD

Michael Kitzan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 18 14:57:28 PDT 2020


mkitzan added a comment.

In D87936#2282772 <https://reviews.llvm.org/D87936#2282772>, @tschuett wrote:

> Out of curiosity, is the first rule a win or canonicalisation?

Primarily canonicalization, however depending on the target there can be perf wins. By eliminating the signed constant, there's a better chance the constant can be folded into the opcode as an immediate operand rather than being loaded into a register before being used (if the target supports immediate operands, otherwise nothing lost, eh?).

Assume hypothetical target only supports 2b immediate operands to arithmetic ops and registers are 8b. Here's an example where the combine is a perf win.

  load r1, -1
  add r0, r1
  ---
  sub r0, 1



================
Comment at: llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h:322
+  /// Transform G_ADD(x, -cst) to G_SUB(x, cst).
+  bool matchAddNegConstant(MachineInstr &MI, int64_t &Cst);
+  bool applyAddNegConstant(MachineInstr &MI, int64_t &Cst);
----------------
arsenm wrote:
> APInt to allow it to work for wide integers?
Make sense, will add


================
Comment at: llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir:85
+    $w0 = COPY %5
+...
----------------
arsenm wrote:
> Add an s128 case? This currently won't work for vectors, but it would be nice to handle a few vector tests too
Will add an `s128` and a vector test case for each of these


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll:54-62
+; GFX9-NEXT:    s_lshl_b32 s1, s0, 1
+; GFX9-NEXT:    s_set_gpr_idx_on s1, gpr_idx(SRC0)
 ; GFX9-NEXT:    s_waitcnt vmcnt(0)
 ; GFX9-NEXT:    v_mov_b32_e32 v0, v2
 ; GFX9-NEXT:    v_mov_b32_e32 v1, v3
+; GFX9-NEXT:    s_set_gpr_idx_off
+; GFX9-NEXT:    s_or_b32 s0, s0, 1
----------------
arsenm wrote:
> This is a much more interesting regression
I noticed regressions in this test file too. The combine causing them is the: `G_ADD(x, y)` -> `G_OR(x, y)` (iff x and y share no common bits). That combine rule is the cause of nearly all the AMDGPU test changes (with the exception of three or four tests).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87936/new/

https://reviews.llvm.org/D87936



More information about the llvm-commits mailing list