[PATCH] D135102: [AArch64] Compare BFI and ORR with left-shifted operand for OR instruction selection.

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 31 15:28:05 PDT 2022


mingmingl updated this revision to Diff 472121.
mingmingl retitled this revision from "[AArch64] Combine or(and(val,shifted-mask), op) to or(shl(and(srl(val,N), mask), N), op)" to "[AArch64] Compare BFI and ORR with left-shifted operand for OR instruction selection.".
mingmingl edited the summary of this revision.
mingmingl added a comment.

In order to generate `orr` with shifted operand (not `bfi`) when `orr` is better, this patch does a comparison inside `tryBitfieldInsertOpFromOr` (i.e., after dag-combiner,  alongside cpp-based instruction selection).

This patch optimizes many existing test cases (as shown by updated tests), but introduces one regression.

- Before the patch, the generation of `BFM` tells the bits being used in `Rd` and `Rm` respectively (see getUsefulBitsFromBFM <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp;rcl=482118374;l=2351>)
- After this patch, `ORRWrs op0, op1, lsl #imm` (with shifted register) is generated rather than `BFM` in some cases; however, the bit field usage information in `op0` is not preserved (and there isn't a way to express this in class `SDNode` without adding a specialized that derives from `SDNode`) (for example LoadSDNode <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h;rcl=483937052;l=2344>)

A side question, is it a typical use case to convey metadata (e.g., op0 and op1 inside `ORR op0, op1` contributes bits that doesn't overlap) in the `SDNode` class?

Two alternative options:

1. Introduce a DAG node for BFM, so as to compare BFM and ORR in dag-combiner.
    - The drawback of generating BFM earlier (i.e., in dag-combiner) is that, all other bit-field processing nodes (AND, SHL, etc) need to be taught to combine with BFM. In other words, introducing a BFM dag-node without regressing existing combination requires a lot of work.
  - I had a local patch that actually adds the BFM node, where missed combinations of BFM with existing nodes manifest.
2. Do the transformation in `aarch64-mi-peephole-opt` pass.
  - Since `aarch64-mi-peephole-opt` optimizes based on ISel output, it's a net optimization (compared with current patch, i.e.,  no drawback of lost information).
  - However, the implementation inside `aarch64-mi-peephole-opt` would handle MachineInstructions (and MachineOpcode, i.e., `ISD::AND` fleshes out in many  forms, like AndWrs, AndWrr, etc), and cannot reuse helper functions in the ISel pass.

I think alternative #2 (inside `aarch64-mi-peephole-opt`) is better than alternative #1 (could be a can of worms due to missed combinations between BFM and existing AND/OR nodes), and in some sense better than the current patch (at the cost of more code work)

Feedback/thoughts on where (peephole or the current patch) to pursue this optimization would be appreciated!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135102/new/

https://reviews.llvm.org/D135102

Files:
  llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
  llvm/test/CodeGen/AArch64/arm64-bitfield-extract.ll
  llvm/test/CodeGen/AArch64/arm64-non-pow2-ldst.ll
  llvm/test/CodeGen/AArch64/arm64-strict-align.ll
  llvm/test/CodeGen/AArch64/arm64_32.ll
  llvm/test/CodeGen/AArch64/bfis-in-loop.ll
  llvm/test/CodeGen/AArch64/bitfield-insert.ll
  llvm/test/CodeGen/AArch64/build-pair-isel.ll
  llvm/test/CodeGen/AArch64/funnel-shift-rot.ll
  llvm/test/CodeGen/AArch64/load-combine-big-endian.ll
  llvm/test/CodeGen/AArch64/load-combine.ll
  llvm/test/CodeGen/AArch64/logic-shift.ll
  llvm/test/CodeGen/AArch64/nontemporal-load.ll
  llvm/test/CodeGen/AArch64/rotate-extract.ll
  llvm/test/CodeGen/AArch64/trunc-to-tbl.ll
  llvm/test/CodeGen/AArch64/urem-seteq.ll
  llvm/test/CodeGen/AArch64/vec_uaddo.ll
  llvm/test/CodeGen/AArch64/vec_umulo.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D135102.472121.patch
Type: text/x-patch
Size: 47648 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20221031/4b745039/attachment-0001.bin>


More information about the llvm-commits mailing list