[PATCH] D135102: [AArch64] Combine or(and(val,shifted-mask), op) to or(shl(and(srl(val,N), mask), N), op)

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 11 10:46:46 PDT 2022


mingmingl added a comment.

An update when trying to figure out the cause of indefinite loop:

1. How indefinite loop take place (diff as it is shown)
  1. By rewriting `AND(val, shifted-mask)` to `shl(and(srl(val,N), mask), N)`, the patch creates two more SDNode in dag-combiner; the added nodes could easily interact badly with existing combining logic, causing a repeat of {`node expansion` (as this patch does), `node combination` (existing logic)} in general. One two-line LLMV IR is attached in [1] to exemplify this.
  2. Indefinite loop could be solved by adding these lines [2] (atop current patch), but it's hard to prove rewriting one dag node to three dag nodes (`AND(val, shifted-mask)` to `shl(and(srl(val,N), mask), N)`) is not fragile (i.e., interact badly with future combination logic)

2. The motivating test case (including {5-line C++  =, current codegen, optimal codegen}) is https://godbolt.org/z/h96b1sGco
  - The source code of over-eager BFM usage is this <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2829> in AArch64DAGToDAGISel::Select <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2829> -> there is no DAG node for BFM instruction -> the `BFM selection` happens after dag-combiner, and is written in C++ to see through //bit-simplification opportunities// between two operands
  - What //bit-simplification opportunities// means -> getUsefulBits <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2821> scans uses of `ISD::OR` (with a limited recursion depth) to shrink the number of useful bits -> if bits could be proved not useful (by users), usage of BFM eliminates DAG nodes and thereby reducing the number of instructions (code <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2740-L2744>)

What I learnt from 1 and 2

- //bit-simplification opportunities// from BFM should be retained, since in the best cases it eliminates instructions
- Instruction selection needs to be enhanced to choose between `ORR` (with shifted register) and `BFM` (for the motivating test case in 2); one way to do it, is to introduce SelectionDAG node for 'BFM' instruction and let instructions go through DAG-Combiner for evaluation, and re-write getUsefulBits <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2443> function (which relies on MachineOpCode <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2398-L2403> (i.e., users of a SDNode already being selected <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2398>) now) so it could analyze useful bits when SDNodes are not selected yet.

To enhance ISel to choose between 'ORR' and 'BFM', I'm planning on changes to adding SelectionDAG Node for 'BFM' (and probably UBFM since UBFM is helpful to see through useful bits <https://github.com/llvm/llvm-project/blob/4e5568d92d2cebdb06b4255ecf5e161acff55a2b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp#L2415> ). Going to send them out in stacked diffs..


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135102/new/

https://reviews.llvm.org/D135102



More information about the llvm-commits mailing list