[PATCH] D138107: [AArch64][MachineCombiner] Update isAssociativeAndCommutative
KAWASHIMA Takahiro via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 16 02:23:10 PST 2022
kawashima-fj created this revision.
kawashima-fj added reviewers: dmgreen, fhahn, t.p.northover.
kawashima-fj added a project: Backend.
Herald added subscribers: ctetreau, hiraditya, kristof.beyls.
Herald added a project: All.
kawashima-fj requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
This commit adds opcodes for `ADD`, `MUL`, `AND`, `ORR`, and `EOR` Base/SIMD/SVE instructions and missing opcodes for `FADD` and `FMUL` FP/SIMD/SVE instructions to the `isAssociativeAndCommutative` function. Also, it removes opcodes for the `FMULX` instruction, which is not associative (bug fix).
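One way to see why `FMULX` cannot be treated as associative is the sketch below. It is a software model, not LLVM or hardware code: `fmulxModel` and `kInf` are hypothetical names, and the model assumes only the architecturally documented special case in which zero times infinity yields ±2.0 instead of NaN.

```cpp
#include <cmath>
#include <limits>

// Hypothetical constant used only by this illustration.
const double kInf = std::numeric_limits<double>::infinity();

// Hypothetical scalar model of FMULX (an assumption for illustration):
// behaves like an ordinary multiply, except that (+/-0) * (+/-inf)
// returns +/-2.0, with the sign being the XOR of the operand signs.
double fmulxModel(double a, double b) {
  const bool zeroTimesInf =
      (a == 0.0 && std::isinf(b)) || (std::isinf(a) && b == 0.0);
  if (zeroTimesInf) {
    const bool negative = std::signbit(a) != std::signbit(b);
    return negative ? -2.0 : 2.0;
  }
  return a * b;
}
```

With operands 0, infinity, and 3, the grouping `fmulxModel(fmulxModel(0.0, kInf), 3.0)` gives 6, while `fmulxModel(0.0, fmulxModel(kInf, 3.0))` gives 2, so regrouping changes the result.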
This helps the existing Machine InstCombiner pass increase instruction-level parallelism (ILP). This supersedes D132828 <https://reviews.llvm.org/D132828>, which implements tree height reduction in a new LLVM IR pass. The advantages of using the existing Machine InstCombiner pass are (1) more precise cost estimation, (2) no redundant processing, and (3) less compile-time impact. The disadvantages are (4) a per-target `isAssociativeAndCommutative` implementation and (5) constraints imposed by the instruction set (see the comment for `MULWrr` in `AArch64InstrInfo::isAssociativeAndCommutative`). In addition, (6) the resulting sequence of instructions may not be optimal in terms of ILP in some cases, because the algorithm in `TargetInstrInfo::getMachineCombinerPatterns` in the Machine InstCombiner pass is simpler than that of D132828 <https://reviews.llvm.org/D132828>. Nonetheless, it generates a fairly good sequence of instructions.
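The effect of reassociation on ILP can be illustrated with a small standalone sketch. It is not the algorithm in `TargetInstrInfo::getMachineCombinerPatterns`, which rewrites short instruction patterns rather than building a fully balanced tree. The names `Sum`, `sumChain`, and `sumBalanced` are hypothetical; the sketch only compares the dependency depth of a linear chain with that of a balanced grouping of the same associative and commutative operation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical type for this illustration only.
struct Sum {
  long value;      // result of the additions
  unsigned depth;  // longest chain of dependent additions
};

// Linear chain: ((v0 + v1) + v2) + ...; every add waits for the previous one.
Sum sumChain(const std::vector<long>& v) {
  if (v.empty()) return {0, 0};
  Sum s{v[0], 0};
  for (std::size_t i = 1; i < v.size(); ++i) {
    s.value += v[i];
    ++s.depth;
  }
  return s;
}

// Balanced grouping over [lo, hi): the two halves are independent,
// so they could issue on separate pipelines.
Sum sumBalanced(const std::vector<long>& v, std::size_t lo, std::size_t hi) {
  if (lo == hi) return {0, 0};
  if (hi - lo == 1) return {v[lo], 0};
  const std::size_t mid = lo + (hi - lo) / 2;
  const Sum left = sumBalanced(v, lo, mid);
  const Sum right = sumBalanced(v, mid, hi);
  return {left.value + right.value, 1 + std::max(left.depth, right.depth)};
}
```

For four operands, the chain has a dependency depth of 3 while the balanced form `(v0 + v1) + (v2 + v3)` has a depth of 2, and both produce the same value. This regrouping is only valid for operations that are associative and commutative.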
I ran the C/C++ benchmarks in SPECrate 2017 on a Fujitsu A64FX processor, which has two pipelines each for integer operations and SIMD/FP operations <https://github.com/fujitsu/A64FX/>. 511.povray_r improved by 4%. The other benchmarks (int: 500, 502, 505, 520, 523, 525, 531, 541, 557; fp: 508, 510, 519, 538, 544) were within 1% up or down. For a synthetic benchmark, the patch doubled the performance.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D138107
Files:
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll
llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll
llvm/test/CodeGen/AArch64/arm64-rev.ll
llvm/test/CodeGen/AArch64/cmp-chains.ll
llvm/test/CodeGen/AArch64/machine-combiner.ll
llvm/test/CodeGen/AArch64/reduce-and.ll
llvm/test/CodeGen/AArch64/reduce-or.ll
llvm/test/CodeGen/AArch64/reduce-shuffle.ll
llvm/test/CodeGen/AArch64/reduce-xor.ll
llvm/test/CodeGen/AArch64/swift-return.ll
llvm/test/CodeGen/AArch64/vecreduce-and-legalization.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D138107.475717.patch
Type: text/x-patch
Size: 57523 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20221116/91c1ff1b/attachment.bin>