[PATCH] D99662: [AArch64] Add Machine InstCombiner patterns for FMUL indexed variant
Andrew Savonichev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 9 07:26:43 PDT 2021
asavonic added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4553
+ case AArch64::FMULv2f32:
+ assert(Root.getOperand(1).isReg() && Root.getOperand(2).isReg());
+ Found = Match(AArch64::DUPv2i32lane, 1, MCP::FMULv2i32_indexed_OP1);
----------------
SjoerdMeijer wrote:
> Nit: this assert can be hoisted out of this switch?
I think we can remove remove the assert completely. The `Match` function checks the same condition already.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4802
+static MachineInstr *genIndexedMultiply(
+ MachineFunction &MF, MachineRegisterInfo &MRI, const TargetInstrInfo *TII,
+ MachineInstr &Root, SmallVectorImpl<MachineInstr *> &InsInstrs,
----------------
SjoerdMeijer wrote:
> Nit: I think you can get the MF via a few `getParent()` calls on `Root`, so you don't have to pass it. From memory I can't remember if that is also true fro MRI and TII, but I don't have strong opinions on this.
Thanks! We can get all three arguments from `Root`.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5790
+ Opc = AArch64::FMULv2i32_indexed;
+ genIndexedMultiply(MF, MRI, TII, Root, InsInstrs, IdxDupOp, Opc, RC);
+ break;
----------------
SjoerdMeijer wrote:
> Nit: perhaps pass `RC` and `Opc` directly in here? Don't really need these assignments?
Done.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5821
+ case MachineCombinerPattern::FMULv8i16_indexed_OP1:
+ case MachineCombinerPattern::FMULv8i16_indexed_OP2: {
+ RC = &AArch64::FPR128RegClass;
----------------
SjoerdMeijer wrote:
> Are we missing tests for these cases?
> For this you'll probably need to add `-mattr=+fullfp16` to a RUN line.
Thanks a lot! I didn't know about `-mattr=+fullfp16`.
================
Comment at: llvm/test/CodeGen/AArch64/arm64-fma-combines.ll:164
+ %add = fadd fast <2 x double> %mul, %ad
+ store <2 x double> %add, <2 x double>* %ret, align 16
+ br label %for.cond
----------------
SjoerdMeijer wrote:
> Nit: can we just do a `ret %add` here?
This an attempt to keep `shuffle` and `mul` in different basic blocks. If we return in the second basic block, the optimizer merges them into one before we reach instruction selection.
In the latest revision of the patch I changed the tests to ensure that this never happens.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99662/new/
https://reviews.llvm.org/D99662
More information about the llvm-commits
mailing list