[PATCH] D99662: [AArch64] Add Machine InstCombiner patterns for FMUL indexed variant
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 7 11:23:45 PDT 2021
SjoerdMeijer added a comment.
Sorry for the delay, mostly nits inlined, one question about missing f16 tests.
But the other thing I was just wondering, not that I mind these patterns here, but are we not expecting that the VDUP is sunk to its user? I think that's probably what I would expect, but don't know if that is a fair expectation.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4553
+ case AArch64::FMULv2f32:
+ assert(Root.getOperand(1).isReg() && Root.getOperand(2).isReg());
+ Found = Match(AArch64::DUPv2i32lane, 1, MCP::FMULv2i32_indexed_OP1);
----------------
Nit: this assert can be hoisted out of this switch?
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4802
+static MachineInstr *genIndexedMultiply(
+ MachineFunction &MF, MachineRegisterInfo &MRI, const TargetInstrInfo *TII,
+ MachineInstr &Root, SmallVectorImpl<MachineInstr *> &InsInstrs,
----------------
Nit: I think you can get the MF via a few `getParent()` calls on `Root`, so you don't have to pass it. From memory I can't remember if that is also true fro MRI and TII, but I don't have strong opinions on this.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5790
+ Opc = AArch64::FMULv2i32_indexed;
+ genIndexedMultiply(MF, MRI, TII, Root, InsInstrs, IdxDupOp, Opc, RC);
+ break;
----------------
Nit: perhaps pass `RC` and `Opc` directly in here? Don't really need these assignments?
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5821
+ case MachineCombinerPattern::FMULv8i16_indexed_OP1:
+ case MachineCombinerPattern::FMULv8i16_indexed_OP2: {
+ RC = &AArch64::FPR128RegClass;
----------------
Are we missing tests for these cases?
For this you'll probably need to add `-mattr=+fullfp16` to a RUN line.
================
Comment at: llvm/test/CodeGen/AArch64/arm64-fma-combines.ll:164
+ %add = fadd fast <2 x double> %mul, %ad
+ store <2 x double> %add, <2 x double>* %ret, align 16
+ br label %for.cond
----------------
Nit: can we just do a `ret %add` here?
================
Comment at: llvm/test/CodeGen/AArch64/arm64-fma-combines.ll:179
+ %add = fadd fast <4 x float> %mul, %ad
+ store <4 x float> %add, <4 x float>* %ret, align 16
+ br label %for.cond
----------------
Same?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99662/new/
https://reviews.llvm.org/D99662
More information about the llvm-commits
mailing list