[PATCH] D99662: [AArch64] Add Machine InstCombiner patterns for FMUL indexed variant
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 9 10:53:37 PDT 2021
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.
Thanks, LGTM
================
Comment at: llvm/test/CodeGen/AArch64/arm64-fma-combines.ll:164
+ %add = fadd fast <2 x double> %mul, %ad
+ store <2 x double> %add, <2 x double>* %ret, align 16
+ br label %for.cond
----------------
asavonic wrote:
> SjoerdMeijer wrote:
> > Nit: can we just do a `ret %add` here?
> This an attempt to keep `shuffle` and `mul` in different basic blocks. If we return in the second basic block, the optimizer merges them into one before we reach instruction selection.
> In the latest revision of the patch I changed the tests to ensure that this never happens.
>
Ok, thanks.
================
Comment at: llvm/test/CodeGen/AArch64/arm64-fma-combines.ll:1
-; RUN: llc < %s -O=3 -mtriple=arm64-apple-ios -mcpu=cyclone -enable-unsafe-fp-math | FileCheck %s
+; RUN: llc < %s -O=3 -mtriple=arm64-apple-ios -mcpu=cyclone -mattr=+fullfp16 -enable-unsafe-fp-math | FileCheck %s
define void @foo_2d(double* %src) {
----------------
I don't know if the cyclone CPU supports FP16, I guess not, but just for testing purposes having it here I think is fine.
================
Comment at: llvm/test/CodeGen/AArch64/machine-combiner-fmul-dup.mir:215
+...
+# CHECK-LABEL: name: indexed_4s
+# CHECK: [[OP1COPY:%.*]]:fpr128 = COPY $q1
----------------
For consistency probably best to add an indexed_8h test here too.If you're confident doing this, you can just do this without another review and commit it, but otherwise happy to look again. Similarly, you will probably need -mattr=+fullfp16 on the RUN line.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99662/new/
https://reviews.llvm.org/D99662
More information about the llvm-commits
mailing list