[PATCH] D117003: [SchedModels][CortexA55] Add ASIMD integer instructions
Pavel Kosov via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 7 22:36:42 PST 2022
kpdev42 added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64SchedA55.td:494
+// COPY
+def : InstRW<[CortexA55WriteCOPY], (instrs COPY)>;
}
----------------
dmgreen wrote:
> Does this add a lot? It's not really how COPYs work.
According to our experiments, an FPU copy (fmov) has a latency of 1 cycle and a throughput of 2 (1 for the Q-form). According to the current model, an integer ALU copy has a 3-cycle latency. What would be the correct model for COPY, in your opinion?
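To make the gap between the two figures concrete, here is a rough Python sketch (not part of the patch) of how latency and throughput bound the cycle count for a run of copies. The function name `estimate_cycles` and the simple in-order model are assumptions made for illustration, and "throughput of 2" is read here as two instructions issued per cycle, which is also an assumption. The 1-cycle and 3-cycle latencies are the figures quoted above.

```python
import math

def estimate_cycles(n_instrs, latency, per_cycle, dependent):
    """Rough lower bound on cycles for n_instrs identical instructions.

    dependent=True:  each instruction consumes the previous result,
                     so the chain is bound by latency alone.
    dependent=False: instructions are independent, so the issue rate
                     (per_cycle) is the bound, plus the latency of the
                     last instruction issued.
    """
    if n_instrs == 0:
        return 0
    if dependent:
        return n_instrs * latency
    issue_cycles = math.ceil(n_instrs / per_cycle)
    return issue_cycles - 1 + latency

# Chain of 8 dependent copies: measured fmov latency vs. modelled ALU latency.
measured = estimate_cycles(8, latency=1, per_cycle=2, dependent=True)
modelled = estimate_cycles(8, latency=3, per_cycle=2, dependent=True)
print(measured, modelled)
```

Under these assumptions, a dependent chain is where the 1-cycle versus 3-cycle choice matters most, since every copy in the chain pays the full latency.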
================
Comment at: llvm/test/tools/llvm-mca/AArch64/Cortex/A55-neon-instructions.s:2506
# CHECK-NEXT: - - - - - - - - - 2.00 - - ld4r { v0.2s, v1.2s, v2.2s, v3.2s }, [sp], x30
-# CHECK-NEXT: - - - - 0.50 0.50 - - - - - - mla v0.8b, v0.8b, v0.8b
-# CHECK-NEXT: - - - - 0.50 0.50 - - - - - - mls v0.4h, v0.4h, v0.4h
+# CHECK-NEXT: - - - - - - - 0.50 0.50 - - - mla v0.8b, v0.8b, v0.8b
+# CHECK-NEXT: - - - - - - - 0.50 0.50 - - - mls v0.4h, v0.4h, v0.4h
----------------
dmgreen wrote:
> What is the reasoning for the integer multiplies going down the FPMAC pipeline?
I guess mla/mls (ASIMD multiply/accumulate) use the NEON pipeline. For some reason, the 2 NEON pipelines of Cortex-A55 are modelled as 5 pipelines (2 x FPALU, 2 x FPMAC, 1 x FPDIV). What do you think would be the correct resource assignment for mla/mls?
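As a reading aid for the `0.50 0.50` entries in the CHECK lines above, here is a small Python sketch of how one cycle charged to a two-unit resource could show up as 0.50 on each unit. The `UNITS` table reuses the unit counts from the comment (2 x FPALU, 2 x FPMAC, 1 x FPDIV). The helper name `pressure_per_unit` and the even-split rule are assumptions for illustration, not llvm-mca's actual implementation.

```python
# Unit counts taken from the comment above; the dictionary itself is
# a hypothetical stand-in for the scheduling model's resource list.
UNITS = {"FPALU": 2, "FPMAC": 2, "FPDIV": 1}

def pressure_per_unit(resource, cycles=1.0):
    """Spread `cycles` of pressure evenly over the units of `resource`.

    Assumption: averaged over many iterations, each unit of a
    multi-unit resource receives an equal share of the pressure.
    """
    n = UNITS[resource]
    return {f"{resource}{i}": cycles / n for i in range(n)}

# One cycle on FPALU (old CHECK lines) vs. one cycle on FPMAC (new ones).
old = pressure_per_unit("FPALU")
new = pressure_per_unit("FPMAC")
print(old)
print(new)
```

In this simplified view, moving mla/mls from one two-unit resource to another leaves the per-unit value at 0.50 and only changes which columns carry it, which is the shape of the change in the test diff.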
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117003/new/
https://reviews.llvm.org/D117003