[PATCH] D142359: [TTI][AArch64] Cost model vector INS instructions

Mon Feb 20 04:44:28 PST 2023

dmgreen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64Subtarget.cpp:216-217
     PrefLoopLogAlignment = 5;
     MaxBytesForLoopAlignment = 16;
+    VectorInsertExtractBaseCost = 2;
     break;
----------------
ktkachov wrote:
> Beyond David's comments, I'll note that there are other microarchitectures that are similar to N1 and so should benefit from this, particularly Cortex-A76 and later CPUs in that family.
Yeah - I would say that we should change this globally for all AArch64 cores, not just some, i.e. change the default value to 2. (In the tests I ran, the in-order cores actually did better than the out-of-order ones!) I believe this cost is more about controlling the code quality of SLP-vectorized code than about the actual cost of inserts/extracts. We need to get the overall performance better first, though.
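
To illustrate why the value acts mostly as an SLP tuning knob, here is a rough toy model of a profitability check. It is not the LLVM implementation: `ToySubtarget`, `toySLPCostDelta`, `toyIsProfitable` and all the numbers are hypothetical, and only the field name is borrowed from the diff above. It assumes every scalar or vector op costs 1 and every insert/extract costs the base value.

```cpp
#include <cassert>

// Toy model only, not LLVM code. All names and costs are hypothetical.
struct ToySubtarget {
  int VectorInsertExtractBaseCost; // same name as the field in the diff
};

// Vector cost minus scalar cost for a candidate tree.
// Assumes each op costs 1 and each insert/extract costs the base cost.
// A negative result means the vector form looks cheaper.
inline int toySLPCostDelta(const ToySubtarget &ST, int NumScalarOps,
                           int NumVectorOps, int NumInsertsExtracts) {
  int ScalarCost = NumScalarOps;
  int VectorCost =
      NumVectorOps + NumInsertsExtracts * ST.VectorInsertExtractBaseCost;
  return VectorCost - ScalarCost;
}

// Vectorize only when the modelled vector cost is strictly lower.
inline bool toyIsProfitable(const ToySubtarget &ST, int NumScalarOps,
                            int NumVectorOps, int NumInsertsExtracts) {
  return toySLPCostDelta(ST, NumScalarOps, NumVectorOps,
                         NumInsertsExtracts) < 0;
}

// Example: 6 scalar ops become 1 vector op plus 2 inserts/extracts.
inline void toyExample() {
  ToySubtarget Low{2};  // the value proposed in this patch
  ToySubtarget High{3}; // an illustrative higher value
  assert(toySLPCostDelta(Low, 6, 1, 2) == -1); // 1 + 2*2 - 6
  assert(toySLPCostDelta(High, 6, 1, 2) == 1); // 1 + 2*3 - 6
}
```

In this sketch the same tree flips from rejected to accepted when the base cost drops by one, which is the kind of borderline SLP decision the setting ends up steering.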

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142359/new/

https://reviews.llvm.org/D142359