[PATCH] D132185: [TTI][AArch64] Update vector extract cost for Neoverse-N1.

Thu Aug 18 19:11:09 PDT 2022

mingmingl added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64Subtarget.cpp:191
     MaxBytesForLoopAlignment = 16;
+    VectorInsertExtractBaseCost = 1;
     break;
----------------
Nittiest nit: this changes the cost for both extract and insert, while the summary mostly mentions the EXT instruction cost. It might be good to call out that INS has a latency of 2 and a throughput of 2 (unless it's a common assumption that extract and insert instructions have the same cost).

Also, from the studies in D128302, I think the cost of extract/insert is better modeled by taking the user instruction into account (e.g., if the user instruction can access the lane directly, the extract could be folded into the user in the emitted code and have no cost). Nevertheless, my gut feeling is that 3 is a high number (for instructions with a latency of 2 and a throughput of 2); I'm not sure whether 1 is too small.
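
To make the user-aware idea concrete, here is a rough standalone sketch. It is not LLVM code: `SubtargetModel`, `UserKind`, and `extractCost` are hypothetical names invented for illustration, and only `VectorInsertExtractBaseCost` mirrors the field touched by this patch. It assumes a folded extract is free and that every other extract pays the base cost.

```cpp
// Hypothetical stand-in for the subtarget field changed in this patch.
struct SubtargetModel {
  int VectorInsertExtractBaseCost;
};

// Hypothetical classification of the instruction that consumes the lane.
enum class UserKind {
  ReadsLaneDirectly,   // assumed: user can address the vector lane itself
  NeedsScalarRegister  // assumed: user needs the value moved out of the vector
};

// Sketch of a user-aware extract cost: free when the extract can be folded
// into its user, otherwise the subtarget's base cost.
int extractCost(const SubtargetModel &ST, UserKind User) {
  if (User == UserKind::ReadsLaneDirectly)
    return 0;
  return ST.VectorInsertExtractBaseCost;
}
```

With a model like this, the base cost only matters for extracts that cannot be folded, so the choice between 1 and 3 would affect fewer cases than it does today.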

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132185/new/

https://reviews.llvm.org/D132185