[PATCH] D142359: [TTI][AArch64] Cost model vector INS instructions
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 23 07:04:24 PST 2023
SjoerdMeijer added a comment.
> I see a 2.5% perf uplift for x264 with this on the V1.
I haven't analysed the reasons for this, but it's a nice bonus on top of making INS a bit cheaper, which seems more accurate.
================
Comment at: llvm/test/Analysis/CostModel/AArch64/insert-extract.ll:167
; NEO-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v1 = load i64, ptr %i, align 8
-; NEO-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v2 = insertelement <2 x i64> %vec, i64 %v1, i32 0
+; NEO-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v2 = insertelement <2 x i64> %vec, i64 %v1, i32 0
; NEO-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <2 x i64> %v2
----------------
In D141602, we made the indexed LD1 a bit more expensive by costing it as `ST->getVectorInsertExtractBaseCost() + 1`, which raised the cost here from 3 to 4. Because this patch lowers `getVectorInsertExtractBaseCost()` to 2, the cost of the indexed LD1 is back to 3.
I don't think I am too unhappy with all of this: INS is a bit cheaper, which I think it is or should be, and LD1 is a bit more expensive.
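
To make the arithmetic explicit, here is a minimal sketch of the numbers above. It is not the actual TTI code: the constants and function names are hypothetical stand-ins. The old base cost of 3 is inferred from the 3 to 4 change in D141602, and modelling a plain INS as exactly the base cost is a simplifying assumption.

```cpp
// Hypothetical stand-ins for ST->getVectorInsertExtractBaseCost();
// these names do not exist in LLVM.
constexpr int kOldInsertExtractBaseCost = 3; // inferred: 4 - 1, before this patch
constexpr int kNewInsertExtractBaseCost = 2; // value set by this patch

// Assumption: a plain INS is modelled as just the base cost.
constexpr int insCost(int baseCost) { return baseCost; }

// A load feeding an insertelement (indexed LD1) is costed as
// base cost + 1, as described for D141602.
constexpr int indexedLd1Cost(int baseCost) { return baseCost + 1; }

// After D141602 but before this patch: 3 + 1 = 4.
static_assert(indexedLd1Cost(kOldInsertExtractBaseCost) == 4,
              "indexed LD1 cost with the old base cost");

// With this patch: 2 + 1 = 3, matching the updated NEO-NEXT check line.
static_assert(indexedLd1Cost(kNewInsertExtractBaseCost) == 3,
              "indexed LD1 cost with the new base cost");

// In this model INS ends up one unit cheaper than the indexed LD1.
static_assert(insCost(kNewInsertExtractBaseCost) + 1 ==
                  indexedLd1Cost(kNewInsertExtractBaseCost),
              "LD1 is costed one above INS");
```

The checks are compile-time only, so the snippet simply needs to compile; it says nothing about the other cases the real cost model handles.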
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142359/new/
https://reviews.llvm.org/D142359