[PATCH] D142359: [TTI][AArch64] Cost model vector INS instructions

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 23 07:04:24 PST 2023


SjoerdMeijer added a comment.

> I see a 2.5% perf uplift for x264 with this on the V1.

I haven't analysed the reasons for this, but it's a nice bonus on top of making INS a bit cheaper, which seems more accurate.



================
Comment at: llvm/test/Analysis/CostModel/AArch64/insert-extract.ll:167
 ; NEO-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v1 = load i64, ptr %i, align 8
-; NEO-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2 = insertelement <2 x i64> %vec, i64 %v1, i32 0
+; NEO-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2 = insertelement <2 x i64> %vec, i64 %v1, i32 0
 ; NEO-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret <2 x i64> %v2
----------------
In D141602, we made the indexed LD1 a bit more expensive by costing it as `ST->getVectorInsertExtractBaseCost() + 1`, which made the cost here go up from 3 to 4. But because this patch lowers the value returned by `getVectorInsertExtractBaseCost()` to 2, the cost of the indexed LD1 is back to 3 (2 + 1).

I don't think I am too unhappy with all of this: INS is a bit cheaper, which I think it is or should be, and LD1 is a bit more expensive.
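
To make the arithmetic above concrete, here is a minimal standalone sketch. It is not the actual AArch64 TTI code: the constants and the helpers `insCost` and `indexedLd1Cost` are hypothetical names for illustration, the old base cost of 3 is inferred from the 3 to 4 change described above, and costing a plain INS at exactly the base cost is an assumption.

```cpp
// Hypothetical stand-ins for the value returned by
// ST->getVectorInsertExtractBaseCost() before and after this patch.
// The old value (3) is inferred from the review comment, not read from the code.
constexpr int OldInsertExtractBaseCost = 3;
constexpr int NewInsertExtractBaseCost = 2;

// Assumption: a plain INS is costed at the base cost.
constexpr int insCost(int BaseCost) { return BaseCost; }

// Since D141602, the indexed LD1 is costed at the base cost plus one.
constexpr int indexedLd1Cost(int BaseCost) { return BaseCost + 1; }

// Before this patch: the indexed LD1 costs 3 + 1 = 4.
static_assert(indexedLd1Cost(OldInsertExtractBaseCost) == 4,
              "indexed LD1 cost before this patch");

// After this patch: the indexed LD1 costs 2 + 1 = 3.
static_assert(indexedLd1Cost(NewInsertExtractBaseCost) == 3,
              "indexed LD1 cost after this patch");

// Under the stated assumption, INS drops from 3 to 2, so LD1 stays
// one unit more expensive than INS.
static_assert(insCost(NewInsertExtractBaseCost) == 2,
              "INS cost after this patch");
```

The point of the sketch is only that the `+ 1` from D141602 is kept, so the indexed LD1 stays more expensive than INS even though its absolute cost returns to 3.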


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142359/new/

https://reviews.llvm.org/D142359