[PATCH] D40008: [X86][TTI] update costs of interleaved load\store of i64\double

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 26 04:39:45 PDT 2021


RKSimon added a comment.

In D40008#2715994 <https://reviews.llvm.org/D40008#2715994>, @lebedev.ri wrote:

> @RKSimon @magabari I'd like to add some more tuples, but I have a question: how are the costs actually derived?
> For example, the assembly for interleaved load of i16 w/ stride 2: https://godbolt.org/z/hjb3d5x6E
> What's it cost? I'm guessing it's not just `10`, aka the instruction count excluding the loads/stores?
> Is it 5 from `Block RThroughput: 4.8` from MCA: https://godbolt.org/z/fxYcEj3Wx ?
> Which CPU should be used for these numbers?

I believe they were taken from IACA, probably targeting a Haswell CPU; the reciprocal throughput reported by llvm-mca should be similar.

Usually, for cost tables, we compare numbers from CPUs of a similar spec (for AVX2: Haswell/Ryzen) and choose the worst.
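
As a rough illustration of "compare similar CPUs and choose the worst", here is a minimal Python sketch. It is not how the X86 cost tables are actually generated. The helper name `table_cost`, the rounding up to a whole number, and the `znver1` figure are all assumptions made for the example. Only the 4.8 value comes from the question quoted above, and attributing it to Haswell is also an assumption.

```python
import math


def table_cost(rthroughput_by_cpu):
    """Return an integer cost from per-CPU reciprocal throughputs.

    Hypothetical helper: takes the worst (largest) reciprocal throughput
    across the given CPUs and rounds it up to a whole number.
    """
    if not rthroughput_by_cpu:
        raise ValueError("need at least one CPU measurement")
    worst = max(rthroughput_by_cpu.values())
    return math.ceil(worst)


measurements = {
    # Block RThroughput quoted in the thread; the CPU label is an assumption.
    "haswell": 4.8,
    # Hypothetical placeholder, NOT a measured value.
    "znver1": 4.0,
}

print(table_cost(measurements))
```

With these inputs the worst value is 4.8, which rounds up to 5, matching the "Is it 5" guess in the question.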


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D40008/new/

https://reviews.llvm.org/D40008


