[PATCH] D40008: [X86][TTI] update costs of interleaved load\store of i64\double
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 26 04:39:45 PDT 2021
RKSimon added a comment.
In D40008#2715994 <https://reviews.llvm.org/D40008#2715994>, @lebedev.ri wrote:
> @RKSimon @magabari I'd like to add some more tuples, but i have a question: how are the costs actually derived?
> For example, the assembly for interleaved load of i16 w/ stride 2: https://godbolt.org/z/hjb3d5x6E
> What's it cost? I'm guessing it's not just `10`, aka the instruction count excluding the loads/stores?
> Is it 5 from `Block RThroughput: 4.8` from MCA: https://godbolt.org/z/fxYcEj3Wx ?
> Which CPU should be used for these numbers?
I believe they were taken from IACA, probably targeting a Haswell CPU; the reciprocal throughput reported by llvm-mca should be similar.
Usually for cost tables we compare the numbers from CPUs of similar spec (for AVX2: Haswell/Ryzen) and choose the worst.
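
As a rough sketch of that "choose the worst" rule (not the actual procedure used to produce the existing table entries): take the reciprocal throughput measured on each comparable CPU and keep the largest. Rounding up to an integer is an assumption here, as are the function name and the CPU labels; 4.8 is the llvm-mca figure quoted above, and the second value is a made-up placeholder.

```python
import math


def candidate_cost(rthroughput_by_cpu):
    """Pick the worst (largest) reciprocal throughput across CPUs.

    Rounding up to an integer is an assumption made for this sketch.
    """
    return max(math.ceil(value) for value in rthroughput_by_cpu.values())


# "cpu_a" uses the Block RThroughput quoted in this thread;
# "cpu_b" is a hypothetical placeholder, not a measured number.
measurements = {"cpu_a": 4.8, "cpu_b": 4.0}
print(candidate_cost(measurements))
```

Real entries would use the numbers reported by llvm-mca (or IACA) for each CPU being compared.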
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D40008/new/
https://reviews.llvm.org/D40008