[PATCH] D103144: [X86][Costmodel] Load/store v2i16 VF=2 interleaving costs
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 26 15:07:29 PDT 2021
lebedev.ri added a comment.
@RKSimon to be more specific: what I want here is to agree on the algorithm, namely:
1. pick one IR sequence (from the codegen tests I've already added in `llvm/test/CodeGen/X86/vector-interleaved-{load,store}-i16-stride-[2-6].ll`)
2. annotate the [de]interleaving shuffle block with MCA macros
3. for each CPU (that has AVX2 but not AVX512) for which we have a sched model:
  1. codegen the IR for the given CPU
  2. do NO manual changes to the produced assembly whatsoever
  3. record the `Block RThroughput` that MCA produces for that CPU/sched model
4. pick the largest recorded `Block RThroughput` (rounding 0.999->1, 1.5->1, 1.50001->2) as the cost for the tuple
... so I can proceed to add the missing costs without nagging you to double-check the findings.
At least that is what I would like to happen. If that is infeasible, I can submit them as reviews.
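For illustration only, the recording and selection steps (3.3 and 4) could be sketched roughly as below. This is a minimal sketch, not the actual tooling: the function names are hypothetical, the report text is assumed to be the captured stdout of an MCA run for one CPU, and the tie-breaking rule (round to nearest, ties going down) is inferred from the three rounding examples above.

```python
import re
from decimal import Decimal, ROUND_HALF_DOWN
from typing import Mapping

# Matches the summary line of an MCA report, e.g. "Block RThroughput: 1.5".
_RTHROUGHPUT = re.compile(
    r"^Block RThroughput:\s*([0-9]+(?:\.[0-9]+)?)\s*$", re.MULTILINE
)


def block_rthroughput(mca_report: str) -> Decimal:
    """Extract the first 'Block RThroughput' value from one report (hypothetical helper)."""
    match = _RTHROUGHPUT.search(mca_report)
    if match is None:
        raise ValueError("no 'Block RThroughput' line found")
    return Decimal(match.group(1))


def round_cost(value: Decimal) -> int:
    """Round to the nearest integer, ties going down: 0.999->1, 1.5->1, 1.50001->2."""
    return int(value.quantize(Decimal("1"), rounding=ROUND_HALF_DOWN))


def interleaving_cost(reports_by_cpu: Mapping[str, str]) -> int:
    """Take the largest recorded value across all CPUs, then round it to a cost."""
    worst = max(block_rthroughput(report) for report in reports_by_cpu.values())
    return round_cost(worst)
```

The rounding is applied after taking the maximum, matching the order in step 4. Values are parsed into `Decimal` rather than `float` so that a tie such as 1.5 is represented exactly.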
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103144/new/
https://reviews.llvm.org/D103144