[llvm] [RISCV] Account for factor in interleave memory op costs (PR #111511)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 18 12:06:29 PDT 2024
preames wrote:
Chatted w/Luke about this offline, and settled on an approach. Starting with review just so we're all on the same page:
* Our existing known hardware differs in which segments are fast, but appears to have more or less the same behavior for the fast and slow implementations respectively. x60 appears to have fast NF=2,3,4, whereas x280 only has fast NF=2.
* The current cost model reports costs lower than even the best "fast" case. So we're currently under-costing all factors.
Tentative plan is to do the following:
* Implement both a fast and a slow variant of the cost model. Default NF=2 to the fast one, and all others to the slow one. This covers the intersection of fast cases in the wild today. This will be done in this review.
* As a separate follow up change, consider adding tuning flags so that code compiled specifically for the x60 can benefit from fast NF=3,4.
* As a separate follow up change, explore alternate lowerings for NF=2. Even with the "fast" segment load, some preliminary tests seem to indicate that a wide load and two vnsrl are still faster.
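For anyone skimming, here is a rough standalone sketch of the fast/slow split described in the first bullet of the plan. It is not the code in the PR: the struct, the function names, the `MaxFastFactor` knob, and both cost formulas are assumptions made up for illustration (fast modeled as one wide memory op plus per-field shuffle work, slow modeled as proportional to the number of elements touched).

```cpp
#include <cstdint>

// Hypothetical, simplified description of a segment load/store.
// None of these names come from LLVM.
struct SegmentOp {
  unsigned Factor;      // NF: number of interleaved fields
  unsigned NumElements; // elements per field (VL)
  unsigned LMUL;        // registers occupied by one field
};

// Assumption: by default only NF=2 is treated as fast, since that is the
// one factor that is fast on both x60 and x280.
inline bool isFastSegment(const SegmentOp &Op, unsigned MaxFastFactor = 2) {
  return Op.Factor >= 2 && Op.Factor <= MaxFastFactor;
}

// Assumed cost shapes (illustrative units, not measured numbers):
//   fast: one wide memory op (1 per register moved) + 1 shuffle-like op
//         per field per register
//   slow: one unit per element accessed
inline std::uint64_t segmentCost(const SegmentOp &Op,
                                 unsigned MaxFastFactor = 2) {
  const std::uint64_t Regs = std::uint64_t(Op.Factor) * Op.LMUL;
  if (isFastSegment(Op, MaxFastFactor))
    return Regs /* wide memory op */ + Regs /* per-field shuffles */;
  return std::uint64_t(Op.Factor) * Op.NumElements;
}
```

In this sketch, a tuning flag for the x60 would amount to passing `MaxFastFactor = 4`, which moves NF=3,4 from the slow formula to the fast one without touching the default.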
https://github.com/llvm/llvm-project/pull/111511