[llvm] [NVPTX] Customize getScalarizationOverhead (PR #128077)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 21 11:59:16 PST 2025
Artem-B wrote:
> I think your example is a bit misleading because it includes the argument passing convention, if we read the values from gmem instead the mov becomes a single SASS instruction: https://godbolt.org/z/rKeGTrj96
Fair enough. It's still not free, as it needs prmt/xmad. I think the cost adjustment here is a wash: we end up with about the same instructions at the SASS level whether we let LLVM construct the vector by element insertion or by logical ops. There's no way around the fact that all registers at the SASS level are 32-bit, so we end up shuffling bits one way or another.
Perhaps I'm missing something? Can you compile the test cases you've added to `v2f16.ll` to PTX, and see what we get for both variants in both PTX and SASS?
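To make the "shuffling bits, one way or another" point concrete, here is a rough host-side C++ sketch of packing two 16-bit lanes into one 32-bit word in both styles. It is only an illustration: the function names (`insert_lane`, `pack_by_insertion`, `pack_by_logical_ops`) are made up for this example, and it says nothing about what the NVPTX backend or ptxas actually emit.

```cpp
#include <cstdint>

// Overwrite one 16-bit lane (0 = low half, 1 = high half) of a 32-bit word.
// This mimics inserting a single element into a two-element 16-bit vector.
// Assumes lane is 0 or 1.
uint32_t insert_lane(uint32_t packed, uint16_t value, unsigned lane) {
    const unsigned shift = lane * 16;
    const uint32_t mask = 0xFFFFu << shift;
    return (packed & ~mask) | (static_cast<uint32_t>(value) << shift);
}

// Variant 1: build the packed word one element at a time.
uint32_t pack_by_insertion(uint16_t lo, uint16_t hi) {
    uint32_t packed = 0;
    packed = insert_lane(packed, lo, 0);
    packed = insert_lane(packed, hi, 1);
    return packed;
}

// Variant 2: build the packed word directly with a shift and an OR.
uint32_t pack_by_logical_ops(uint16_t lo, uint16_t hi) {
    return static_cast<uint32_t>(lo) | (static_cast<uint32_t>(hi) << 16);
}
```

Both variants produce the same 32-bit value and both need mask/shift/or style work to get there, which is the reason to expect similar instruction counts once everything lives in 32-bit registers. Whether that expectation holds for the actual test cases is exactly what the PTX and SASS comparison would show.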
https://github.com/llvm/llvm-project/pull/128077