[llvm] [NVPTX] fold movs into loads and stores (PR #144581)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 17 12:28:38 PDT 2025
================
@@ -138,9 +138,9 @@ define ptx_kernel void @foo13(ptr noalias readonly %from, ptr %to) {
}
; SM20-LABEL: .visible .entry foo14(
-; SM20: ld.global.v4.b16
+; SM20: ld.global.v2.b32
; SM35-LABEL: .visible .entry foo14(
-; SM35: ld.global.nc.v4.b16
+; SM35: ld.global.nc.v2.b32
----------------
Artem-B wrote:
This switch from v4.b16 ld/st to v2.b32 seems to go in the opposite direction from the other test changes, which appear to prefer smaller elements to avoid moves. This test case probably ended up with 32-bit loads and stores because there are no per-element moves in between.
It's probably fine, but it would be nice to see all the generated instructions.
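
For illustration only, here is a minimal Python sketch of why the two forms are interchangeable when nothing touches individual elements. It is not compiler output: the byte values and the helper `unpack_halves` are hypothetical, and the sketch assumes little-endian layout. Both views cover the same 64 bits; recovering the 16-bit elements from the 32-bit view takes extra per-element work, which is the kind of move the other test changes appear to avoid.

```python
import struct

# Eight bytes standing in for the 64 bits the test reads (made-up values).
raw = bytes(range(8))

# View 1: four 16-bit elements, analogous to a v4.b16 access.
as_v4_b16 = struct.unpack("<4H", raw)

# View 2: two 32-bit elements, analogous to a v2.b32 access.
as_v2_b32 = struct.unpack("<2I", raw)


def unpack_halves(word):
    """Split one 32-bit word into its low and high 16-bit halves (hypothetical helper)."""
    return word & 0xFFFF, word >> 16


# Getting the 16-bit elements back out of the 32-bit view needs a mask or
# shift per element; with no per-element use, that cost never appears.
recovered = tuple(half for word in as_v2_b32 for half in unpack_halves(word))
assert recovered == as_v4_b16
```

If the elements are only copied from one pointer to another, either view moves the same bytes, which would be consistent with the wider element type being chosen here.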
https://github.com/llvm/llvm-project/pull/144581