[llvm] [NVPTX] Add TLI hook for load slice cost and implement it (PR #131847)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 24 11:42:11 PDT 2025
================
@@ -0,0 +1,54 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s | FileCheck %s
+
+target triple = "nvptx64-unknown-unknown"
+
+;; Verify that 64-bit loads are not split into more 32-bit
+;; loads. Loads are more expensive than shifts/conversions.
+define float @test(ptr %in) {
+;
+; CHECK-LABEL: test(
+; CHECK: {
+; CHECK-NEXT: .reg .b32 %r<5>;
+; CHECK-NEXT: .reg .f32 %f<8>;
+; CHECK-NEXT: .reg .b64 %rd<4>;
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: ld.param.u64 %rd1, [test_param_0];
+; CHECK-NEXT: ld.u64 %rd2, [%rd1];
+; CHECK-NEXT: ld.u64 %rd3, [%rd1+8];
+; CHECK-NEXT: cvt.u32.u64 %r1, %rd2;
+; CHECK-NEXT: cvt.u32.u64 %r2, %rd3;
+; CHECK-NEXT: mov.b32 %f1, %r1;
+; CHECK-NEXT: mov.b32 %f2, %r2;
+; CHECK-NEXT: add.rn.f32 %f3, %f1, %f2;
+; CHECK-NEXT: { .reg .b32 tmp; mov.b64 {tmp, %r3}, %rd2; }
+; CHECK-NEXT: { .reg .b32 tmp; mov.b64 {tmp, %r4}, %rd3; }
+; CHECK-NEXT: mov.b32 %f4, %r3;
+; CHECK-NEXT: mov.b32 %f5, %r4;
----------------
Artem-B wrote:
Future cleanup opportunity: we seem to be using too many instructions to do effectively nothing at the SASS level. FP operations accept `.b32` registers as inputs, so we can skip the moves between `.b32` and `.f32` registers and just do `add.rn.f32 %f3, %r1, %r2;`. That also allows splitting each 64-bit value with a single `mov` operation.
https://godbolt.org/z/1bb5dMGsa
```
mov.b64 {%r1, %r3}, %rd2;
mov.b64 {%r2, %r4}, %rd3;
add.rn.f32 %f3, %r1, %r2;
add.rn.f32 %f6, %r3, %r4;
add.rn.f32 %f7, %f3, %f6;
```
Or, alternatively, split `.b64 -> {.f32, .f32}`. That would avoid having to deal with FP ops accepting integers as inputs: https://godbolt.org/z/64d9of9d6
```
mov.b64 {%f1, %f3}, %rd2;
mov.b64 {%f2, %f4}, %rd3;
add.rn.f32 %f5, %f1, %f2;
add.rn.f32 %f6, %f3, %f4;
add.rn.f32 %f7, %f5, %f6;
```
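For readers who do not read PTX fluently, here is a rough host-side C++ model of what the sequences above appear to compute. It is only a sketch: the IR body of `test` is not shown in this excerpt, so the function below is inferred from the PTX, and the names `bits_to_float` and `sum_halves` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret 32 raw bits as a float. This is a bit-for-bit move,
// comparable to `mov.b32 %f, %r` in the PTX above.
static float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical model of the test: two 64-bit loads, each unpacked into
// a low and a high 32-bit half (like `mov.b64 {lo, hi}, %rd`), with the
// halves then added as floats.
static float sum_halves(const std::uint64_t in[2]) {
    const std::uint64_t a = in[0];  // models ld.u64 %rd2, [%rd1]
    const std::uint64_t b = in[1];  // models ld.u64 %rd3, [%rd1+8]

    const float lo = bits_to_float(static_cast<std::uint32_t>(a)) +
                     bits_to_float(static_cast<std::uint32_t>(b));
    const float hi = bits_to_float(static_cast<std::uint32_t>(a >> 32)) +
                     bits_to_float(static_cast<std::uint32_t>(b >> 32));
    return lo + hi;
}
```

The point of the model is that both the truncation and the shift are pure bit manipulation on data already in registers, which is why keeping the two 64-bit loads and unpacking afterwards is preferable to issuing four 32-bit loads.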
https://github.com/llvm/llvm-project/pull/131847