[llvm] [NVPTX] Implemented istruncatefree and iszextfree alongwith test cases. (PR #115139)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 6 10:37:49 PST 2024
================
@@ -0,0 +1,54 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -march=nvptx64 < %s | FileCheck %s
+
+; Test for truncation from i64 to i32
+define i32 @test_trunc_i64_to_i32(i64 %val) {
+; CHECK-LABEL: test_trunc_i64_to_i32(
+; CHECK: {
+; CHECK-NEXT: .reg .b32 %r<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: ld.param.u32 %r1, [test_trunc_i64_to_i32_param_0];
+; CHECK-NEXT: st.param.b32 [func_retval0], %r1;
+; CHECK-NEXT: ret;
+ %trunc = trunc i64 %val to i32
+ ret i32 %trunc
+}
+
+; Test for zero-extension from i32 to i64
+define i64 @test_zext_i32_to_i64(i32 %val) {
+; CHECK-LABEL: test_zext_i32_to_i64(
+; CHECK: {
+; CHECK-NEXT: .reg .b64 %rd<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: ld.param.u32 %rd1, [test_zext_i32_to_i64_param_0];
+; CHECK-NEXT: st.param.b64 [func_retval0], %rd1;
+; CHECK-NEXT: ret;
+ %zext = zext i32 %val to i64
+ ret i64 %zext
+}
+
+; Test for operand truncation before select
+define i32 @test_select_truncate_free(i1 %cond, i64 %a, i64 %b) {
+; CHECK-LABEL: test_select_truncate_free(
+; CHECK: {
+; CHECK-NEXT: .reg .pred %p<2>;
+; CHECK-NEXT: .reg .b16 %rs<3>;
+; CHECK-NEXT: .reg .b32 %r<4>;
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: ld.param.u8 %rs1, [test_select_truncate_free_param_0];
+; CHECK-NEXT: and.b16 %rs2, %rs1, 1;
+; CHECK-NEXT: setp.eq.b16 %p1, %rs2, 1;
+; CHECK-NEXT: ld.param.u32 %r1, [test_select_truncate_free_param_1];
+; CHECK-NEXT: ld.param.u32 %r2, [test_select_truncate_free_param_2];
----------------
Artem-B wrote:
We're currently generating exactly the same PTX: https://godbolt.org/z/hh5chfvx1
I suspect this code does not really test the changes in this patch, and LLVM ends up issuing a truncating load regardless of whether truncation is considered to be free. We need a different source of the input values. E.g. do something with the full 64-bit values so the upper bits can't be easily discarded.
E.g. something like this https://godbolt.org/z/M8915M6sr (though it may not be the right example for triggering isTruncateFree)
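To make the suggestion concrete, here is a minimal sketch of a test whose truncated result depends on bits above bit 31 of an input, so a plain 32-bit parameter load cannot stand in for the 64-bit value. The function name and the particular shift/xor mix are hypothetical and are not taken from the godbolt links above. Whether this actually exercises isTruncateFree would still need to be confirmed by comparing llc output with and without the patch, so no CHECK lines are given.

```llvm
; RUN: llc -march=nvptx64 < %s | FileCheck %s

; Hypothetical example: the result uses bits 16..47 of %a, so the
; 64-bit value has to be materialized before the truncation.
define i32 @test_trunc_needs_upper_bits(i64 %a, i64 %b) {
  %shr = lshr i64 %a, 16            ; pull upper bits of %a into the low half
  %mix = xor i64 %shr, %b           ; combine with the second 64-bit input
  %trunc = trunc i64 %mix to i32    ; truncation under test
  ret i32 %trunc
}
```

If this is used as a test, the assertions should be regenerated with utils/update_llc_test_checks.py rather than written by hand.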
https://github.com/llvm/llvm-project/pull/115139