[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: Handle atomic sextload and zextload (PR #111721)

Matt Arsenault via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Oct 9 10:50:58 PDT 2024


================
@@ -0,0 +1,331 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=kaveri < %s | FileCheck -check-prefixes=GCN,GFX7 %s
+; RUN: llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=fiji < %s | FileCheck -check-prefixes=GCN,GFX8 %s
+; RUN: llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefixes=GCN,GFX9 %s
+
+define i8 @atomic_load_flat_monotonic_i8(ptr %ptr) {
+; GCN-LABEL: atomic_load_flat_monotonic_i8:
+; GCN:       ; %bb.0:
+; GCN-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-NEXT:    flat_load_ubyte v0, v[0:1] glc
+; GCN-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GCN-NEXT:    s_setpc_b64 s[30:31]
+  %load = load atomic i8, ptr %ptr monotonic, align 1
+  ret i8 %load
+}
+
+define i32 @atomic_load_flat_monotonic_i8_zext_to_i32(ptr %ptr) {
+; GCN-LABEL: atomic_load_flat_monotonic_i8_zext_to_i32:
+; GCN:       ; %bb.0:
+; GCN-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-NEXT:    flat_load_ubyte v0, v[0:1] glc
+; GCN-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GCN-NEXT:    s_setpc_b64 s[30:31]
+  %load = load atomic i8, ptr %ptr monotonic, align 1
+  %ext = zext i8 %load to i32
+  ret i32 %ext
+}
+
+define i32 @atomic_load_flat_monotonic_i8_sext_to_i32(ptr %ptr) {
+; GFX7-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
+; GFX7:       ; %bb.0:
+; GFX7-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX7-NEXT:    flat_load_sbyte v2, v[0:1] glc
+; GFX7-NEXT:    flat_load_ubyte v0, v[0:1] glc
+; GFX7-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX7-NEXT:    v_mov_b32_e32 v0, v2
+; GFX7-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX8-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
+; GFX8:       ; %bb.0:
+; GFX8-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT:    flat_load_sbyte v2, v[0:1] glc
+; GFX8-NEXT:    flat_load_ubyte v0, v[0:1] glc
+; GFX8-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX8-NEXT:    v_mov_b32_e32 v0, v2
+; GFX8-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    flat_load_sbyte v2, v[0:1] glc
----------------
arsenm wrote:

There's a bug in amdgpu-postlegalizer-combiner, so it's not from the selection 

https://github.com/llvm/llvm-project/pull/111721


More information about the llvm-branch-commits mailing list