[llvm] [AMDGPU][MC] Fix disassembler problem for image_atomic with TFE (PR #112622)
Jun Wang via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 22 17:17:25 PDT 2024
================
@@ -192,10 +271,10 @@
# VI: image_atomic_add v5, v1, s[8:15] dmask:0x2 unorm ; encoding: [0x00,0x12,0x48,0xf0,0x01,0x05,0x02,0x00]
0x00,0x12,0x48,0xf0,0x01,0x05,0x02,0x00
-# VI: image_atomic_add v5, v1, s[8:15] dmask:0x7 unorm ; encoding: [0x00,0x17,0x48,0xf0,0x01,0x05,0x02,0x00]
+# VI: image_atomic_add v[5:7], v1, s[8:15] dmask:0x7 unorm ; encoding: [0x00,0x17,0x48,0xf0,0x01,0x05,0x02,0x00]
0x00,0x17,0x48,0xf0,0x01,0x05,0x02,0x00
-# VI: image_atomic_add v[5:9], v1, s[8:15] dmask:0xf unorm ; encoding: [0x00,0x1f,0x48,0xf0,0x01,0x05,0x02,0x00]
+# VI: image_atomic_add v5, v1, s[8:15] dmask:0xf unorm ; encoding: [0x00,0x1f,0x48,0xf0,0x01,0x05,0x02,0x00]
----------------
jwanggit86 wrote:
Mirko is right. For non-cmpswap atomics, dmask can only be 0x1 or 0x3, so the destination register can be 96 bits at most (dmask=0x3 plus tfe). In the above example, the binary is invalid to begin with (because dmask=0xf), and the disassembly result is also incorrect.
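The width rule can be sketched in a few lines of Python. This is only an illustration, not the disassembler's code: the helper names are hypothetical, and the formula (one dword per set dmask bit, plus one dword when tfe is set) is an assumption inferred from the "dmask=0x3 plus tfe gives 96 bits" figure.

```python
# Hypothetical helpers, not taken from the LLVM sources.

# Per the comment above, non-cmpswap image atomics only accept these dmask values.
VALID_NON_CMPSWAP_DMASKS = (0x1, 0x3)


def is_valid_non_cmpswap_dmask(dmask: int) -> bool:
    """Return True if dmask is one of the values allowed for non-cmpswap atomics."""
    return dmask in VALID_NON_CMPSWAP_DMASKS


def dst_width_bits(dmask: int, tfe: bool) -> int:
    """Assumed destination width: 32 bits per set dmask bit, plus 32 bits for tfe."""
    dwords = bin(dmask & 0xF).count("1") + (1 if tfe else 0)
    return dwords * 32


if __name__ == "__main__":
    for dmask in (0x1, 0x3, 0x7, 0xF):
        print(hex(dmask),
              "valid" if is_valid_non_cmpswap_dmask(dmask) else "invalid",
              dst_width_bits(dmask, tfe=False),
              dst_width_bits(dmask, tfe=True))
```

Under this assumption, dmask=0x7 without tfe maps to three dwords, which is consistent with the v[5:7] operand in the updated check line, while dmask=0xf is rejected as an invalid encoding for a non-cmpswap atomic.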
https://github.com/llvm/llvm-project/pull/112622