[llvm] [AMDGPU][MC] Fix disassemble of image_gather4 with d16 (PR #114609)
Mirko Brkušanin via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 5 02:54:04 PST 2024
================
@@ -235,5 +235,5 @@
# VI: image_gather4 v[252:255], v1, s[8:15], s[12:15] dmask:0x3 ; encoding: [0x00,0x03,0x00,0xf1,0x01,0xfc,0x62,0x00]
0x00,0x03,0x00,0xf1,0x01,0xfc,0x62,0x00
-# VI: image_gather4 v[252:255], v1, s[8:15], s[12:15] dmask:0x1 unorm glc slc tfe lwe da ; encoding: [0x00,0x71,0x03,0xf3,0x01,0xfc,0x62,0x00]
+# VI: image_gather4 v[252:253], v1, s[8:15], s[12:15] dmask:0x1 unorm glc slc tfe lwe da ; encoding: [0x00,0x71,0x03,0xf3,0x01,0xfc,0x62,0x00]
----------------
mbrkusanin wrote:
This currently matches sp3. Encoding it fails, as it does in sp3.
Looking into it further, it seems that our behavior is inconsistent.
We do not decode a VOP instruction when a register is out of bounds (gfx1010 example):
```
# v_add_f64 v[255:256], v[1:2], v[2:3]
0xff,0x00,0x64,0xd5,0x01,0x05,0x02,0x00
```
But we do decode MIMG (gfx1010 example):
```
# image_load v65, v[253:256], s[0:7] dmask:0x8 dim:SQ_RSRC_IMG_2D_MSAA_ARRAY unorm
0x38,0x18,0x00,0xf0,0xfd,0x41,0x00,0x00
```
except that LLVM prints just v253 for the address operand. sp3 always decodes to the instructions shown above.
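To make the out-of-bounds condition concrete, here is a minimal Python sketch of the check being discussed. It is not LLVM code: the helper names are hypothetical, and the field positions (VADDR in byte 4, VDATA in byte 5 of the 8-byte MIMG encoding) are an assumption that happens to agree with the two MIMG examples in this thread. The register counts are taken from the disassembly comments above, not computed from dmask or dim.

```python
# Hypothetical helpers for illustration only; none of this is LLVM's API.
MAX_VGPR = 255  # highest VGPR index an 8-bit register field can name


def mimg_vaddr_vdata(encoding):
    """Return the (vaddr, vdata) base registers of an 8-byte MIMG encoding.

    Assumption: VADDR is byte 4 and VDATA is byte 5, which is consistent
    with the image_load and image_gather4 examples in this thread.
    """
    if len(encoding) != 8:
        raise ValueError("expected an 8-byte MIMG encoding")
    return encoding[4], encoding[5]


def vgpr_range(base, count):
    """Return (first, last, in_bounds) for `count` consecutive VGPRs."""
    last = base + count - 1
    return base, last, last <= MAX_VGPR


# The gfx1010 image_load example from above.
image_load = [0x38, 0x18, 0x00, 0xF0, 0xFD, 0x41, 0x00, 0x00]
vaddr, vdata = mimg_vaddr_vdata(image_load)

# Four address registers, as in the v[253:256] operand shown above.
print(vgpr_range(vaddr, 4))
# Two destination registers, as in v_add_f64 v[255:256].
print(vgpr_range(255, 2))
```

Both ranges end at register 256, so both are out of bounds by the same rule; the inconsistency is only in what the disassembler does once it sees that.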
https://github.com/llvm/llvm-project/pull/114609