[PATCH] D121794: [AMDGPU][MachineVerifier] Alignment check for fp32 packed math instructions
Christudasan Devadasan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 16 05:49:12 PDT 2022
cdevadas created this revision.
cdevadas added reviewers: rampitec, arsenm.
Herald added subscribers: foad, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
Herald added a project: All.
cdevadas requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
The fp32 packed math instructions are introduced in gfx90a.
If their vector register operands are not properly aligned, the
verifier should flag them. Currently, the verifier failed to
report it and the compiler ended up emitting a broken assembly.
This patch fixes that missed case in `TII::verifyInstruction`.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D121794
Files:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/test/CodeGen/AMDGPU/verify-gfx90a-aligned-vgprs.mir
Index: llvm/test/CodeGen/AMDGPU/verify-gfx90a-aligned-vgprs.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/verify-gfx90a-aligned-vgprs.mir
+++ llvm/test/CodeGen/AMDGPU/verify-gfx90a-aligned-vgprs.mir
@@ -109,6 +109,35 @@
%11:areg_128_align2 = IMPLICIT_DEF
DS_WRITE_B64_gfx9 %9, %10, 0, 0, implicit $exec
DS_WRITE_B64_gfx9 %9, %11.sub1_sub2, 0, 0, implicit $exec
+
+ ; Check aligned vgprs for FP32 Packed Math instructions.
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ ; CHECK: *** Bad machine code: Subtarget requires even aligned vector registers ***
+ %12:vreg_64 = IMPLICIT_DEF
+ %13:vreg_64_align2 = IMPLICIT_DEF
+ %14:areg_96_align2 = IMPLICIT_DEF
+ $vgpr3_vgpr4 = V_PK_MOV_B32 8, 0, 8, 0, 0, 0, 0, 0, 0, implicit $exec
+ $vgpr0_vgpr1 = V_PK_ADD_F32 0, %12, 11, %13, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_ADD_F32 0, %13, 11, %12, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_ADD_F32 0, %13, 11, %14.sub1_sub2, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_ADD_F32 0, %14.sub1_sub2, 11, %13, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_MUL_F32 0, %12, 11, %13, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_MUL_F32 0, %13, 11, %12, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_MUL_F32 0, %13, 11, %14.sub1_sub2, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = V_PK_MUL_F32 0, %14.sub1_sub2, 11, %13, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = nofpexcept V_PK_FMA_F32 8, %12, 8, %13, 11, %14.sub0_sub1, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = nofpexcept V_PK_FMA_F32 8, %13, 8, %12, 11, %14.sub0_sub1, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
+ $vgpr0_vgpr1 = nofpexcept V_PK_FMA_F32 8, %13, 8, %13, 11, %14.sub1_sub2, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
...
# FIXME: Inline asm is not verified
Index: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -4051,6 +4051,7 @@
case AMDGPU::OPERAND_REG_IMM_INT32:
case AMDGPU::OPERAND_REG_IMM_FP32:
case AMDGPU::OPERAND_REG_IMM_FP32_DEFERRED:
+ case AMDGPU::OPERAND_REG_IMM_V2FP32:
break;
case AMDGPU::OPERAND_REG_INLINE_C_INT32:
case AMDGPU::OPERAND_REG_INLINE_C_FP32:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D121794.415791.patch
Type: text/x-patch
Size: 3474 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220316/4b19524c/attachment.bin>
More information about the llvm-commits
mailing list