[PATCH] D61428: AMDGPU: Fix incorrect commute with sub when folding immediates
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu May 2 03:50:46 PDT 2019
arsenm created this revision.
arsenm added a reviewer: rampitec.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.
When a fold of an immediate into a sub/subrev required shrinking the
instruction, the wrong VOP2 opcode was used. This was using the VOP2
equivalent of the original instruction, not the commuted instruction
with the inverted opcode.
https://reviews.llvm.org/D61428
Files:
lib/Target/AMDGPU/SIFoldOperands.cpp
test/CodeGen/AMDGPU/fold-immediate-operand-shrink.mir
Index: test/CodeGen/AMDGPU/fold-immediate-operand-shrink.mir
===================================================================
--- test/CodeGen/AMDGPU/fold-immediate-operand-shrink.mir
+++ test/CodeGen/AMDGPU/fold-immediate-operand-shrink.mir
@@ -250,8 +250,8 @@
; GCN-LABEL: name: shrink_scalar_imm_vgpr_v_sub_i32_e64_no_carry_out_use
; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 12345
; GCN: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
- ; GCN: [[V_SUBREV_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUBREV_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
- ; GCN: S_ENDPGM 0, implicit [[V_SUBREV_I32_e32_]]
+ ; GCN: [[V_SUB_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUB_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
+ ; GCN: S_ENDPGM 0, implicit [[V_SUB_I32_e32_]]
%0:sreg_32_xm0 = S_MOV_B32 12345
%1:vgpr_32 = IMPLICIT_DEF
%2:vgpr_32, %3:sreg_64 = V_SUB_I32_e64 %0, %1, 0, implicit $exec
@@ -269,8 +269,8 @@
; GCN-LABEL: name: shrink_vgpr_scalar_imm_v_sub_i32_e64_no_carry_out_use
; GCN: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 12345
- ; GCN: [[V_SUB_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUB_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
- ; GCN: S_ENDPGM 0, implicit [[V_SUB_I32_e32_]]
+ ; GCN: [[V_SUBREV_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUBREV_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
+ ; GCN: S_ENDPGM 0, implicit [[V_SUBREV_I32_e32_]]
%0:vgpr_32 = IMPLICIT_DEF
%1:sreg_32_xm0 = S_MOV_B32 12345
%2:vgpr_32, %3:sreg_64 = V_SUB_I32_e64 %0, %1, 0, implicit $exec
@@ -288,8 +288,8 @@
; GCN-LABEL: name: shrink_scalar_imm_vgpr_v_subrev_i32_e64_no_carry_out_use
; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 12345
; GCN: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
- ; GCN: [[V_SUB_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUB_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
- ; GCN: S_ENDPGM 0, implicit [[V_SUB_I32_e32_]]
+ ; GCN: [[V_SUBREV_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUBREV_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
+ ; GCN: S_ENDPGM 0, implicit [[V_SUBREV_I32_e32_]]
%0:sreg_32_xm0 = S_MOV_B32 12345
%1:vgpr_32 = IMPLICIT_DEF
%2:vgpr_32, %3:sreg_64 = V_SUBREV_I32_e64 %0, %1, 0, implicit $exec
@@ -307,8 +307,8 @@
; GCN-LABEL: name: shrink_vgpr_scalar_imm_v_subrev_i32_e64_no_carry_out_use
; GCN: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 12345
- ; GCN: [[V_SUBREV_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUBREV_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
- ; GCN: S_ENDPGM 0, implicit [[V_SUBREV_I32_e32_]]
+ ; GCN: [[V_SUB_I32_e32_:%[0-9]+]]:vgpr_32 = V_SUB_I32_e32 [[S_MOV_B32_]], [[DEF]], implicit-def $vcc, implicit $exec
+ ; GCN: S_ENDPGM 0, implicit [[V_SUB_I32_e32_]]
%0:vgpr_32 = IMPLICIT_DEF
%1:sreg_32_xm0 = S_MOV_B32 12345
%2:vgpr_32, %3:sreg_64 = V_SUBREV_I32_e64 %0, %1, 0, implicit $exec
Index: lib/Target/AMDGPU/SIFoldOperands.cpp
===================================================================
--- lib/Target/AMDGPU/SIFoldOperands.cpp
+++ lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -369,7 +369,10 @@
assert(MI->getOperand(1).isDef());
- int Op32 = AMDGPU::getVOPe32(Opc);
+ // Make sure to get the 32-bit version of the commuted opcode.
+ unsigned MaybeCommutedOpc = MI->getOpcode();
+ int Op32 = AMDGPU::getVOPe32(MaybeCommutedOpc);
+
FoldList.push_back(FoldCandidate(MI, CommuteOpNo, OpToFold, true,
Op32));
return true;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D61428.197738.patch
Type: text/x-patch
Size: 3737 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190502/ad0a1c15/attachment.bin>
More information about the llvm-commits
mailing list