[PATCH] D37522: [AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 6 14:40:30 PDT 2017


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1325
   (fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
-  (V_PK_MUL_F16 SRCMODS.OP_SEL_1, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
+  (V_PK_MUL_F16 0, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
 >;
----------------
rampitec wrote:
> arsenm wrote:
> > Since there isn't really a V2FP16_ONE inline immediate, this should be changed to just FP16_ONE
> There is another bug which occurs if I use a scalar constant directly; it needs to be fixed before we can change the constant:
> 
> ```
> # After Instruction Selection
> # Machine code for function s_test_canonicalize_var_v2f16: IsSSA, TracksLiveness
> Function Live Ins: %SGPR0_SGPR1 in %vreg1
> 
> BB#0: derived from LLVM BB %0
>     Live Ins: %SGPR0_SGPR1
>         %vreg1<def> = COPY %SGPR0_SGPR1; SGPR_64:%vreg1
>         %vreg4<def> = S_LOAD_DWORDX2_IMM %vreg1, 36, 0; mem:LD8[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_64_XEXEC:%vreg4 SGPR_64:%vreg1
>         %vreg5<def> = S_LOAD_DWORD_IMM %vreg1, 44, 0; mem:LD4[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_32_XM0_XEXEC:%vreg5 SGPR_64:%vreg1
>         %vreg6<def> = COPY %vreg4:sub1; SReg_32_XM0:%vreg6 SReg_64_XEXEC:%vreg4
>         %vreg7<def> = COPY %vreg4:sub0; SReg_32_XM0:%vreg7 SReg_64_XEXEC:%vreg4
>         %vreg8<def> = S_MOV_B32 61440; SReg_32_XM0:%vreg8
>         %vreg9<def> = S_MOV_B32 -1; SReg_32_XM0:%vreg9
>         %vreg10<def> = REG_SEQUENCE %vreg7<kill>, sub0, %vreg6<kill>, sub1, %vreg9<kill>, sub2, %vreg8<kill>, sub3; SReg_128:%vreg10 SReg_32_XM0:%vreg7,%vreg6,%vreg9,%vreg8
>         %vreg11<def> = V_PK_MUL_F16 0, 15360, 8, %vreg5<kill>, 0, 0, 0, 0, 0, %EXEC<imp-use>; VGPR_32:%vreg11 SReg_32_XM0_XEXEC:%vreg5
>         BUFFER_STORE_DWORD_OFFSET %vreg11<kill>, %vreg10<kill>, 0, 0, 0, 0, 0, %EXEC<imp-use>; mem:ST4[%out(addrspace=1)] VGPR_32:%vreg11 SReg_128:%vreg10
>         S_ENDPGM
> 
> # End machine code for function s_test_canonicalize_var_v2f16.
> 
> *** Bad machine code: VOP* instruction uses the constant bus more than once ***
> - function:    s_test_canonicalize_var_v2f16
> - basic block: BB#0  (0x7a5f658)
> - instruction: %vreg11<def> = V_PK_MUL_F16
> LLVM ERROR: Found 1 machine code errors.
> ```
I am fixing the legalization now. Interestingly, though, it is better for this constant to be v2, because then it can be inlined. That is also why we had no problems with the legalization before.
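
To illustrate why the packed constant avoids the verifier error above, here is a rough Python model of the constant bus count. It is only a sketch, not LLVM's implementation: the function names and operand encoding are hypothetical, the inline fp16 table is abbreviated (small integers and 1/(2*pi) are left out), and it assumes a packed 32-bit constant is inlinable only when both 16-bit halves are equal and inlinable as fp16. Under those assumptions, 15360 (0x3C00, fp16 1.0 in the low half only) counts as a literal and takes the constant bus alongside the SGPR operand, while 0x3C003C00 does not.

```python
# Simplified model of the "constant bus used more than once" check.
# ASSUMPTION: names, operand tuples, and the inline table are illustrative only.

# Bit patterns of fp16 values assumed to be inline constants:
# 0.0, +-0.5, +-1.0, +-2.0, +-4.0 (abbreviated list).
INLINE_FP16_BITS = {
    0x0000, 0x3800, 0xB800, 0x3C00, 0xBC00,
    0x4000, 0xC000, 0x4400, 0xC400,
}


def is_inlinable_v2f16(bits32):
    """ASSUMPTION: inlinable only if both halves match and are inline fp16."""
    lo = bits32 & 0xFFFF
    hi = (bits32 >> 16) & 0xFFFF
    return lo == hi and lo in INLINE_FP16_BITS


def constant_bus_uses(operands):
    """Count constant bus uses for operands given as (kind, value) tuples.

    kind is "sgpr", "vgpr", or "imm". Each distinct SGPR counts once,
    and each non-inlinable immediate (a literal) counts once.
    """
    sgprs = set()
    literals = 0
    for kind, value in operands:
        if kind == "sgpr":
            sgprs.add(value)
        elif kind == "imm" and not is_inlinable_v2f16(value):
            literals += 1
    return len(sgprs) + literals


# Scalar fp16 1.0 (15360 == 0x3C00) plus an SGPR source: two uses.
scalar_one = constant_bus_uses([("imm", 0x3C00), ("sgpr", "vreg5")])
# Packed <1.0, 1.0> (0x3C003C00) plus the same SGPR source: one use.
packed_one = constant_bus_uses([("imm", 0x3C003C00), ("sgpr", "vreg5")])
```

In this model `scalar_one` exceeds the single allowed constant bus use, matching the "Bad machine code" report, and `packed_one` stays within it.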


Repository:
  rL LLVM

https://reviews.llvm.org/D37522




