[PATCH] D37522: [AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 6 11:29:28 PDT 2017


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1325
   (fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
-  (V_PK_MUL_F16 SRCMODS.OP_SEL_1, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
+  (V_PK_MUL_F16 0, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
 >;
----------------
arsenm wrote:
> Since there isn't really a V2FP16_ONE inline immediate, this should be changed to just FP16_ONE
There is another bug which occurs if I use scalar constant directly, it needs to be fixed before we can change the constant:

```
# After Instruction Selection
# Machine code for function s_test_canonicalize_var_v2f16: IsSSA, TracksLiveness
Function Live Ins: %SGPR0_SGPR1 in %vreg1

BB#0: derived from LLVM BB %0
    Live Ins: %SGPR0_SGPR1
        %vreg1<def> = COPY %SGPR0_SGPR1; SGPR_64:%vreg1
        %vreg4<def> = S_LOAD_DWORDX2_IMM %vreg1, 36, 0; mem:LD8[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_64_XEXEC:%vreg4 SGPR_64:%vreg1
        %vreg5<def> = S_LOAD_DWORD_IMM %vreg1, 44, 0; mem:LD4[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_32_XM0_XEXEC:%vreg5 SGPR_64:%vreg1
        %vreg6<def> = COPY %vreg4:sub1; SReg_32_XM0:%vreg6 SReg_64_XEXEC:%vreg4
        %vreg7<def> = COPY %vreg4:sub0; SReg_32_XM0:%vreg7 SReg_64_XEXEC:%vreg4
        %vreg8<def> = S_MOV_B32 61440; SReg_32_XM0:%vreg8
        %vreg9<def> = S_MOV_B32 -1; SReg_32_XM0:%vreg9
        %vreg10<def> = REG_SEQUENCE %vreg7<kill>, sub0, %vreg6<kill>, sub1, %vreg9<kill>, sub2, %vreg8<kill>, sub3; SReg_128:%vreg10 SReg_32_XM0:%vreg7,%vreg6,%vreg9,%vreg8
        %vreg11<def> = V_PK_MUL_F16 0, 15360, 8, %vreg5<kill>, 0, 0, 0, 0, 0, %EXEC<imp-use>; VGPR_32:%vreg11 SReg_32_XM0_XEXEC:%vreg5
        BUFFER_STORE_DWORD_OFFSET %vreg11<kill>, %vreg10<kill>, 0, 0, 0, 0, 0, %EXEC<imp-use>; mem:ST4[%out(addrspace=1)] VGPR_32:%vreg11 SReg_128:%vreg10
        S_ENDPGM

# End machine code for function s_test_canonicalize_var_v2f16.

*** Bad machine code: VOP* instruction uses the constant bus more than once ***
- function:    s_test_canonicalize_var_v2f16
- basic block: BB#0  (0x7a5f658)
- instruction: %vreg11<def> = V_PK_MUL_F16
LLVM ERROR: Found 1 machine code errors.
```


https://reviews.llvm.org/D37522





More information about the llvm-commits mailing list