[llvm-bugs] [Bug 45660] New: [AMDGPU][MC][GFX9+] Encoding of op_sel_hi for VOP3P inline constants does not match sp3

via llvm-bugs llvm-bugs at lists.llvm.org
Fri Apr 24 03:59:03 PDT 2020


https://bugs.llvm.org/show_bug.cgi?id=45660

            Bug ID: 45660
           Summary: [AMDGPU][MC][GFX9+] Encoding of op_sel_hi for VOP3P
                    inline constants does not match sp3
           Product: libraries
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: AMDGPU
          Assignee: unassignedbugs at nondot.org
          Reporter: dpreobrazhensky at luxoft.com
                CC: llvm-bugs at lists.llvm.org

This issue has recently been found by Ilya Perminov.

In short, sp3 and llvm encode op_sel_hi for VOP3P inline constants differently.

For example, sp3 generates the same code for the following lines (gfx9):

  v_pk_fma_f16  v0, v1, -1.0, v3                   // D38E4000 0C0DE701
  v_pk_fma_f16  v0, v1, -1.0, v3 op_sel_hi:[1,1,1] // D38E4000 0C0DE701
  v_pk_fma_f16  v0, v1, -1.0, v3 op_sel_hi:[1,0,1] // D38E4000 0C0DE701

In other words, it silently encodes op_sel_hi:[1,0,1] regardless of what has
been specified in the code. All 3 cases result in the following computations:

  v0.hi = v1.hi * -1.0 + v3.hi
  v0.lo = v1.lo * -1.0 + v3.lo

In contrast, llvm encodes op_sel_hi exactly as specified in the code. When
op_sel_hi is omitted, llvm encodes it as [1,1,1], so the high 16 bits of the
-1.0 inline constant are selected when computing the high half of the result.

  v_pk_fma_f16 v0, v1, -1.0, v3                   ; [0x00,0x40,0x8e,0xd3,0x01,0xe7,0x0d,0x1c]
  v_pk_fma_f16 v0, v1, -1.0, v3 op_sel_hi:[1,1,1] ; [0x00,0x40,0x8e,0xd3,0x01,0xe7,0x0d,0x1c]
  v_pk_fma_f16 v0, v1, -1.0, v3 op_sel_hi:[1,0,1] ; [0x00,0x40,0x8e,0xd3,0x01,0xe7,0x0d,0x0c]
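
The only difference between these encodings is one bit in the last byte (0x1c
vs 0x0c). A minimal Python sketch of how that bit maps to op_sel_hi follows.
The bit positions (op_sel_hi[0] at bit 59, op_sel_hi[1] at bit 60, op_sel_hi[2]
at bit 14 of the 64-bit instruction word) are an assumption taken from my
reading of the VOP3P field layout; they are consistent with the bytes quoted
above but should be checked against the ISA document. The helper name
decode_op_sel_hi is made up for illustration.

```python
def decode_op_sel_hi(code):
    """Return [op_sel_hi[0], op_sel_hi[1], op_sel_hi[2]] for an 8-byte VOP3P encoding.

    Assumed layout (not confirmed by the bug report itself):
    op_sel_hi[0] = bit 59, op_sel_hi[1] = bit 60, op_sel_hi[2] = bit 14.
    """
    word = int.from_bytes(code, "little")
    return [(word >> 59) & 1, (word >> 60) & 1, (word >> 14) & 1]


# Byte sequences copied from the llvm output quoted in this report.
llvm_default = bytes([0x00, 0x40, 0x8E, 0xD3, 0x01, 0xE7, 0x0D, 0x1C])
sp3_default = bytes([0x00, 0x40, 0x8E, 0xD3, 0x01, 0xE7, 0x0D, 0x0C])

print(decode_op_sel_hi(llvm_default))  # [1, 1, 1]
print(decode_op_sel_hi(sp3_default))   # [1, 0, 1]
```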

The last case produces the same code that sp3 does. The first two cases result
in computations different from sp3:

  v0.hi = v1.hi *  0.0 + v3.hi // Probably not what was intended?
  v0.lo = v1.lo * -1.0 + v3.lo
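
To make the difference concrete, here is a rough Python model of the two
computations. It assumes that the -1.0 inline constant is seen by the
instruction as a 32-bit value with the f16 pattern 0xBC00 in the low half and
zeros in the high half, which is what the 0.0 above implies. The function
names and the sample register values are invented for the example, and the
arithmetic is ordinary floating point rather than a bit-exact fused
multiply-add.

```python
import struct


def f16_bits_to_float(bits):
    """Interpret a 16-bit pattern as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def float_to_f16_bits(value):
    """Encode a Python float as a 16-bit IEEE half-precision pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def pack(lo, hi):
    """Build a 32-bit register value holding two f16 halves."""
    return float_to_f16_bits(lo) | (float_to_f16_bits(hi) << 16)


def half(word, select_hi):
    """Pick the high or low f16 half of a 32-bit value."""
    bits = (word >> 16) & 0xFFFF if select_hi else word & 0xFFFF
    return f16_bits_to_float(bits)


# Assumption: f16 -1.0 (0xBC00) in the low half, zero in the high half.
INLINE_NEG_ONE = 0x0000BC00


def pk_fma_f16(src0, src1, src2, op_sel_hi):
    """Model v_pk_fma_f16 with op_sel left at its default of [0,0,0]."""
    lo = half(src0, 0) * half(src1, 0) + half(src2, 0)
    hi = (half(src0, op_sel_hi[0]) * half(src1, op_sel_hi[1])
          + half(src2, op_sel_hi[2]))
    return lo, hi


v1 = pack(2.0, 3.0)    # v1.lo = 2.0, v1.hi = 3.0 (sample values)
v3 = pack(10.0, 20.0)  # v3.lo = 10.0, v3.hi = 20.0 (sample values)

print(pk_fma_f16(v1, INLINE_NEG_ONE, v3, [1, 1, 1]))  # (8.0, 20.0)
print(pk_fma_f16(v1, INLINE_NEG_ONE, v3, [1, 0, 1]))  # (8.0, 17.0)
```

With op_sel_hi:[1,1,1] the high half of the result ignores v1.hi entirely,
because it is multiplied by the zero high half of the constant.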

Note that sp3 uses a different op_sel syntax for gfx10, but this syntax does
not allow specifying which bits are selected for inline constants. The sp3
assembler always selects the low bits for these.

These differences between sp3 and llvm will confuse assembler users. We could
change the llvm assembler to make it behave like sp3 (at least when op_sel_hi
is omitted).
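
As an illustration only, the sp3 behaviour amounts to clearing the op_sel_hi
bit of any source operand that is an inline constant. The Python sketch below
applies that to an already encoded instruction. It is not the LLVM
implementation; force_low_half and OP_SEL_HI_BIT are hypothetical names, the
bit positions are the same assumed VOP3P layout (bits 59, 60 and 14), and the
caller is expected to know which operand is an inline constant.

```python
# Assumed position of op_sel_hi[i] in the 64-bit VOP3P instruction word.
OP_SEL_HI_BIT = {0: 59, 1: 60, 2: 14}


def force_low_half(code, src_index):
    """Clear op_sel_hi for one source operand of an 8-byte VOP3P encoding.

    The function does not check that the operand really is an inline
    constant; that decision is left to the caller.
    """
    word = int.from_bytes(code, "little")
    word &= ~(1 << OP_SEL_HI_BIT[src_index])
    return word.to_bytes(8, "little")


# llvm output for "v_pk_fma_f16 v0, v1, -1.0, v3" as quoted in this report.
llvm_code = bytes([0x00, 0x40, 0x8E, 0xD3, 0x01, 0xE7, 0x0D, 0x1C])

# src1 (index 1) is the -1.0 inline constant.
patched = force_low_half(llvm_code, 1)
print([hex(b) for b in patched])
# ['0x0', '0x40', '0x8e', '0xd3', '0x1', '0xe7', '0xd', '0xc']
```

The patched bytes match the sp3 encoding D38E4000 0C0DE701.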

What do you think?
