[llvm-bugs] [Bug 37744] New: [AMDGPU] Potential exec mask issue

via llvm-bugs llvm-bugs at lists.llvm.org
Fri Jun 8 02:24:24 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=37744

            Bug ID: 37744
           Summary: [AMDGPU] Potential exec mask issue
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: AMDGPU
          Assignee: unassignedbugs at nondot.org
          Reporter: samuel.pitoiset at gmail.com
                CC: llvm-bugs at lists.llvm.org

Created attachment 20411
  --> https://bugs.llvm.org/attachment.cgi?id=20411&action=edit
small LLVM IR testcase

Hi,

The AMDGPU backend seems to be affected by an exec mask issue which ends up
hanging a bunch of games with RADV and DXVK [1]. I do have a mesa workaround
[2] that I'm going to push, but the right thing to do is to fix LLVM of
course. The workaround would still be needed for LLVM < 7, though.

I attached a testcase called small-testcase.ll. Basically, the code does
something like:

if x == 0 && y == 0
  result = 0
else
  result = 0x3FAF48604

It first checks both x and y, then x and finally y (yeah, that's dumb but the
LLVM IR is correct). It does return the expected result when x == 0, but it
fails when x != 0.
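
To make the expected behaviour concrete, here is a small Python sketch. It is
only one possible reading of the description above, not a translation of the
attached small-testcase.ll: the function names and the exact nesting of the
redundant checks are assumptions made for illustration. The constant is copied
verbatim from the pseudocode.

```python
# Constant copied verbatim from the report's pseudocode.
NONZERO_RESULT = 0x3FAF48604


def simple(x: int, y: int) -> int:
    """The behaviour the report expects from the shader."""
    return 0 if (x == 0 and y == 0) else NONZERO_RESULT


def redundant(x: int, y: int) -> int:
    """Hypothetical shape with redundant checks: both, then x, then y."""
    if x == 0 and y == 0:  # first check: both operands
        return 0
    if x == 0:  # second check: x alone
        if y == 0:  # third check: y alone (cannot be true at this point)
            return 0
    return NONZERO_RESULT
```

Both functions agree for every input, which is the sense in which the
redundant checks are "dumb but correct": the failure for x != 0 has to come
from the generated code, not from the logic itself.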

If we look at the assembly after building with llc there is something
interesting.

In BB0_3, we check whether x == 0 and y == 0. If that's true we jump to BB0_7,
otherwise we end up in BB0_4. In this block we check x: if x == 0, we execute
the instructions in BB0_5 (because EXEC != 0). When x != 0, these instructions
are implicitly "skipped", except s_or_b64, which sets s[10:11] to 0 because
VCC is 0. But VCC was 1 in BB0_3, hmm.

Later on, in BB0_7 there is:
        s_mov_b64 s[6:7], s[10:11]

But s[10:11] is set in two blocks (BB0_3 and BB0_5), which looks weird. Well,
it's a PHI of SI_IF_BREAK and we have two breaks in that loop, maybe related?

One solution is to *explicitly* skip BB0_5 when EXEC == 0, that way we don't
overwrite s[10:11]. I'm really not sure if that's the best option but this is
what my workaround does, and it works.
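
The difference between the implicit and the explicit skip can be illustrated
with a toy model in Python. This is a sketch under stated assumptions, not the
actual ISA semantics of the testcase: masks are plain integers, the function
and parameter names are made up, the vector compare is assumed to write 0 for
inactive lanes (consistent with "VCC is 0" above), and the other operand of
the scalar OR is assumed to hold 0.

```python
def run_bb0_5(exec_mask: int, break_mask: int, lane_cond: int,
              explicit_skip: bool) -> int:
    """Toy model of a BB0_5-like block; returns the new break mask (s[10:11])."""
    if explicit_skip and exec_mask == 0:
        # Models a branch over the whole block when no lane is active,
        # so the scalar write below never happens.
        return break_mask
    # Vector compare: only active lanes can set their bit, so VCC is 0
    # whenever EXEC is 0 (assumption of this model).
    vcc = lane_cond & exec_mask
    # Scalar OR: executes regardless of EXEC and overwrites the break mask.
    # The second operand is assumed to be 0 here.
    other = 0
    return vcc | other


# Break mask as it would have been set in BB0_3 (one lane wants to break).
MASK_FROM_BB0_3 = 0b1

# x != 0: no lane is active when BB0_5 is reached.
clobbered = run_bb0_5(exec_mask=0, break_mask=MASK_FROM_BB0_3,
                      lane_cond=0b1, explicit_skip=False)
preserved = run_bb0_5(exec_mask=0, break_mask=MASK_FROM_BB0_3,
                      lane_cond=0b1, explicit_skip=True)
```

In this model `clobbered` ends up as 0 while `preserved` keeps the value from
BB0_3. When at least one lane is active, both variants behave identically, so
the explicit skip only changes the EXEC == 0 case.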

Anyway, SIInsertSkips looks completely broken. We might want to insert the
skips in SILowerControlFlow and optimize later on.

I'm using the latest LLVM trunk.

Any thoughts?

[1] https://github.com/doitsujin/dxvk/issues/252#issuecomment-395527247
[2] https://bugs.freedesktop.org/attachment.cgi?id=140068
