[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 28 04:24:41 PST 2017


nhaehnle created this revision.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, mgorny, wdng, kzhuravl.

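When an i1-value is defined inside a loop and used outside of it, we
cannot simply use the SGPR bitmask from the loop's last iteration: with a
non-uniform exit, lanes that left the loop in an earlier iteration are not
covered by that mask.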

There are also useful and correct cases of an i1-value being copied between
basic blocks, e.g. when a condition is computed outside of a loop and used
inside it. The concept of dominators is not sufficient to capture what is
going on, so I propose the notion of "lane-dominators".
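
To illustrate the problem described above, here is a small standalone
simulation. It is only a sketch and not code from the patch: the name
simulateLoop, the LoopResult struct, the 32-lane limit, and the choice of
"iteration index is even" as the i1-value are all made up for illustration.
The model assumes that a compare writes 0 into the mask bits of inactive
lanes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a small wavefront; not the actual LLVM code.
struct LoopResult {
  uint32_t lastIterationMask; // mask as written by the final loop iteration
  uint32_t mergedMask;        // per-lane value captured when each lane exits
};

// Lane i runs the loop tripCounts[i] times (at most 32 lanes). In every
// iteration, each active lane computes an i1-value; here the value is simply
// "the iteration index is even".
LoopResult simulateLoop(const std::vector<int> &tripCounts) {
  uint32_t exec = 0; // lanes still inside the loop
  for (std::size_t lane = 0; lane < tripCounts.size(); ++lane)
    if (tripCounts[lane] > 0)
      exec |= 1u << lane;

  uint32_t last = 0;
  uint32_t merged = 0;
  for (int iter = 0; exec != 0; ++iter) {
    // Assumption of this model: the compare writes 0 for inactive lanes.
    uint32_t cond = 0;
    uint32_t exiting = 0;
    for (std::size_t lane = 0; lane < tripCounts.size(); ++lane) {
      const uint32_t bit = 1u << lane;
      if (!(exec & bit))
        continue;
      if (iter % 2 == 0)
        cond |= bit;
      if (tripCounts[lane] == iter + 1)
        exiting |= bit; // this lane leaves the loop after this iteration
    }
    last = cond;
    // Keep the bits of lanes that exited earlier and take the current
    // value only for the lanes that exit now.
    merged = (merged & ~exiting) | (cond & exiting);
    exec &= ~exiting;
  }
  return {last, merged};
}
```

With trip counts {1, 3, 2, 3}, the final iteration's mask in this model has
bits only for lanes 1 and 3, whereas lane 0 also saw a true value in the
iteration in which it exited. Merging at each exit keeps lane 0's bit.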

Fixes a bug encountered in Nier: Automata.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743
Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a


https://reviews.llvm.org/D40547

Files:
  lib/Target/AMDGPU/SILowerI1Copies.cpp
  lib/Target/AMDGPU/Utils/AMDGPULaneDominator.cpp
  lib/Target/AMDGPU/Utils/AMDGPULaneDominator.h
  lib/Target/AMDGPU/Utils/CMakeLists.txt
  test/CodeGen/AMDGPU/i1-copy-from-loop.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D40547.124543.patch
Type: text/x-patch
Size: 6418 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171128/85effc19/attachment.bin>

