[PATCH] D106498: AMDGPU: Treat IMPLICIT_DEF like a constant lanemask source

Ruiling, Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 23 08:19:59 PDT 2021


ruiling added a comment.

The underlying problem seems to need more time to fix, so I think it is OK to let the patch in as a workaround. Judging from the test check changes, it also seems to make the generated code shorter.



================
Comment at: llvm/lib/Target/AMDGPU/SILowerI1Copies.cpp:606
+    // in practice.
     unsigned FoundLoopLevel = LF.findLoop(PostDomBound);
 
----------------
I am not sure what kind of unstructured loops may reach this point; maybe we need more tests for this pass. In the test case you provided, though, I think the loop is just a natural loop (with bb.1 as header and bb.3 as latch/exiting block). I find the function name `findLoop` a little confusing here. It returns 0 for the test case because there is no loop that does not contain `PostDomBound` (that is, bb.3 in the test case).
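
To make the "no loop that does not contain `PostDomBound`" condition concrete, here is a rough standalone sketch. It is not the actual `LoopFinder` code from SILowerI1Copies.cpp: `hasCycleAvoiding`, the `CFG` alias, and the block numbering are hypothetical names used only for illustration, and it returns a plain yes/no where the real function returns a loop level. It asks whether the def block can reach itself again without passing through the bound block.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy CFG (assumption for illustration): block number -> successor blocks.
using CFG = std::map<int, std::vector<int>>;

// Returns true if some cycle through DefBlock avoids Bound, i.e. DefBlock
// can reach itself again without ever entering Bound.
bool hasCycleAvoiding(const CFG &Succs, int DefBlock, int Bound) {
  assert(Succs.count(DefBlock) && "DefBlock must be a block of the CFG");

  std::set<int> Visited;
  std::vector<int> Worklist(Succs.at(DefBlock));

  while (!Worklist.empty()) {
    int BB = Worklist.back();
    Worklist.pop_back();

    if (BB == Bound)
      continue; // Never walk through the bound block.
    if (BB == DefBlock)
      return true; // Back at the def block: found a cycle avoiding Bound.
    if (!Visited.insert(BB).second)
      continue; // Already explored.

    auto It = Succs.find(BB);
    if (It == Succs.end())
      continue; // Block without recorded successors.
    for (int S : It->second)
      Worklist.push_back(S);
  }
  return false;
}
```

With an assumed CFG shaped like my reading of the test case (bb.1 as header, bb.3 as the only latch, for example `{{0,{1}},{1,{2,3}},{2,{3}},{3,{1,4}},{4,{}}}`), `hasCycleAvoiding(G, 1, 3)` is false, because every path back to bb.1 goes through bb.3. That is the situation in which I would expect `findLoop` to report 0.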


================
Comment at: llvm/test/CodeGen/AMDGPU/lower-i1-copies-implicit-def-unstructured-loop.mir:7
+# When the phi in %bb.3 is handled, it attempted to insert instructions
+# in %bb.1 to handle this def, but ended up inserting mask management
+# instructions before the def of %34. This is avoided by treating
----------------
Yes, the situation sounds a little bit awkward to handle. In fact, we can easily get another failure case by just replacing `%34 = IMPLICIT_DEF` with `%34:sreg_64 = V_CMP_EQ_U32_e64 %17:vgpr_32, 1, implicit $exec`. Could you share with me an IR reproducer for this issue? That may help me better understand the problem.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106498/new/

https://reviews.llvm.org/D106498


