[PATCH] D106498: AMDGPU: Treat IMPLICIT_DEF like a constant lanemask source
Ruiling, Song via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 23 08:19:59 PDT 2021
ruiling added a comment.
The underlying problem seems to need more time to fix, so I think it is OK to let the patch in as a workaround. Judging from the test check changes, it also seems to make the generated code shorter.
================
Comment at: llvm/lib/Target/AMDGPU/SILowerI1Copies.cpp:606
+ // in practice.
unsigned FoundLoopLevel = LF.findLoop(PostDomBound);
----------------
I am not sure what kinds of unstructured loops may reach this point. Maybe we need more tests for this pass. But in the test case you provided, I think the loop is just a natural loop (with bb.1 as header and bb.3 as latch/exiting block). I find the function name `findLoop` a little confusing here. It returns 0 for the test case because there is no loop that does not contain `PostDomBound` (which is bb.3 in the test case).
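To make that reading concrete, here is a toy model of the behavior as I described it. It is only a sketch and not the actual `LoopFinder` code in SILowerI1Copies.cpp: `ToyLoop`, `findLoopLevelExcluding`, the level numbering, and the block set {1, 2, 3} are all assumptions made up for illustration.

```cpp
#include <set>
#include <vector>

// Hypothetical toy model; NOT the LoopFinder from SILowerI1Copies.cpp.
struct ToyLoop {
  unsigned Level;        // assumed numbering: 1 = innermost, larger = further out
  std::set<int> Blocks;  // block numbers that belong to the loop
};

// Returns the smallest level of a loop that contains DefBlock but does not
// contain PostDomBound, or 0 if every loop around DefBlock also contains
// PostDomBound (or if there is no loop at all).
unsigned findLoopLevelExcluding(const std::vector<ToyLoop> &Loops,
                                int DefBlock, int PostDomBound) {
  unsigned Best = 0;
  for (const ToyLoop &L : Loops) {
    bool HasDef = L.Blocks.count(DefBlock) != 0;
    bool HasBound = L.Blocks.count(PostDomBound) != 0;
    if (HasDef && !HasBound && (Best == 0 || L.Level < Best))
      Best = L.Level;
  }
  return Best;
}
```

With a single loop over blocks {1, 2, 3} and a bound of block 3, the only loop contains the bound, so the result is 0. That matches what I think happens in the test case.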
================
Comment at: llvm/test/CodeGen/AMDGPU/lower-i1-copies-implicit-def-unstructured-loop.mir:7
+# When the phi in %bb.3 is handled, it attempted to insert instructions
+# in %bb.1 to handle this def, but ended up inserting mask management
+# instructions before the def of %34. This is avoided by treating
----------------
Yes, the situation sounds a little awkward to handle. In fact, we can easily get another failing case just by replacing `%34 = IMPLICIT_DEF` with `%34:sreg_64 = V_CMP_EQ_U32_e64 %17:vgpr_32, 1, implicit $exec`. Could you share an IR reproducer for this issue? That would help me better understand the problem.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106498/new/
https://reviews.llvm.org/D106498