[all-commits] [llvm/llvm-project] cf14c7: AMDGPU: Add a pass to rewrite certain undef in PHI
Ruiling, Song via All-commits
all-commits at lists.llvm.org
Sun Sep 25 18:56:26 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: cf14c7caacfc17e893f952dd6d0e31f275302cd6
https://github.com/llvm/llvm-project/commit/cf14c7caacfc17e893f952dd6d0e31f275302cd6
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2022-09-26 (Mon, 26 Sep 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPU.h
A llvm/lib/Target/AMDGPU/AMDGPURewriteUndefForPHI.cpp
M llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
M llvm/lib/Target/AMDGPU/CMakeLists.txt
M llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
A llvm/test/CodeGen/AMDGPU/rewrite-undef-for-phi.ll
A llvm/test/CodeGen/AMDGPU/uniform-phi-with-undef.ll
Log Message:
-----------
AMDGPU: Add a pass to rewrite certain undef in PHI
For the following IR pattern (where %if terminates with a divergent
branch), divergence analysis reports %phi as uniform so that better code
can be generated.
```
  %if
  |  \
  | %then
  |  /
  %endif: %phi = phi [ %uniform, %if ], [ %undef, %then ]
```
In the backend, %phi and %uniform will be assigned a scalar register.
But the %undef from %then will make the scalar register dead in %then.
This will likely cause the register to be overwritten in %then. To fix
the issue, we rewrite %undef as %uniform. For details, please refer to
the comment in AMDGPURewriteUndefForPHI.cpp. Currently no test changes
are shown, but this pass is mandatory for later changes.
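The rewrite described above can be sketched with a toy model. This is not the code in AMDGPURewriteUndefForPHI.cpp: the list-of-pairs PHI representation and the name `rewrite_undef_incoming` are assumptions made for illustration, and the sketch leaves out the uniformity and dominance checks a real pass needs.

```python
# Toy model: a PHI is a list of (incoming_value, predecessor_block) pairs.
# UNDEF and rewrite_undef_incoming are hypothetical names for illustration.
UNDEF = "undef"


def rewrite_undef_incoming(incoming):
    """Replace undef incoming values with the single defined value, if unique.

    Assumes the caller already knows the PHI is uniform and that the
    defined value is available in every predecessor.
    """
    defined = {value for value, _ in incoming if value != UNDEF}
    if len(defined) != 1:
        # Zero or several distinct defined values: leave the PHI alone.
        return list(incoming)
    (unique,) = defined
    return [(unique, pred) for _, pred in incoming]


phi = [("%uniform", "%if"), (UNDEF, "%then")]
rewritten = rewrite_undef_incoming(phi)
```

After the rewrite, the value coming from %then is %uniform as well, so in this model the value is used on the edge leaving %then and its register stays live there.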
Reviewed by: sameerds
Differential Revision: https://reviews.llvm.org/D133840
Commit: 66325d9ba19dee10adfe587b6c59fad7dc0882bf
https://github.com/llvm/llvm-project/commit/66325d9ba19dee10adfe587b6c59fad7dc0882bf
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2022-09-26 (Mon, 26 Sep 2022)
Changed paths:
A llvm/test/CodeGen/AMDGPU/while-break.ll
Log Message:
-----------
AMDGPU: Add a test to show how later optimization works
Differential Revision: https://reviews.llvm.org/D132448
Commit: 40e9284f3c4c1643ae48afae0658e32d5d39718f
https://github.com/llvm/llvm-project/commit/40e9284f3c4c1643ae48afae0658e32d5d39718f
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2022-09-26 (Mon, 26 Sep 2022)
Changed paths:
M llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
M llvm/test/CodeGen/AMDGPU/loop_break.ll
M llvm/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll
M llvm/test/CodeGen/AMDGPU/multilevel-break.ll
M llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll
M llvm/test/CodeGen/AMDGPU/si-annotate-cf.ll
M llvm/test/CodeGen/AMDGPU/tuple-allocation-failure.ll
M llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll
M llvm/test/CodeGen/AMDGPU/while-break.ll
M llvm/test/Transforms/StructurizeCFG/AMDGPU/loop-subregion-misordered.ll
M llvm/test/Transforms/StructurizeCFG/interleaved-loop-order.ll
M llvm/test/Transforms/StructurizeCFG/loop-continue-phi.ll
M llvm/test/Transforms/StructurizeCFG/one-loop-multiple-backedges.ll
M llvm/test/Transforms/StructurizeCFG/workarounds/needs-fix-reducible.ll
M llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll
M llvm/test/Transforms/StructurizeCFG/workarounds/needs-unified-loop-exits.ll
Log Message:
-----------
StructurizeCFG: prefer reduced number of live values
The instruction simplification will try to simplify the affected phis.
In some cases, this might extend the liveness of values. For example:
  BB0:
   |  \
   |  BB1
   |  /
  BB2:phi (BB0, v), (BB1, undef)
The phi in BB2 will be simplified to v as v dominates BB2, but this
increases the number of active values in BB1. By setting CanUseUndef
to false, we will not simplify the phi in this way, which helps
register pressure. This is mandatory for the later change to help
reduce VGPR pressure for AMDGPU.
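The effect of the flag can be sketched with a toy model. This is not LLVM's instruction simplifier: the function `simplify_phi`, its `can_use_undef` parameter, and the pair-list PHI representation are assumptions for illustration, and the dominance requirement is taken for granted.

```python
# Toy model of PHI simplification with and without treating undef as
# "any value". All names here are hypothetical.
UNDEF = "undef"


def simplify_phi(incoming, can_use_undef):
    """Return the value the PHI folds to, or None if it must be kept."""
    defined = {value for value, _ in incoming if value != UNDEF}
    has_undef = any(value == UNDEF for value, _ in incoming)
    if len(defined) != 1:
        return None
    if has_undef and not can_use_undef:
        # Keeping the PHI means v need not stay live through BB1.
        return None
    return next(iter(defined))


phi = [("v", "BB0"), (UNDEF, "BB1")]
folded = simplify_phi(phi, can_use_undef=True)
kept = simplify_phi(phi, can_use_undef=False)
```

With `can_use_undef=True` the PHI folds to `v`, which extends the live range of `v` across BB1; with `False` the PHI is kept and the undef edge carries no live value.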
Reviewed by: foad, sameerds
Differential Revision: https://reviews.llvm.org/D132449
Commit: a5676a3a7eab3a295ae0482162089a4e366bf9d2
https://github.com/llvm/llvm-project/commit/a5676a3a7eab3a295ae0482162089a4e366bf9d2
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2022-09-26 (Mon, 26 Sep 2022)
Changed paths:
M llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
M llvm/test/CodeGen/AMDGPU/multilevel-break.ll
M llvm/test/CodeGen/AMDGPU/tuple-allocation-failure.ll
M llvm/test/CodeGen/AMDGPU/while-break.ll
M llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll
M llvm/test/Transforms/StructurizeCFG/workarounds/needs-unified-loop-exits.ll
Log Message:
-----------
StructurizeCFG: Set Undef for non-predecessors in setPhiValues()
During the structurization process, we may place non-predecessor blocks
between the predecessors of a block in the structurized CFG. Take
the typical while-break case as an example:
```
 /---A(v=...)
 |  / \
 ^ B   C
 |  \ /|
 \---L |
     \ /
      E (r = phi (v:C)...)
```
After structurization, the CFG would look like:
```
 /---A
 |   |\
 |   | C
 |   |/
 |   F1
 ^   |\
 |   | B
 |   |/
 |   F2
 |   |\
 |   | L
 \   |/
  \--F3
     |
     E
```
We can see that block B is placed between the predecessors (C/L) of E.
During phi reconstruction, to achieve the same semantics as before, we
are reconstructing the PHIs as:
F1: v1 = phi (v:C), (undef:A)
F3: r = phi (v1:F2), ...
But this also means that `v1` would be live through B, which is
unnecessary. The idea in this change is to say the incoming value
from B is Undef for the PHI in E. With this change, the reconstructed
PHIs would be:
F1: v1 = phi (v:C), (undef:A)
F2: v2 = phi (v1:F1), (undef:B)
F3: r = phi (v2:F2), ...
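The difference in live ranges can be checked with a small backward liveness walk over the structurized CFG. This is only a sketch: the predecessor map is transcribed from the diagram in this message, the helper `live_in_blocks` is a hypothetical name, and PHI uses are modeled as uses at the end of the incoming block.

```python
# Predecessors in the structurized CFG (F3 -> A is the loop back edge).
PREDS = {
    "A": ["F3"], "C": ["A"], "F1": ["A", "C"], "B": ["F1"],
    "F2": ["F1", "B"], "L": ["F2"], "F3": ["F2", "L"], "E": ["F3"],
}


def live_in_blocks(preds, def_block, phi_use_blocks):
    """Blocks a value is live into, walking backward from its PHI uses.

    phi_use_blocks lists the incoming blocks of PHIs that use the value,
    so the value must be live at the end of each of them.
    """
    live_in, seen = set(), set()
    worklist = list(phi_use_blocks)
    while worklist:
        block = worklist.pop()
        if block in seen:
            continue
        seen.add(block)
        if block == def_block:
            continue  # The walk stops at the defining block.
        live_in.add(block)
        worklist.extend(preds[block])
    return live_in


# Before: v1 (defined in F1) is used by the PHI in F3 via F2.
before_v1 = live_in_blocks(PREDS, "F1", ["F2"])
# After: v1 is used by the PHI in F2 via F1; v2 (defined in F2) via F2.
after_v1 = live_in_blocks(PREDS, "F1", ["F1"])
after_v2 = live_in_blocks(PREDS, "F2", ["F2"])
```

In this model `before_v1` contains B and F2, while `after_v1` and `after_v2` are both empty, so no reconstructed value is live through B.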
Reviewed by: sameerds
Differential Revision: https://reviews.llvm.org/D132450
Compare: https://github.com/llvm/llvm-project/compare/7a8b9307cad0...a5676a3a7eab