[llvm] [AMDGPU] Create an AMDGPUIfConverter pass (PR #106415)
Juan Manuel Martinez Caamaño via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 18 07:09:20 PDT 2024
jmmartinez wrote:
@jayfoad @arsenm
## If Conversion update
I'm having second thoughts about implementing this optimization as a stand-alone pass, instead of spreading the transformation across other passes.
### Problems
As it is now, this pass doesn't compose nicely with `SIOptimizeExecMasking`:
To promote `vcmp+s_and_saveexec` into `vcmpx`, there should be no other instruction modifying `exec` after the comparison.
This collides with the restoration of `exec` that we have to do at the end of the `Then` block.
We can work around this by generating the right `vcmpx` directly in this transformation.
When I try to generalize this pass from `s_cbranch_scc0/1` to also support `s_cbranch_execz/nz`:
* Depending on where I put the pass in the pipeline, `exec` branches may appear as `SI_IF`. To overcome this, we have to do this transformation after `SILowerControlFlow`.
* The first terminator of the basic-block may not be the `s_cbranch_execz/nz`, but the `s_mov_b32_term` before the `s_cbranch`.
This doesn't play well with the `SSAIfConv` pass that drops the old terminators when it moves the instructions from the `Then` block into the `Head`.
For example:
```asm
$exec_lo = S_MOV_B32_term killed %32:sreg_32 <- the terminator starts here
S_CBRANCH_EXECZ %bb.2, implicit $exec
S_BRANCH %bb.1
```
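To make the terminator problem concrete, here is a minimal sketch in Python rather than actual LLVM code. It models a block's terminator list as plain opcode strings; the `split_terminators` helper and the string-prefix test for branches are hypothetical and only illustrate that the `exec` write has to be kept apart from the branches when the old terminators are dropped.

```python
# Hypothetical, simplified model: a block's terminators as opcode names only.
TERMINATORS = ["S_MOV_B32_term", "S_CBRANCH_EXECZ", "S_BRANCH"]


def split_terminators(terminators):
    """Split a terminator list into (non-branch terminators, branches).

    Dropping the whole list would also drop the exec write in
    S_MOV_B32_term, so only the second part is safe to remove.
    """
    def is_branch(op):
        return op.startswith(("S_CBRANCH", "S_BRANCH"))

    first_branch = next(
        (i for i, op in enumerate(terminators) if is_branch(op)),
        len(terminators),
    )
    return terminators[:first_branch], terminators[first_branch:]


kept, dropped = split_terminators(TERMINATORS)
print("keep:", kept)
print("drop:", dropped)
```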
### Alternative solution
I'm thinking that it would be better to split this transformation into 3 parts:
1. Transform `s_cbranch_scc0/1` into `vcmp + s_and_saveexec + s_cbranch_execz`
2. Let `SIOptimizeExecMasking` optimize `vcmp + s_and_saveexec` pairs into `vcmpx` if applicable
3. If-Convert `s_cbranch_execz`
#### 1. Transform `s_cbranch_scc0/1` example
Transform this:
```asm
bb.0.entry:
...
S_CMP_LT_I32 killed %14:sgpr_32, 1, implicit-def $scc
S_CBRANCH_SCC1 %bb.2, implicit killed $scc
S_BRANCH %bb.1
bb.1.if.then:
...
bb.2.if.end:
...
```
Into this:
```asm
bb.0.entry:
...
$vcc_lo = V_CMP_LT_I32_e64 killed %14:sgpr_32, 1, implicit $exec
%28:sreg_32 = S_AND_SAVEEXEC_B32 killed $vcc_lo, implicit-def $exec, implicit-def dead $scc, implicit $exec # backup exec and mask it
S_CBRANCH_EXECZ %bb.2, implicit $exec
S_BRANCH %bb.1
bb.1.if.then:
...
bb.2.if.end:
$exec_lo = S_MOV_B32 killed %28:sreg_32 # restore exec
...
```
For `s_cbranch_scc0` we would have to reverse the comparison.
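As a sanity check of this step, the following is a rough lane-level model in Python, not LLVM code. It assumes a 32-lane wave, a comparison on a uniform (SGPR) operand, and that the compare writes 0 for inactive lanes. `then_cond` stands for "the condition under which `Then` runs"; which comparison opcode produces it for `scc0` versus `scc1` is deliberately left out. All function names are made up for the sketch.

```python
WAVE_SIZE = 32
FULL = (1 << WAVE_SIZE) - 1


def v_cmp_uniform(cond, exec_mask):
    """Model V_CMP on a uniform operand: every active lane gets the same bit."""
    return exec_mask if cond else 0


def s_and_saveexec(vcc, exec_mask):
    """Model S_AND_SAVEEXEC_B32: return (saved exec, new exec)."""
    return exec_mask, vcc & exec_mask


def exec_based_branch(then_cond, exec_mask):
    """Return (then_runs, exec inside Then, exec restored at if.end)."""
    saved, new_exec = s_and_saveexec(v_cmp_uniform(then_cond, exec_mask), exec_mask)
    then_runs = new_exec != 0  # S_CBRANCH_EXECZ skips Then when exec is zero
    restored = saved           # S_MOV_B32 at the start of if.end
    return then_runs, new_exec, restored


for mask in (0b1, 0b1010, FULL):
    # With at least one active lane, Then runs exactly when the scalar
    # branch would have entered it, and exec is unchanged afterwards.
    print(mask, exec_based_branch(True, mask), exec_based_branch(False, mask))
```

In this model the two forms agree whenever at least one lane is active on entry; the case of `exec` being zero on entry is not covered by the sketch.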
#### 3. If-Convert `s_cbranch_execz` example
Transform this:
```asm
bb.0.entry:
...
S_CBRANCH_EXECZ %bb.2, implicit $exec
S_BRANCH %bb.1
bb.1.if.then:
..Then..
bb.2.if.end:
..End..
```
Into this:
```asm
bb.0.entry:
...
..Then..
..End..
```
Keep in mind that the instructions in the `Then` block should be either speculatable or masked by `exec`, and should not modify `exec`.
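The constraint can be illustrated with a small Python model (a sketch under simplifying assumptions, not the pass itself): a 4-lane wave, a `Then` block made of one `exec`-masked vector add, and optionally one scalar increment that ignores `exec`. The function names are hypothetical.

```python
def v_add(vregs, imm, exec_mask):
    """Model a VALU add: only lanes whose exec bit is set are written."""
    return [r + imm if (exec_mask >> lane) & 1 else r
            for lane, r in enumerate(vregs)]


def then_block(vregs, sreg, exec_mask, scalar_write=False):
    vregs = v_add(vregs, 1, exec_mask)
    if scalar_write:
        sreg += 1  # scalar instructions are not masked by exec
    return vregs, sreg


def with_branch(vregs, sreg, exec_mask, scalar_write=False):
    # s_cbranch_execz: skip Then when no lane is active
    if exec_mask == 0:
        return list(vregs), sreg
    return then_block(vregs, sreg, exec_mask, scalar_write)


def if_converted(vregs, sreg, exec_mask, scalar_write=False):
    # Then is executed unconditionally
    return then_block(vregs, sreg, exec_mask, scalar_write)


regs = [10, 20, 30, 40]
# Only exec-masked instructions: both forms agree for every mask.
print(all(with_branch(regs, 7, m) == if_converted(regs, 7, m) for m in range(16)))
# A scalar write to a live value breaks the equivalence when exec is zero.
print(with_branch(regs, 7, 0, True), if_converted(regs, 7, 0, True))
```

With only the masked vector add, removing the branch changes nothing. With the scalar write, the two versions differ when `exec` is zero, which is why such an instruction has to be speculatable for the conversion to be valid.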
## Where to go from here?
I'd appreciate some feedback on whether you think this is a better or worse idea; where to place these transformations (e.g. in some existing pass, or where in the pipeline); and any other potential issues that I might have missed.
https://github.com/llvm/llvm-project/pull/106415