[llvm] [AMDGPU] Create an AMDGPUIfConverter pass (PR #106415)

Juan Manuel Martinez Caamaño via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 18 07:09:20 PDT 2024


jmmartinez wrote:

@jayfoad @arsenm 

## If Conversion update

I'm having second thoughts about implementing this optimization as a stand-alone pass, instead of spreading the transformation across other passes.

### Problems 

As it is now, this pass doesn't compose nicely with `SIOptimizeExecMasking`:
To promote `vcmp+s_and_saveexec` into `vcmpx`, there should be no other instruction modifying `exec` after the comparison.
This collides with the restoration of `exec` that we have to do at the end of the `Then` block.
We can work around this by generating the right `vcmpx` directly in this transformation.


When I try to generalize this pass from `s_cbranch_scc0/1` to also support `s_cbranch_execz/nz`:
  * Depending on where I put the pass in the pipeline, `exec` branches may appear as `SI_IF`. To overcome this, we have to do this transformation after `SILowerControlFlow`.
  * The first terminator of the basic-block may not be the `s_cbranch_execz/nz`, but the `s_mov_b32_term` before the `s_cbranch`.
    This doesn't play well with the `SSAIfConv` pass that drops the old terminators when it moves the instructions from the `Then` block into the `Head`.
    For example:
    ```asm
    $exec_lo = S_MOV_B32_term killed %32:sreg_32 # <- the terminators start here
    S_CBRANCH_EXECZ %bb.2, implicit $exec
    S_BRANCH %bb.1
    ```
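To make the terminator problem concrete, here is a minimal sketch on a toy model of a basic block. The `Instr` class and both helpers are hypothetical and are not LLVM APIs; the only thing taken from the example above is the order of the three terminators. It shows that the first terminator is not the conditional branch, so erasing the whole terminator sequence and re-creating only the branches would lose the `exec` write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: just an opcode name and a terminator flag."""
    opcode: str
    is_terminator: bool


def first_terminator(block):
    """Index of the first terminator, or len(block) if there is none."""
    for i, instr in enumerate(block):
        if instr.is_terminator:
            return i
    return len(block)


def non_branch_terminators(block):
    """Opcodes in the terminator sequence that are not branches.

    These are the instructions that would be lost if the whole terminator
    sequence were erased and only the branches were re-created.
    """
    start = first_terminator(block)
    return [i.opcode for i in block[start:]
            if not i.opcode.startswith(("S_CBRANCH_", "S_BRANCH"))]


head = [
    Instr("V_ADD_U32_e64", False),   # placeholder for the block body
    Instr("S_MOV_B32_term", True),   # writes exec, but is already a terminator
    Instr("S_CBRANCH_EXECZ", True),
    Instr("S_BRANCH", True),
]

print(first_terminator(head))        # index of S_MOV_B32_term, not of the branch
print(non_branch_terminators(head))  # the exec write that must be preserved
```

In this toy model, an if-converter would have to keep any non-branch terminators (or hoist them above the inserted `Then` instructions) instead of dropping them with the branches.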

### Alternative solution

I'm thinking that it would be better to split this transformation into 3 parts:

1. Transform `s_cbranch_scc0/1` into `vcmp + s_and_saveexec + s_cbranch_execz`
2. Let `SIOptimizeExecMasking` optimize `vcmp + s_and_saveexec` pairs into `vcmpx` if applicable
3. If-Convert `s_cbranch_execz`

#### 1. Transform `s_cbranch_scc0/1` example

Transform this:
```asm
bb.0.entry:
  ...
  S_CMP_LT_I32 killed %14:sgpr_32, 1, implicit-def $scc
  S_CBRANCH_SCC1 %bb.2, implicit killed $scc
  S_BRANCH %bb.1

bb.1.if.then:
  ...

bb.2.if.end:
  ...
```

Into this:
```asm
bb.0.entry:
  ...
  $vcc_lo = V_CMP_LT_I32_e64 killed %14:sgpr_32, 1, implicit $exec
  %28:sreg_32 = S_AND_SAVEEXEC_B32 killed $vcc_lo, implicit-def $exec, implicit-def dead $scc, implicit $exec # backup exec and mask it
  S_CBRANCH_EXECZ %bb.2, implicit $exec
  S_BRANCH %bb.1

bb.1.if.then:
  ...

bb.2.if.end:
  $exec_lo = S_MOV_B32 killed %28:sreg_32 # restore exec
  ...
```

For `s_cbranch_scc0` we would have to reverse the comparison.
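As a sanity check of the idea, here is a small simulation of the masked sequence on a toy wave32 model. Everything in it is an assumption made for illustration: the function names are hypothetical, `cond` stands for the comparison with whatever polarity ends up being emitted, and the model only covers the uniform case where both compare operands are scalar.

```python
WAVE_SIZE = 32
FULL_EXEC = (1 << WAVE_SIZE) - 1


def v_cmp_uniform(cond, exec_mask):
    """Toy V_CMP with uniform operands.

    Every active lane computes the same result; inactive lanes get 0.
    """
    return exec_mask if cond else 0


def s_and_saveexec(vcc, exec_mask):
    """Toy S_AND_SAVEEXEC: return (saved exec, new exec = exec & vcc)."""
    return exec_mask, exec_mask & vcc


def run_masked(cond, exec_mask):
    """Run the vcmp + s_and_saveexec + s_cbranch_execz sequence.

    Returns (whether the Then block is entered, exec after the restore).
    """
    vcc = v_cmp_uniform(cond, exec_mask)
    saved, exec_mask = s_and_saveexec(vcc, exec_mask)
    then_entered = exec_mask != 0   # S_CBRANCH_EXECZ skips Then when exec == 0
    exec_mask = saved               # the S_MOV_B32 restore in if.end
    return then_entered, exec_mask


for start_exec in (FULL_EXEC, 0x0000FFFF, 0x1):
    for cond in (True, False):
        entered, restored = run_masked(cond, start_exec)
        print(hex(start_exec), cond, entered, hex(restored))
```

Under these assumptions, and as long as `exec` is non-zero on entry, the `Then` block is entered exactly when `cond` holds, and `exec` is back to its original value in `if.end`.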

#### 3. If-Convert `s_cbranch_execz` example

Transform this:
```asm
bb.0.entry:
  ...
  S_CBRANCH_EXECZ %bb.2, implicit $exec
  S_BRANCH %bb.1

bb.1.if.then:
  ..Then..

bb.2.if.end:
  ..End..
```

Into this:
```asm
bb.0.entry:
  ...
  ..Then..
  ..End..
```

Keep in mind that the instructions in the `Then` block should be either speculatable or masked by `exec`, and should not modify `exec`.
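The legality condition above can be sketched as a predicate over a toy instruction model. `ToyInstr`, its flags and `can_if_convert` are hypothetical; the flag values in the examples are hand-assigned for illustration and not derived from the real instruction definitions, and a real implementation would probably also need some profitability check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyInstr:
    opcode: str
    speculatable: bool     # safe to execute even when the branch would skip it
    masked_by_exec: bool   # has no effect on lanes that are disabled in exec
    writes_exec: bool      # modifies exec


def can_if_convert(then_block):
    """True if every instruction may run unconditionally in the Head block."""
    return all(
        (instr.speculatable or instr.masked_by_exec) and not instr.writes_exec
        for instr in then_block
    )


# A vector instruction: a no-op when exec == 0.
valu = ToyInstr("V_ADD_U32_e64", speculatable=False,
                masked_by_exec=True, writes_exec=False)
# A hypothetical scalar instruction with a side effect that ignores exec.
scalar_side_effect = ToyInstr("SCALAR_SIDE_EFFECT", speculatable=False,
                              masked_by_exec=False, writes_exec=False)
# Anything that writes exec blocks the conversion.
exec_write = ToyInstr("S_MOV_B32 (to exec)", speculatable=True,
                      masked_by_exec=False, writes_exec=True)

print(can_if_convert([valu]))
print(can_if_convert([valu, scalar_side_effect]))
print(can_if_convert([valu, exec_write]))
```

Only the first block passes: the second contains an instruction that is neither speculatable nor masked by `exec`, and the third modifies `exec`.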

## Where to go from here?

I'd appreciate some feedback on whether you think this is a better or worse idea, on where to place these transformations (e.g. in some existing pass, or where in the pipeline), and on any other potential issues that I might have missed.

https://github.com/llvm/llvm-project/pull/106415