[llvm] [VPlan] Introduce multi-branch recipe, use for multi-exit loops (WIP). (PR #109193)

Sun Oct 13 07:08:40 PDT 2024

fhahn wrote:

> Some thoughts when discussing this briefly with @aniragil.
> 
> Single exiting block: any divergent/non-uniform branch inside a loop is typically if-converted when vectorizing the loop. This includes "break" branches that early exit the loop, whose if-conversion masks all instructions that appear after the break to disable their lanes starting from the (first) one to "break", as pointed out. This also includes BTW divergent loop branches of inner loops when vectorizing an enclosing outer loop. Certain uniform branches may be optimized by retaining instead of if-converting them, see Simon Moll and Sebastian Hack's seminal "Partial Control-flow Linearization" paper from PLDI 2018.
> 
> Single exit block: having a latch block L with one successor (implicit in VPlan) being the header block H and two additional successors being distinct exit blocks E1 and E2, can be modelled with a single exit block E where the latter branches out to E1 and E2. Is it beneficial to fuse E into L, as this patch proposes? If so, could that fusion take place late when preparing VPlan for execution?

Yes, that would also be a viable option, and it might be better as a slightly simpler first step. An earlier version of the patch used multiple 'middle' blocks after the loop to dispatch to the different exit blocks, which was simpler overall because it required no new branch recipes or verifier changes. I put up an updated version of that here: https://github.com/llvm/llvm-project/pull/112138

It might indeed be simpler to start with this and later fold it into the vector loop region, if needed. (I think I did some small experiments a while ago, and the fused version in the loop latch was marginally faster for the config I tested.)
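
To make the 'middle' block variant concrete, here is a rough Python emulation of the control flow, not VPlan code. All names (`scalar_find`, `vector_find`, `Exit`, `vf`) are made up for illustration, and the handling of the scalar remainder is an assumption of this sketch rather than something taken from either patch. The vector loop has a single exit; the block after it inspects the early-exit mask and dispatches to E1 or E2.

```python
from enum import Enum


class Exit(Enum):
    EARLY = "E1"  # exit taken by the 'break' inside the loop
    LATCH = "E2"  # exit taken when the trip count is exhausted


def scalar_find(data, needle):
    """Reference scalar loop with two distinct exits."""
    for i, x in enumerate(data):
        if x == needle:
            return Exit.EARLY, i
    return Exit.LATCH, len(data)


def vector_find(data, needle, vf=4):
    """Emulated vector loop with one exit, followed by a dispatch block."""
    n = len(data)
    vec_end = n - n % vf  # iterations covered by full vectors
    i = 0
    early = False
    mask = [False] * vf

    # Vector loop region: one compare per lane, a single exiting branch.
    while i < vec_end:
        mask = [x == needle for x in data[i:i + vf]]
        early = any(mask)
        if early:
            break
        i += vf

    # 'Middle' block: dispatch to the original exit blocks.
    if early:
        # The first active lane is the iteration that would have hit 'break'.
        return Exit.EARLY, i + mask.index(True)

    # Scalar remainder (assumption of this sketch).
    for j in range(vec_end, n):
        if data[j] == needle:
            return Exit.EARLY, j
    return Exit.LATCH, n
```

Fusing the dispatch into the latch, as this patch does, would correspond to branching to E1 directly from inside the `while` body instead of going through the block after the loop.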

https://github.com/llvm/llvm-project/pull/109193