[llvm] [Transforms/Util] Add SimplifySwitchVar pass (PR #149937)

Fri Jul 25 13:49:57 PDT 2025

nikic wrote:

> > At a high level, I don't think that this is the correct approach to the problem. We should be sinking those GEPs such that there is one GEP with a phi operand, and then existing switch simplification will take care of the rest.
> 
> So I think we missed that a similar switch simplification already exists because there are fairly straightforward cases that are currently not simplified which are relevant for what we're targeting here.
> 
> It may still be the case that the "correct" answer is:
> 
>     * Generalize the existing simplification so that it can handle these cases
>     * Strengthen the sinking of `getelementptr`s
> 
> I'm a little worried however that the relative tradeoff of when these transforms make sense is quite different across architectures. Do you have any thoughts about that?
> 
> Here's a case that currently isn't simplified:
> 
> ```
> define i32 @test(i32 %x) {
> entry:
>   switch i32 %x, label %end [
>     i32 0, label %case0
>     i32 1, label %case1
>     i32 2, label %case2
>     i32 3, label %case3
>   ]
> 
> case0:
>   br label %end
> case1:
>   br label %end
> case2:
>   br label %end
> case3:
>   br label %end
> 
> end:
>   %idx = phi i32 [ 0, %case0 ], [ 3, %case1 ], [ 6, %case2 ], [ 9, %case3 ], [ 12, %entry ]
>   ret i32 %idx
> }
> ```
> 
> This one is pretty straightforward.

This case is already simplified -- it is one of the most basic cases. The relevant code is guarded by a `fitsInLegalInteger` check, so you'll have to specify a target triple or an `n` (native integer widths) data layout component for the transform to actually fire.
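
For illustration, here is a minimal sketch of the quoted test with an explicit `n` data layout component added. The `n32` value and the `opt` invocation in the leading comment are assumptions chosen for the example, not something stated in the PR; whether the switch is actually folded depends on the LLVM version and the pass options used.

```llvm
; Assumed invocation (pass options may differ between LLVM versions):
;   opt -passes='simplifycfg<switch-to-lookup>' -S test.ll
;
; "n32" declares i32 as a native integer width, which is what a
; fitsInLegalInteger query for a 32-bit value consults.
target datalayout = "n32"

define i32 @test(i32 %x) {
entry:
  switch i32 %x, label %end [
    i32 0, label %case0
    i32 1, label %case1
    i32 2, label %case2
    i32 3, label %case3
  ]

case0:
  br label %end
case1:
  br label %end
case2:
  br label %end
case3:
  br label %end

end:
  ; Results are 3 * %x for the listed cases and 12 on the default edge.
  %idx = phi i32 [ 0, %case0 ], [ 3, %case1 ], [ 6, %case2 ], [ 9, %case3 ], [ 12, %entry ]
  ret i32 %idx
}
```

Passing a target triple (for example via `-mtriple`) should serve the same purpose, since the target then supplies its own native integer widths.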

> The following is less straightforward:
> 
> ```
> declare void @foo1()
> declare void @foo2()
> declare void @foo3()
> declare void @foo4()
> 
> define i32 @test(i32 %x) {
> entry:
>   switch i32 %x, label %end [
>     i32 0, label %case0
>     i32 1, label %case1
>     i32 2, label %case2
>     i32 3, label %case3
>   ]
> 
> case0:
>   call void @foo1()
>   br label %end
> case1:
>   call void @foo2()
>   br label %end
> case2:
>   call void @foo3()
>   br label %end
> case3:
>   call void @foo4()
>   br label %end
> 
> end:
>   %idx = phi i32 [ 0, %case0 ], [ 1, %case1 ], [ 2, %case2 ], [ 3, %case3 ], [ 4, %entry ]
>   ret i32 %idx
> }
> ```
> 
> (The calls stand in for arbitrary code.)
> 
> In order to handle this case, we have to extend the lifetime of `%idx` or `%x`. This is very typically worth it in the case we're targeting (on AMDGPU, which has lots of registers), but may be less of a slam dunk on other architectures.
> 
> Going at it from the other end, do you think there's a reason why the existing instruction sinking prefers not to sink those longer sequences of instructions? That was really my main concern going into this.
> 
> It's generally quite difficult to do any of this in a way that makes all targets happy. That's really the biggest reason why this ended up as a separate pass.

I think we'd transform that case if it didn't have the direct `%entry` -> `%end` edge.

https://github.com/llvm/llvm-project/pull/149937