[clang] [llvm] [SimplifyCFG] Not folding branch in loop header with constant iterations (PR #74268)

Mon Dec 4 20:10:01 PST 2023

bcl5980 wrote:

> I think we should follow this principle: if a loop needs to be unrolled later, we should not destroy the loop count info.

The idea is right. But I think what nikic is saying is that loop unroll should handle this case (upper bound unrolling), and currently it doesn't. We need to find out why loop unroll doesn't work here (maybe in UnrollRuntimeLoopRemainder); then we can decide whether to fix it in loop unroll or to stop the SimplifyCFG transform.

> > AMDGPU cannot unroll this case:
> > https://godbolt.org/z/4Pq3bnzTT
> > But the same code on X86 looks like it can be unrolled:
> > https://godbolt.org/z/zr8aTG1KW
> > We may need to keep debugging this.
> 
> X86 unrolls very conservatively too: its upper bound is set to 4 (the default is 8). If we do not fold the loop branch, it can fully unroll (16).

So what is the difference that lets X86 partially unroll while AMDGPU cannot unroll at all?

https://github.com/llvm/llvm-project/pull/74268