[llvm] [LoongArch] Enable LoopTermFold Pass (PR #130737)
Lu Weining via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 13 00:27:29 PDT 2025
SixWeining wrote:
> > What's the effect after enabling this pass?
>
> This removes one `addi` instruction from the loop body, at the cost of some extra setup before the loop. In most cases, the generated code then matches gcc's.
>
> clang temp.c -S -O1
>
> ```c
> void foo(int *__restrict a, short int *__restrict b, int n) {
>   for (int i = 0; i < n; i++)
>     a[i] = b[i];
> }
> ```
>
> before
>
> ```assembly
> # %bb.0:
> ori $a3, $zero, 1
> blt $a2, $a3, .LBB0_2
> .p2align 4, , 16
> .LBB0_1: # =>This Inner Loop Header: Depth=1
> ld.h $a3, $a1, 0
> st.w $a3, $a0, 0
> addi.d $a0, $a0, 4
> addi.d $a2, $a2, -1
> addi.d $a1, $a1, 2
> bnez $a2, .LBB0_1
> .LBB0_2:
> ret
> ```
>
> after
>
> ```assembly
> # %bb.0: # %entry
> ori $a3, $zero, 1
> blt $a2, $a3, .LBB0_3
> # %bb.1: # %for.body.preheader
> alsl.d $a2, $a2, $a0, 2
> .p2align 4, , 16
> .LBB0_2: # %for.body
> # =>This Inner Loop Header: Depth=1
> ld.h $a3, $a1, 0
> st.w $a3, $a0, 0
> addi.d $a0, $a0, 4
> addi.d $a1, $a1, 2
> bne $a0, $a2, .LBB0_2
> .LBB0_3: # %for.cond.cleanup
> ret
> ```
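For readers comparing the two listings, the difference can be sketched at the source level. This is only an illustration of what the assembly above appears to do, not the pass's actual implementation; the function names `foo_counter` and `foo_folded` are hypothetical. The first version keeps a separate down-counter (the `addi.d $a2, $a2, -1` / `bnez` pair), while the second computes an end pointer once before the loop (the `alsl.d`) and exits when the store pointer reaches it (the `bne $a0, $a2`).

```c
#include <stddef.h>

/* "before": a dedicated counter is decremented on every iteration,
   in addition to the two pointer increments. */
void foo_counter(int *restrict a, const short *restrict b, int n) {
    if (n < 1)
        return;
    long remaining = n;
    do {
        *a = *b;      /* ld.h + st.w */
        a++;          /* addi.d $a0, $a0, 4 */
        remaining--;  /* addi.d $a2, $a2, -1 */
        b++;          /* addi.d $a1, $a1, 2 */
    } while (remaining != 0);
}

/* "after": the counter is folded away. The end address a + n is
   computed once outside the loop, and the loop terminates by
   comparing the already-incremented store pointer against it. */
void foo_folded(int *restrict a, const short *restrict b, int n) {
    if (n < 1)
        return;
    int *end = a + n; /* alsl.d $a2, $a2, $a0, 2 */
    do {
        *a = *b;
        a++;
        b++;
    } while (a != end); /* bne $a0, $a2, ... */
}
```

Both functions copy the same `n` elements; the folded form trades one instruction per iteration for one extra instruction before the loop, which is the "extra setup cost" mentioned above.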
I see. Could you pre-commit a dedicated IR test case in llvm/test/CodeGen/LoongArch?
https://github.com/llvm/llvm-project/pull/130737