[llvm] [SLP]Initial support for (masked)loads + compress and (masked)interleaved (PR #132099)
Alexey Bataev via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 7 02:51:54 PDT 2025
alexey-bataev wrote:
> Hi. Probably useless now but here it is:
>
> ```
> ! { dg-do compile }
> ! { dg-options "-O3 -ffast-math -fdump-tree-reassoc1 --param max-completely-peeled-insns=200" }
> subroutine anisonl(w,vo,anisox,s,ii1,jj1,weight)
> integer ii1,jj1,i1,iii1,j1,jjj1,k1,l1,m1,n1
> real*8 w(3,3),vo(3,3),anisox(3,3,3,3),s(60,60),weight
> !
> ! This routine replaces the following lines in e_c3d.f for
> ! an anisotropic material
> !
> do i1=1,3
> iii1=ii1+i1-1
> do j1=1,3
> jjj1=jj1+j1-1
> do k1=1,3
> do l1=1,3
> s(iii1,jjj1)=s(iii1,jjj1)
> & +anisox(i1,k1,j1,l1)*w(k1,l1)*weight
> do m1=1,3
> s(iii1,jjj1)=s(iii1,jjj1)
> & +anisox(i1,k1,m1,l1)*w(k1,l1)
> & *vo(j1,m1)*weight
> & +anisox(m1,k1,j1,l1)*w(k1,l1)
> & *vo(i1,m1)*weight
> do n1=1,3
> s(iii1,jjj1)=s(iii1,jjj1)
> & +anisox(m1,k1,n1,l1)
> & *w(k1,l1)*vo(i1,m1)*vo(j1,n1)*weight
> enddo
> enddo
> enddo
> enddo
> enddo
> enddo
>
> return
> end
>
> ! There should be 22 multiplications left after un-distributing
> ! weigth, w(k1,l1), vo(i1,m1) and vo(j1,m1) on the innermost two
> ! unrolled loops.
>
> ! { dg-final { scan-tree-dump-times "\[0-9\] \\\* " 22 "reassoc1" } }
> ```
>
> `stage2.install/bin/flang -fc1 -triple aarch64-unknown-linux-gnu -emit-obj -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu neoverse-512tvb -target-feature +outline-atomics -target-feature +v8.4a -target-feature +aes -target-feature +bf16 -target-feature +ccdp -target-feature +ccidx -target-feature +ccpp -target-feature +complxnum -target-feature +crc -target-feature +dotprod -target-feature +fp-armv8 -target-feature +fp16fml -target-feature +fullfp16 -target-feature +i8mm -target-feature +jsconv -target-feature +lse -target-feature +neon -target-feature +pauth -target-feature +perfmon -target-feature +rand -target-feature +ras -target-feature +rcpc -target-feature +rdm -target-feature +sha2 -target-feature +sha3 -target-feature +sm4 -target-feature +spe -target-feature +ssbs -target-feature +sve -mvscale-max=2 -mvscale-min=2 -vectorize-loops -vectorize-slp -fversion-loops-for-stride -mframe-pointer=non-leaf -mllvm -treat-scalable-fixed-error-as-warning=false -O3 -o reassoc_4.o -x f95-cpp-input reassoc_4.f `
I hope it is fixed already
https://github.com/llvm/llvm-project/pull/132099
More information about the llvm-commits
mailing list