[llvm] [LoongArch] Optimize extractelement containing variable index (PR #151475)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 2 20:03:52 PDT 2025
tangaac wrote:
> > Are these okay?
> > ```
> > xr0: v32i8
> > a0 : index
> > L/T (latency/throughput)
> > xvpermi.q xr1, xr0, 1 3/4
> > xvreplgr2vr.b xr2, a0 3/1 // movgr2fr.w f2, a0 2/1
> > xvshuf.b xr2, xr1, xr0, xr2 1/2
> > Total: 7/1
> >
> > xr0: v16i16
> > a0 : index
> > L/T
> > xvpermi.q xr1, xr0, 1 3/4
> > xvreplgr2vr.h xr2, a0 3/1 // movgr2fr.w f2, a0 2/1
> > xvshuf.h xr2, xr1, xr0 1/2
> > Total: 7/1
> >
> > xr0: v4i64
> > a0 : index
> > L/T
> > xvpermi.q xr1, xr0, 1 3/4
> > xvreplgr2vr.d xr2, a0 2/1
> > xvshuf.d xr2, xr1, xr0 1/2
> > Total: 6/1
> > ```
>
> I think it is right and better. My thought process was entirely limited by `xvperm.w`. Thanks for your idea.
>
> What do you think about this? @tangaac
It shows better performance.
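
For anyone following the thread, here is a rough scalar model of the quoted v16i16 sequence. It is only a sketch under assumptions: the Python helpers are hypothetical stand-ins named after the instructions, the per-128-bit-lane semantics are a reading of the LASX behavior rather than a reference implementation, and the index is assumed to be in range (0..15).

```python
LANE = 8  # halfwords per 128-bit lane in a v16i16 register


def xvpermi_q(xd, xj, imm):
    """Model of xvpermi.q: each 128-bit half of the result is picked by a
    2-bit field (0 = xj low, 1 = xj high, 2 = xd low, 3 = xd high)."""
    halves = [xj[:LANE], xj[LANE:], xd[:LANE], xd[LANE:]]
    return halves[imm & 3] + halves[(imm >> 4) & 3]


def xvreplgr2vr_h(a0):
    """Model of xvreplgr2vr.h: broadcast the low 16 bits of a GPR."""
    return [a0 & 0xFFFF] * (2 * LANE)


def xvshuf_h(xd, xj, xk):
    """Model of xvshuf.h: xd holds the selectors on entry. Within each lane,
    selectors 0..7 read from xk and 8..15 read from xj (same lane only)."""
    out = []
    for i, raw in enumerate(xd):
        base = (i // LANE) * LANE      # start of the lane this element is in
        sel = raw % (2 * LANE)
        src = xk if sel < LANE else xj
        out.append(src[base + sel % LANE])
    return out


def extract_var_index(xr0, a0):
    """Hypothetical wrapper mirroring the three-instruction sequence."""
    xr1 = xvpermi_q([0] * (2 * LANE), xr0, 1)  # xr1 = xr0 with halves swapped
    xr2 = xvreplgr2vr_h(a0)                    # selector in every element
    xr2 = xvshuf_h(xr2, xr1, xr0)
    return xr2[0]                              # only element 0 is meaningful


vec = list(range(100, 116))
print([extract_var_index(vec, i) for i in range(16)] == vec)
```

In this model, swapping the two halves first is what lets a lane-local shuffle reach every element: the low lane of `xr0` covers indices 0..7 and the low lane of `xr1` covers indices 8..15.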
https://github.com/llvm/llvm-project/pull/151475