[llvm] [AArch64] Fix throughput of 64-bit SVE gather loads (PR #168572)
Asher Dobrescu via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 19 03:44:48 PST 2025
Asher8118 wrote:
> > > why isn't it possible to get the correct throughput with the existing resources?
> >
> >
> > Because the pipeline used by gather loads is unit L, which has 3 resources. As a result, the computed throughput always comes out as a division by 3.
>
> if it's not possible to get that with the resources as documented in the SWOG.
I reasoned it would be a similar case as for flag setting instructions for V cores where we use [V#UnitFlg](https://github.com/llvm/llvm-project/blob/b42851b8dda8c85a277573610519e8c66e91322f/llvm/lib/Target/AArch64/AArch64SchedNeoverseV3.td#L58C1-L58C43), which is also a resource that does not appear in the SWOG.
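To illustrate the division by 3, here is a minimal sketch of how, as far as I understand it, the reciprocal throughput is derived from the scheduling model: the maximum over all consumed resources of cycles divided by the number of units. The function name and the cycle counts are hypothetical and are not taken from the SWOG or from this patch.

```python
from fractions import Fraction

def reciprocal_throughput(resource_uses):
    """Return the largest cycles/units ratio over all consumed resources.

    resource_uses: iterable of (cycles_consumed, num_units) pairs.
    This is a simplified model for illustration, not LLVM code.
    """
    return max(Fraction(cycles, units) for cycles, units in resource_uses)

# Hypothetical: the instruction occupies a 3-unit group (like unit L) for one cycle.
on_l_only = reciprocal_throughput([(1, 3)])

# Hypothetical: same as above, plus one cycle on an extra single-unit resource.
with_extra = reciprocal_throughput([(1, 3), (1, 1)])

# Throughput in instructions per cycle is the inverse of the reciprocal throughput.
print(1 / on_l_only, 1 / with_extra)
```

With only the 3-unit group, the modelled throughput is always some number of cycles divided into 3, so a value such as 1 per cycle is only reachable by occupying the group for 3 cycles or by adding a resource with a single unit, which is the approach taken for the flag-setting case above.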
> Also "Non temporal gather load, vector + scalar 32-bit element size" is 4 micro-ops whereas 64-bit element size is 2 micro-ops, that doesn't make sense.
That is odd; I think the micro-op count should be the same for both 32-bit and 64-bit element sizes. I can change that as part of this patch.
> Looking at the other neoverse cores they all use some of the vector pipes for these gathers, are we sure the SWOG is correct?
I think there are instances for the other Neoverse cores where 64-bit gather loads show incorrect throughput when compared to the SWOG, e.g. [this load](https://github.com/llvm/llvm-project/blob/b42851b8dda8c85a277573610519e8c66e91322f/llvm/test/tools/llvm-mca/AArch64/Neoverse/V3-sve-instructions.s#L4829C1-L4829C86) in V3.
https://github.com/llvm/llvm-project/pull/168572