[PATCH] D120880: [AArch64] Insert subvector costs
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 22 02:27:17 PDT 2022
dmgreen added a comment.
In D120880#3363347 <https://reviews.llvm.org/D120880#3363347>, @labrinea wrote:
> Do we care about vector alignment? I am just looking at the x86 code, which does. Also, does this change apply to subvector extract too?
By vector alignment - do you mean the alignment of the subvector inserted into the larger vector? If so - yep. We can insert aligned elements (a v4i8 inserted into a v16i8 is just an f32 lane mov providing the lane % 4 == 0), but not unaligned values.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2721
+ TLI->getTypeLegalizationCost(DL, SubTp);
+ if (SubLT.second.isVector()) {
+ int NumSubElts = SubLT.second.getVectorNumElements();
----------------
sdesmalen wrote:
> I think this code silently assumes fixed-width == NEON, so I think it's worth adding a check that vector type being inserted into is less than 128bits. (and something similar for the value of the index being within a certain range). For fixed-length SVE vectors where wider vectors are legal, these ranges may go beyond 128bits.
>
> It would also be useful to have a test for that case.
It was intending to work on larger-than-legal types. As in a v16i8 inserted (aligned) into a v32i8 isn't really an insert after all, it's just legalized to being a different register, and should be cheap as a result. llvm shuffles can always be larger than legal types. I can add a check that the legal type is at most 128bits.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120880/new/
https://reviews.llvm.org/D120880
More information about the llvm-commits
mailing list