[PATCH] D30680: new method TargetTransformInfo::supportsVectorElementLoadStore() for LoopVectorizer
Adam Nemet via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 4 21:11:41 PDT 2017
anemet added a comment.
In https://reviews.llvm.org/D30680#713835, @jonpa wrote:
> In https://reviews.llvm.org/D30680#713268, @anemet wrote:
>
> > Sorry about the delay on this, but I was working on something related for ARM that may benefit from this as well. What I need for ARM is something that can communicate to the SLPVectorizer that load-pair and store-pair (of two registers) are efficiently supported on the target. I am wondering if we can combine the two things if your new hook would take the type and the vectorization width.
> >
> > What do you think?
>
>
> Is this also in the context of scalarizing a load / store?
>
> For SystemZ, a scalarized memory access will have to do VF memory operations, but there is no need to extract or insert any of the data elements, as there are vector element load/store instructions.
We have something like this on ARM too. ld1 can load any element of a vector (e.g. ld1.s {v1}[1], [x1] loads lane 1 of vector reg v1) and st1 can store any element. That said, ld1 is still a partial write of the vector register, so in terms of performance it's worse than a regular load, which is a full write. I think that modeling its cost as a load + insert (for a non-zero lane) is fairly accurate. Doesn't this match the situation on SystemZ?
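To make the "load + insert (for a non-zero lane)" model concrete, here is a minimal standalone sketch. It is not LLVM code: LaneCosts, laneLoadCost and scalarizedLoadCost are hypothetical names, and the unit costs of 1 are placeholders rather than values from any real target cost table.

```cpp
// Hypothetical unit costs, chosen only for illustration.
struct LaneCosts {
  unsigned ScalarLoad = 1;    // assumed cost of one element load
  unsigned InsertElement = 1; // assumed cost of inserting into a vector lane
};

// Cost of loading a single element into lane `Lane` of a vector register.
// Lane 0 is modeled as a plain load; any other lane is charged an extra
// insert to account for the partial write of the register.
unsigned laneLoadCost(unsigned Lane, const LaneCosts &C) {
  return Lane == 0 ? C.ScalarLoad : C.ScalarLoad + C.InsertElement;
}

// Cost of a scalarized vector load of VF elements under this model:
// VF element loads plus (VF - 1) inserts.
unsigned scalarizedLoadCost(unsigned VF, const LaneCosts &C) {
  unsigned Total = 0;
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    Total += laneLoadCost(Lane, C);
  return Total;
}
```

With these placeholder costs, a scalarized load at VF = 4 comes out as 4 loads + 3 inserts. A target where element loads really need no insert, as described for SystemZ above, would correspond to setting InsertElement to 0 in this sketch.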
https://reviews.llvm.org/D30680