[PATCH] D128342: [LoopVectorize] Disable tail-folding when masked interleaved accesses are unavailable
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 28 08:10:44 PDT 2022
dmgreen added a comment.
In terms of MVE: a VLD2/VLD4 cannot be predicated, so in that regard we do not support "MaskedInterleavedAccesses". There is code in canTailPredicateLoop that attempts to get that right. Any other interleave group width will be emulated with a gather/scatter, though, which can happily be masked.
https://godbolt.org/z/KzvEqz439
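To make the factor distinction above concrete, here is a minimal sketch of two loops whose loads would be candidates for interleave groups of width 2 and width 3. The function names are made up for illustration, and this is not necessarily what the godbolt link shows.

```cpp
#include <cstddef>

// Hypothetical example: the two loads per iteration touch a[2*i] and
// a[2*i+1], i.e. a candidate interleave group of width 2 (VLD2-shaped).
void sumPairs(const int *a, int *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a[2 * i] + a[2 * i + 1];
}

// Hypothetical example: three loads per iteration form a candidate group of
// width 3, which has no VLD3 on MVE and so would fall back to gathers there.
void sumTriples(const int *a, int *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a[3 * i] + a[3 * i + 1] + a[3 * i + 2];
}
```

Whether the vectorizer actually forms a group for either loop depends on the cost model and target options, so treat these as shapes to experiment with rather than guaranteed outcomes.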
For SVE, my understanding is that LD2/LD3/LD4 can be predicated, and other widths (as well as current codegen, since interleaving is not yet supported) will use gather/scatter, which can be masked. In the long run, SVE targets may have MaskedInterleavedAccesses returning true.
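The MVE and SVE behaviour described above can be summarised as a small decision table. The sketch below is only an illustration: the enum and function names are hypothetical and are not LLVM APIs, and the SVE branch models what the instructions allow (the "long run" case), not the current codegen, which uses gather/scatter throughout.

```cpp
#include <cassert>

// Hypothetical names for illustration only; none of these exist in LLVM.
enum class Target { MVE, SVE };
enum class Lowering {
  PredicatedStructuredAccess, // e.g. a predicated LD2/LD3/LD4
  MaskedGatherScatter,        // emulated with a masked gather/scatter
  Unmaskable                  // only an unpredicated instruction exists
};

// How a masked (tail-folded) interleave group of the given factor could be
// lowered, following the description in this comment.
Lowering lowerMaskedInterleaveGroup(Target T, unsigned Factor) {
  assert(Factor >= 2 && "an interleave group has at least two members");
  switch (T) {
  case Target::MVE:
    // VLD2/VLD4 cannot be predicated; other factors become gather/scatter.
    return (Factor == 2 || Factor == 4) ? Lowering::Unmaskable
                                        : Lowering::MaskedGatherScatter;
  case Target::SVE:
    // LD2/LD3/LD4 can be predicated; other factors become gather/scatter.
    return Factor <= 4 ? Lowering::PredicatedStructuredAccess
                       : Lowering::MaskedGatherScatter;
  }
  return Lowering::Unmaskable;
}

// Tail folding is only an option when the group can be masked somehow.
bool canMaskInterleaveGroup(Target T, unsigned Factor) {
  return lowerMaskedInterleaveGroup(T, Factor) != Lowering::Unmaskable;
}
```

Under these assumptions, only the factor-2 and factor-4 groups on MVE block masking; every other combination has some maskable lowering.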
Is the problem that we are trying to use one variable for both Neon and SVE vectorization, where SVE prefers folding the tail and Neon needs not to?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D128342/new/
https://reviews.llvm.org/D128342