[PATCH] D155299: [AArch64][SVE2] Combine add+lsr to rshrnb for stores
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 17 00:43:51 PDT 2023
david-arm added a comment.
This is a nice optimisation @MattDevereau, thanks! There is another case we could support: loops like this one, where the store doesn't come straight after the shift:
  void foo(unsigned short *dest, unsigned short *src, long n) {
    for (long i = 0; i < n; i++)
      dest[i] += ((src[i] + 32) >> 6);
  }
In this case the IR sequence is add, lshr, trunc, since the truncate doesn't get absorbed into the store. Maybe it's worth seeing whether you can reuse your code in `tryCombineStoredNarrowShift` for this case too?
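To make the equivalence concrete, here is a minimal scalar sketch of why the add, lshr, trunc sequence from the loop above matches a rounding shift right narrow. It is only an illustration under assumptions, not the code in the patch: the helper names are hypothetical, the model covers a single lane, and it assumes the add is performed on a value widened to 32 bits so that it cannot wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-lane model of a rounding shift right narrow:
 * add the rounding constant 1 << (shift - 1), shift right, then
 * narrow to 16 bits. Assumes 1 <= shift <= 16 and that x is small
 * enough for the add not to wrap in 32 bits. */
uint16_t rounding_shift_right_narrow(uint32_t x, unsigned shift) {
    uint32_t rounding = 1u << (shift - 1);
    return (uint16_t)((x + rounding) >> shift);
}

/* The add, lshr, trunc sequence for (src[i] + 32) >> 6, written out
 * step by step on a widened value (an assumption of this sketch). */
uint16_t add_lsr_trunc(uint16_t s) {
    uint32_t widened = s;            /* zero-extend the 16-bit input */
    uint32_t added = widened + 32u;  /* add: 32 == 1 << (6 - 1)      */
    uint32_t shifted = added >> 6;   /* lshr by 6                    */
    return (uint16_t)shifted;        /* trunc back to 16 bits        */
}

/* Exhaustively compare both forms over every 16-bit input. */
void self_check(void) {
    for (uint32_t s = 0; s <= UINT16_MAX; s++)
        assert(add_lsr_trunc((uint16_t)s) ==
               rounding_shift_right_narrow(s, 6));
}
```

The key point is that the added constant 32 equals 1 << (6 - 1), which is exactly the rounding term for a shift by 6, so the pattern is recognisable whether or not a store directly consumes the truncated value.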
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155299/new/
https://reviews.llvm.org/D155299