[all-commits] [llvm/llvm-project] 81a11d: [CGP, AArch64] Replace zexts with shuffle that can ...
Florian Hahn via All-commits
all-commits at lists.llvm.org
Thu Sep 15 11:18:58 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 81a11da762577b000e615a874dc56eb927ff2c1b
https://github.com/llvm/llvm-project/commit/81a11da762577b000e615a874dc56eb927ff2c1b
Author: Florian Hahn <flo at fhahn.com>
Date: 2022-09-15 (Thu, 15 Sep 2022)
Changed paths:
M llvm/include/llvm/CodeGen/TargetLowering.h
M llvm/lib/CodeGen/CodeGenPrepare.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.h
M llvm/test/CodeGen/AArch64/vselect-ext.ll
M llvm/test/CodeGen/AArch64/zext-to-tbl.ll
M llvm/test/Transforms/CodeGenPrepare/AArch64/zext-to-shuffle.ll
Log Message:
-----------
[CGP,AArch64] Replace zexts with shuffle that can be lowered using tbl.
This patch extends CodeGenPrepare to lower zext v16i8 -> v16i32 in loops
using a wide shuffle creating a v64i8 vector, selecting groups of 3
zero elements and an element from the input.
This is profitable on AArch64 where such shuffles can be lowered to tbl
instructions, but only in loops, because it requires materializing 4
masks, which can be done in the loop preheader.
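To make the shuffle above concrete, here is a minimal sketch that emulates it on the host. It is not code from the patch: the helper names (buildZExtShuffleMask, shuffleBytes, zextViaShuffle) are hypothetical, and it assumes a little-endian lane layout where the second shuffle operand is an all-zero vector.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these names come from
// the patch.

// Build the 64-entry byte mask for zext <16 x i8> -> <16 x i32>, assuming a
// little-endian layout. Indices 0..15 select a byte of the input; index 16
// selects the first element of an all-zero second operand.
std::array<uint8_t, 64> buildZExtShuffleMask() {
  std::array<uint8_t, 64> Mask{};
  for (unsigned Lane = 0; Lane < 16; ++Lane) {
    Mask[4 * Lane] = static_cast<uint8_t>(Lane); // low byte: input element
    for (unsigned B = 1; B < 4; ++B)
      Mask[4 * Lane + B] = 16;                   // upper 3 bytes: zero
  }
  return Mask;
}

// Emulate a two-operand byte shuffle whose second operand is all zeros.
std::array<uint8_t, 64> shuffleBytes(const std::array<uint8_t, 16> &In,
                                     const std::array<uint8_t, 64> &Mask) {
  std::array<uint8_t, 64> Out{};
  for (unsigned I = 0; I < 64; ++I) {
    assert(Mask[I] < 32 && "index must address one of the two operands");
    Out[I] = Mask[I] < 16 ? In[Mask[I]] : 0;
  }
  return Out;
}

// Reassemble the 64 shuffled bytes as 16 little-endian 32-bit values.
std::array<uint32_t, 16> zextViaShuffle(const std::array<uint8_t, 16> &In) {
  const std::array<uint8_t, 64> Bytes = shuffleBytes(In, buildZExtShuffleMask());
  std::array<uint32_t, 16> Out{};
  for (unsigned Lane = 0; Lane < 16; ++Lane)
    for (unsigned B = 0; B < 4; ++B)
      Out[Lane] |= static_cast<uint32_t>(Bytes[4 * Lane + B]) << (8 * B);
  return Out;
}
```

Each result lane equals the zero-extended input byte. The 64-entry mask splits naturally into four 16-byte chunks, one per 128-bit result register, which is presumably where the 4 masks mentioned above come from.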
This is the only reason the transform is part of CGP. If there's a
better alternative I missed, please let me know. The same goes for the
shouldReplaceZExtWithShuffle hook which guards this. I am not sure if
this transform will be beneficial on other targets, but it seems like
there is no other convenient way.
This improves the generated code for loops like the one below in
combination with D96522.
  int foo(uint8_t *p, int N) {
    unsigned long long sum = 0;
    for (int i = 0; i < N ; i++, p++) {
      unsigned int v = *p;
      sum += (v < 127) ? v : 256 - v;
    }
    return sum;
  }
https://clang.godbolt.org/z/Wco866MjY
Reviewed By: t.p.northover
Differential Revision: https://reviews.llvm.org/D120571