[PATCH] D136722: [AArch64] Extending lowering of 'zext <Y x i8> %x to <Y x i8X>' to use tbl instructions
NILANJANA BASU via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 21 13:53:31 PST 2022
nilanjana_basu added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13703
   auto *UIToFP = dyn_cast<UIToFPInst>(I);
-  if (UIToFP &&
-      (SrcTy->getNumElements() == 8 || SrcTy->getNumElements() == 16) &&
-      SrcTy->getElementType()->isIntegerTy(8) &&
+  if (UIToFP && SrcTy->getElementType()->isIntegerTy(8) &&
       DstTy->getElementType()->isFloatTy()) {
----------------
This conversion regresses performance in some cases where several similar zext instructions appear back to back. With the previous implementation, the generated code could be folded into a more optimized instruction sequence, which is not possible with 'tbl' instructions. One example is 16xi8->16xi64: with a loop interleave count of 4, i.e. 4 back-to-back zext instructions, the number of instructions increases after lowering to tbl.
Would it be better to rule out this case in this 'if' block, or should we disallow tbl-lowering whenever multiple zext instructions of the same type appear back to back?
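To make the two alternatives concrete, here is a rough standalone sketch of what each guard might look like. It is not the code in AArch64ISelLowering.cpp: `VecShape`, `allowTblByShape`, `allowTblByRun` and the run-length threshold are hypothetical names and values chosen for illustration, and the real check would read these facts from the LLVM vector types instead.

```cpp
// Hypothetical, simplified stand-in for the element count and element width
// that the real check reads from the source and destination vector types.
struct VecShape {
  unsigned NumElts;
  unsigned EltBits;
};

// Option 1: exclude the one shape reported to regress, inside the check itself.
// Assumption: only 16 x i8 -> 16 x i64 is excluded; other shapes are untouched.
bool allowTblByShape(VecShape Src, VecShape Dst) {
  if (Src.EltBits != 8 || Dst.EltBits <= 8 || Src.NumElts != Dst.NumElts)
    return false; // not a widening zext from i8
  bool IsRegressingShape = Src.NumElts == 16 && Dst.EltBits == 64;
  return !IsRegressingShape;
}

// Option 2: refuse tbl-lowering when several zexts of the same type are
// adjacent. SameTypeRun is the number of back-to-back zexts with identical
// source and destination types (1 means the zext stands alone).
// Assumption: any run longer than 1 is rejected; the threshold is illustrative.
bool allowTblByRun(VecShape Src, VecShape Dst, unsigned SameTypeRun) {
  if (Src.EltBits != 8 || Dst.EltBits <= 8 || Src.NumElts != Dst.NumElts)
    return false; // not a widening zext from i8
  return SameTypeRun <= 1;
}
```

Option 1 is narrower and only affects the shape measured so far. Option 2 also covers other shapes that might regress under interleaving, at the cost of giving up tbl-lowering in runs where it could still be profitable.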
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136722/new/
https://reviews.llvm.org/D136722