[PATCH] D113291: [WIP] Add LowerTableBasedCTZ and anable it for AARCH64 in -O3
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 5 09:15:14 PDT 2021
craig.topper added inline comments.
================
Comment at: llvm/lib/Transforms/Scalar/LowerTableBasedCTZ.cpp:229
+ uint64_t MulConst, ShiftConst;
+ // Aarch64 has i64 for getelementpointer index so this match will probably
+ // fail for other targets.
----------------
Seems like this should be a TODO or FIXME or NOTE to make it more noticeable
================
Comment at: llvm/lib/Transforms/Scalar/LowerTableBasedCTZ.cpp:264
+
+ bool DefinedForZero = ConstData->getElementAsInteger(0) == InputBits;
+
----------------
If the value in element 0 isn't InputBits, don't you still need to make the lowering produce the same value as the table would have? For the example, in the bug report the value is 0 and the suggested lowering was ctlz(x) & 0x1f.
================
Comment at: llvm/lib/Transforms/Scalar/LowerTableBasedCTZ.cpp:267
+ IRBuilder<> B(LI);
+ ConstantInt *BoolConst = DefinedForZero ? B.getTrue() : B.getFalse();
+ auto Cttz = B.CreateIntrinsic(Intrinsic::cttz, {ArgXType}, {ArgX, BoolConst});
----------------
The polarity of this seems backwards. The second argument to ctlz intrinsic is called "zero_undef". It should be true if you don't care what value is produced for 0.
================
Comment at: llvm/test/Transforms/LowerTableBasedCTZ/AARCH64/lower-table-based-ctz-basics.ll:125
+ %idxprom = zext i32 %shr to i64
+ %arrayidx = getelementptr inbounds [32 x i8], [32 x i8]* @ctz7.table, i64 0, i64 %idxprom
+ %0 = load i8, i8* %arrayidx, align 1, !tbaa !8
----------------
Why does this table name not match the function name?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113291/new/
https://reviews.llvm.org/D113291
More information about the llvm-commits
mailing list