[PATCH] D61160: [SimplifyCFG] Implement the suggested ctlz transform
James Molloy via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 26 01:44:26 PDT 2019
jmolloy requested changes to this revision.
jmolloy added a comment.
This revision now requires changes to proceed.
Thanks for implementing this! It's really cool.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5492
+/// Another cheap transform: investigate using ctlz as a key to the switch
+/// TODO: this transform would benifit from proper range analysis in SwitchToLookupTable
----------------
s/Another/A. "Another" requires the user to know some context which has now gone.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5493
+/// Another cheap transform: investigate using ctlz as a key to the switch
+/// TODO: this transform would benifit from proper range analysis in SwitchToLookupTable
+static bool ReduceSwitchRange_ctlz(SwitchInst *SI,
----------------
TODO -> FIXME
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5494
+/// TODO: this transform would benifit from proper range analysis in SwitchToLookupTable
+static bool ReduceSwitchRange_ctlz(SwitchInst *SI,
+ const DataLayout &DL,
----------------
No underscores in function names. ReduceSwitchRangeWithCtlz?
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5495
+static bool ReduceSwitchRange_ctlz(SwitchInst *SI,
+ const DataLayout &DL,
+ const TargetTransformInfo &TTI,
----------------
It looks to me like this isn't clang-formatted. Can you git-clang-format all your diffs?
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5496
+ const DataLayout &DL,
+ const TargetTransformInfo &TTI,
+ std::vector<uint64_t> *Values) {
----------------
You don't need DL or TTI here, please remove them.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5497
+ const TargetTransformInfo &TTI,
+ std::vector<uint64_t> *Values) {
+ // There is a big gotcha reagarding ctlz/cttz: when input 0, they
----------------
I'd use a reference here. It's what most of the rest of LLVM uses.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5498
+ std::vector<uint64_t> *Values) {
+ // There is a big gotcha reagarding ctlz/cttz: when input 0, they
+ // emit bit width. If not handled this can result in the transform
----------------
s/reagarding/regarding.
"when given input 0, they return the bitwidth" would be a more natural phrasing.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5500
+ // emit bit width. If not handled this can result in the transform
+ // being applied multiple times. We can either rotate bit width back to zero,
+ // by doing an xor or the most significant bit, followed by a rotate-left 1,
----------------
I would avoid talking about the solution you didn't implement. Checking for zero input (which your popcount does) is a totally reasonable and understandable thing (and much more understandable than trying to grok the xor/rotate explanation to then find it isn't used :D )
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5503
+ // which is expensive, or we can just avoid situation, by checking the range.
+ unsigned BitWidth = cast<IntegerType>(SI->getCondition()->getType())->getIntegerBitWidth();
+ if (Values->back() <= BitWidth)
----------------
getIntegerBitWidth() is defined on Type. You can avoid the cast.
unsigned BitWidth = SI->getCondition()->getType()->getIntegerBitWidth()
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5515
+ return false;
+ // TODO a single instruction can be saved on x86 by using cttz over ctlz
+ // when only ctlz results in a subtraction instruction being added
----------------
This would be a pattern for instcombine to pick up. We shouldn't have to worry about this here.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5519
+ V = APInt(BitWidth, V).countLeadingZeros();
+ // 0 will suddenly become the largest (BitWidth), so we need to sort again.
+ // Sorting is OK because we either make a transform on all numbers, or do
----------------
Isn't the order exactly reversed?
llvm::reverse(Values);
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5520
+ // 0 will suddenly become the largest (BitWidth), so we need to sort again.
+ // Sorting is OK because we either make a transform on all numbers, or do
+ // not make a transform at all.
----------------
And if you do need to sort, please use
llvm::sort(*Values)
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5553
// We organize the range to start from 0, if it is not already close.
- SmallVector<uint64_t,4> Values;
+ std::vector<uint64_t> Values;
for (auto &C : SI->cases())
----------------
Please keep using SmallVector.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5556
Values.push_back(C.getCaseValue()->getValue().getLimitedValue());
- llvm::sort(Values);
+ std::sort(Values.begin(), Values.end());
----------------
Please keep using llvm::sort.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5620
auto *Ty = cast<IntegerType>(SI->getCondition()->getType());
- Builder.SetInsertPoint(SI);
- auto *Sub = Builder.CreateSub(SI->getCondition(), ConstantInt::get(Ty, Base));
- Value *Key;
- if (Shift) {
- auto *ShiftC = ConstantInt::get(Ty, Shift);
- Function *Fshr = Intrinsic::getDeclaration(SI->getModule(), Intrinsic::fshr, Ty);
- Key = Builder.CreateCall(Fshr, {Sub, Sub, ShiftC});
- } else
- Key = Sub;
- SI->replaceUsesOfWith(SI->getCondition(), Key);
+ {
+ Builder.SetInsertPoint(SI);
----------------
Please remove the extra scope. There's no RAII objects here, so they aren't required.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5624
+ if (UseCtlz) {
+ auto *Zero = ConstantInt::get(IntegerType::get(Ty->getContext(), 1), 0);
+ Function *Ctlz = Intrinsic::getDeclaration(SI->getModule(), Intrinsic::ctlz, Ty);
----------------
> IntegerType::get(Ty->getContext(), 1)
Change to
Builder.getInt1(false);
And please change "Zero" to a more descriptive name for the second parameter to ctlz.
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5632
+ auto *ShiftC = ConstantInt::get(Ty, Shift);
+ Function *Fshr = Intrinsic::getDeclaration(SI->getModule(), Intrinsic::fshr, Ty);
+ Key = Builder.CreateCall(Fshr, {Key, Key, ShiftC});
----------------
See comments on diffbase about fshr. I don't think you need it, the rotate sequence is good enough. Is there a good reason why you think fshr is better?
================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:5645
+ Zeros = Orig->getValue().getLimitedValue();
+ auto Sub = (APInt(BitWidth, Zeros) - Base).getLimitedValue();
Case.setValue(
----------------
Why are you converting from APInt to uint64_t to APInt again? Just remove .getLimitedValue() and keep Sub.lshr() in the next line?
================
Comment at: test/Transforms/SimplifyCFG/rangereduce.ll:85
; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[A:%.*]], 97
-; CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.fshr.i32(i32 [[TMP1]], i32 [[TMP1]], i32 0)
-; CHECK-NEXT: switch i32 [[TMP2]], label [[DEF:%.*]] [
----------------
Uh, something's fishy here - why was this fshr here in the first place? This switch just needs a subtract.
================
Comment at: test/Transforms/SimplifyCFG/rangereduce.ll:121
; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[A:%.*]], 97
-; CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.fshr.i32(i32 [[TMP1]], i32 [[TMP1]], i32 0)
-; CHECK-NEXT: switch i32 [[TMP2]], label [[DEF:%.*]] [
----------------
Your previous patch is clearly bogus; this is a shift by zero.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D61160/new/
https://reviews.llvm.org/D61160
More information about the llvm-commits
mailing list