[PATCH] D123631: [InstCombine] Fold memrchr calls with sequences of identical bytes.
Martin Sebor via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 6 16:47:11 PDT 2022
msebor added inline comments.
================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:962
+ // Truncate the string to search at most EndOff characters.
+ Str = Str.substr(0, EndOff);
+ if (Str.find_first_not_of(Str[0]) != StringRef::npos)
----------------
nikic wrote:
> Use the helper function that avoids the 32-bit truncation issue?
The issue can't happen here because `EndOff` is either restricted to at most `Str.size()` earlier on, or, for variable offsets, left initialized to `UINT64_MAX` and then implicitly converted to `SIZE_MAX` AKA `std::npos` when passed to `substr`.
================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:970
+ // AKA
+ // *S == C ? S + N - 1 : null.
+ Type *SizeTy = Size->getType();
----------------
nikic wrote:
> I don't get the AKA part of this comment.
When `N` is constant it's not zero at this point (constants zero and one are handled above). It's might be a vestige of the constant zero handling being done later in one of the early patches. Let me remove it.
================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:977
+ Value *CEqS0 = B.CreateICmpEQ(ConstantInt::get(Int8Ty, Str[0]), CharVal);
+ Value *And = B.CreateAnd(NNeZ, CEqS0);
+ Value *SizeM1 = B.CreateSub(Size, ConstantInt::get(SizeTy, 1));
----------------
nikic wrote:
> I think this should be using CreateLogicalAnd, in case `CharVal` is poison (unless we want to argue that this memrchr argument cannot be poison, even for a zero bound).
I don't have enough experience with `poison` to understand how it makes a difference but sure, `CreateLogicalAnd` sounds fine.
(Since `CharVal` is treated as an `unsigned char` by these functions, any value should be allowed here, including indeterminate/uninitialized. What exactly that means in practical C terms has been debated but the official albeit sometimes contested position is that each evaluation of an indeterminate `unsigned char` might yield a different result).
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123631/new/
https://reviews.llvm.org/D123631
More information about the llvm-commits
mailing list