[llvm-dev] [PATCH] strlen -> strnlen optimization
Mehdi Amini via llvm-dev
llvm-dev at lists.llvm.org
Sat Feb 6 22:47:05 PST 2016
Hi,
(llvm-dev to BCC)
Thanks for this!
Patches are supposed to be sent to llvm-commits and not llvm-dev (http://llvm.org/docs/DeveloperPolicy.html ).
Also it seems that your patch was inline in the email and not attached? I couldn't apply it for some reason.
First off, you're missing test cases for each ICMP possibility, and one for the wrapping case (see test/Transforms/InstCombine/simplify-libcalls.ll if you need examples).
Also, see other comments inline below.
> On Feb 6, 2016, at 8:05 PM, Michael McConville via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> This addition converts strlen() calls to strnlen() when the result is
> compared to a constant. For example, the following:
>
> strlen(s) < 5
>
> Becomes:
>
> strnlen(s, 5) < 5
>
> That way, we don't have to walk through the entire string. There is the
> added overhead of maintaining a counter when using strnlen(), but I
> thought I'd start with the general case. It may make sense to only use
> this optimization with small constants. On the other hand, the impact of
> the counter may be negligible in many or most cases due to
> hardware-level optimizations.
>
> I often notice the idiom of comparing a packet or buffer against a
> minimum length as a sanity check. I was surprised to see that this isn't
> optimized, so I thought I'd give it a try.
>
> nlewycky on IRC seemed to think it was a good idea. I'm interested to
> hear what others think.
>
> I have little C++ experience, so there are likely improvements to be
> made to my patch. I tried to adhere to the style and conventions of the
> surrounding code.
>
> This reintroduces emitStrNLen(), which was removed a couple months ago
> in r253514. The only change is a getParent()->getParent() -->
> getModule() conversion, which was done in emitStrLen() after
> emitStrNLen() was removed.
>
> This tests successfully for me on Ubuntu 14.04.3.
>
> Thanks for your time,
> Michael
>
>
> Index: lib/Transforms/Utils/BuildLibCalls.cpp
> ===================================================================
> --- lib/Transforms/Utils/BuildLibCalls.cpp (revision 260011)
> +++ lib/Transforms/Utils/BuildLibCalls.cpp (working copy)
> @@ -52,6 +52,28 @@
> return CI;
> }
>
> +Value *llvm::emitStrNLen(Value *Ptr, Value *MaxLen, IRBuilder<> &B,
> + const DataLayout &DL, const TargetLibraryInfo *TLI) {
Interestingly it was removed from the implementation but not from the header...
> + if (!TLI->has(LibFunc::strlen))
> + return nullptr;
> +
> + Module *M = B.GetInsertBlock()->getModule();
> + AttributeSet AS[2];
> + AS[0] = AttributeSet::get(M->getContext(), 1, Attribute::NoCapture);
> + Attribute::AttrKind AVs[2] = { Attribute::ReadOnly, Attribute::NoUnwind };
> + AS[1] = AttributeSet::get(M->getContext(), AttributeSet::FunctionIndex, AVs);
> +
> + LLVMContext &Context = B.GetInsertBlock()->getContext();
> + Constant *StrNLen = M->getOrInsertFunction(
> + "strnlen", AttributeSet::get(M->getContext(), AS),
> + DL.getIntPtrType(Context), B.getInt8PtrTy(), DL.getIntPtrType(Context), nullptr);
> + CallInst *CI = B.CreateCall(StrNLen, {castToCStr(Ptr, B), MaxLen}, "strnlen");
> + if (const Function *F = dyn_cast<Function>(StrNLen->stripPointerCasts()))
> + CI->setCallingConv(F->getCallingConv());
> +
> + return CI;
> +}
> +
> Value *llvm::emitStrChr(Value *Ptr, char C, IRBuilder<> &B,
> const TargetLibraryInfo *TLI) {
> if (!TLI->has(LibFunc::strchr))
> Index: lib/Transforms/Utils/SimplifyLibCalls.cpp
> ===================================================================
> --- lib/Transforms/Utils/SimplifyLibCalls.cpp (revision 260011)
> +++ lib/Transforms/Utils/SimplifyLibCalls.cpp (working copy)
> @@ -555,6 +555,49 @@
> if (isOnlyUsedInZeroEqualityComparison(CI))
> return B.CreateZExt(B.CreateLoad(Src, "strlenfirst"), CI->getType());
>
> + // strlen(x) < y --> strnlen(x, y+1) < y
> + //
> + // We ensure that there is only one user to avoid interfering with
> + // CSE.
> + if (!CI->hasOneUse() || !TLI->has(LibFunc::strnlen))
> + return nullptr;
You are adding this optimization inside a function that try multiple optimizations in sequence. An early return is not nice in this context because it prevent from adding other transformation in the same function afterwards. The cleanest way to keep early exits and avoid deep nesting is to extract this transformation in a static helper function.
> + User *U = CI->user_back();
> + ICmpInst *IC;
> + if (!(IC = dyn_cast<ICmpInst>(U)))
> + return nullptr;
ICmpInst *IC = dyn_cast<ICmpInst>(U);
if (!IC)
return nullptr;
> + IntegerType *SizeType = DL.getIntPtrType(B.GetInsertBlock()->getContext());
> + Value *LHS = IC->getOperand(0), *RHS = IC->getOperand(1);
> +
> + ConstantInt *Con;
> + if (!((Con = dyn_cast<ConstantInt>(LHS)) || (Con = dyn_cast<ConstantInt>(RHS))))
> + return nullptr;
I think you can assume the constant to be on the right (and remove the swapOperands):
ConstantInt *Con = dyn_cast<ConstantInt>(IC->getOperand(1));
if(!Con)
return nullptr;
> + uint64_t con_val = Con->getZExtValue();
> +
> + if (RHS == CI)
> + IC->swapOperands();
> +
> + switch (IC->getPredicate()) {
> + case ICmpInst::ICMP_EQ:
> + case ICmpInst::ICMP_NE:
> + case ICmpInst::ICMP_UGT:
> + case ICmpInst::ICMP_UGE:
> + case ICmpInst::ICMP_ULE:
> + case ICmpInst::ICMP_SGT:
> + case ICmpInst::ICMP_SGE:
> + case ICmpInst::ICMP_SLE:
> + // XXX: check for wrapping
> + if (con_val == UINT64_MAX)
> + return nullptr;
ISTM that the wrapping check assumes that "sizeof(size_t) == 8", which is not always true.
Best,
Mehdi
> + return emitStrNLen(Src, ConstantInt::get(SizeType, con_val + 1),
> + B, DL, TLI);
> + case ICmpInst::ICMP_ULT:
> + case ICmpInst::ICMP_SLT:
> + return emitStrNLen(Src, Con,
> + B, DL, TLI);
> + default:
> + return nullptr;
> + }
> +
> return nullptr;
> }
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
More information about the llvm-dev
mailing list