[llvm-dev] [PATCH] strlen -> strnlen optimization

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 7 10:45:12 PST 2016



Sent from my iPhone

> On Feb 6, 2016, at 11:40 PM, Michael McConville <mmcco at mykolab.com> wrote:
> 
> Mehdi Amini wrote:
>> Also it seems that your patch was inline in the email and not
>> attached? I couldn't apply it for some reason.
> 
> I see. I mostly work with projects that use inline patches (BSDs) so I
> thought that was the preferred method. I'm surprised to hear that it
> wouldn't apply because I'm using Mutt. Anyway, I'll attach in the
> future.

Oh I thon kit is probably *my* client, and since we don't use much inline diff here, I'm not use to workaround it.

> 
> A couple responses while I have time:
> 
>>> Index: lib/Transforms/Utils/SimplifyLibCalls.cpp
>>> ===================================================================
>>> --- lib/Transforms/Utils/SimplifyLibCalls.cpp    (revision 260011)
>>> +++ lib/Transforms/Utils/SimplifyLibCalls.cpp    (working copy)
>>> @@ -555,6 +555,49 @@
>>>  if (isOnlyUsedInZeroEqualityComparison(CI))
>>>    return B.CreateZExt(B.CreateLoad(Src, "strlenfirst"), CI->getType());
>>> 
>>> +  // strlen(x) < y --> strnlen(x, y+1) < y
>>> +  //
>>> +  // We ensure that there is only one user to avoid interfering with
>>> +  // CSE.
>>> +  if (!CI->hasOneUse() || !TLI->has(LibFunc::strnlen))
>>> +    return nullptr;
>> 
>> You are adding this optimization inside a function that try multiple
>> optimizations in sequence. An early return is not nice in this context
>> because it prevent from adding other transformation in the same
>> function afterwards. The cleanest way to keep early exits and avoid
>> deep nesting is to extract this transformation in a static helper
>> function.
> 
> Right - I meant to ask about this. I recognized the issue with early
> returns, but the nesting quickly got out of hand. I'll break it out into
> a helper function.
> 
> 
>>> +  User *U = CI->user_back();
>>> +  ICmpInst *IC;
>>> +  if (!(IC = dyn_cast<ICmpInst>(U)))
>>> +    return nullptr;
>> 
>> ICmpInst *IC = dyn_cast<ICmpInst>(U);
>> if (!IC)
>>   return nullptr;
> 
> Good catch, will change.
> 
>>> +  IntegerType *SizeType = DL.getIntPtrType(B.GetInsertBlock()->getContext());
>>> +  Value *LHS = IC->getOperand(0), *RHS = IC->getOperand(1);
>>> +
>>> +  ConstantInt *Con;
>>> +  if (!((Con = dyn_cast<ConstantInt>(LHS)) || (Con = dyn_cast<ConstantInt>(RHS))))
>>> +    return nullptr;
>> 
>> I think you can assume the constant to be on the right (and remove the
>> swapOperands):
>> 
>> ConstantInt *Con = dyn_cast<ConstantInt>(IC->getOperand(1));
>> if(!Con)
>>  return nullptr;
> 
> Are you saying that LLVM will always position constants on the right? If
> not, I think it'd be unfortunate to miss instances like 5 == strlen(s).
> I know that some projects specify that ordering in their style guides to
> prevent = vs. == bugs.

Yes instcombine will canonicalize constant operands to the right (IIRC).


> 
>>> +  uint64_t con_val = Con->getZExtValue();
>>> +
>>> +  if (RHS == CI)
>>> +    IC->swapOperands();
>>> +
>>> +  switch (IC->getPredicate()) {
>>> +  case ICmpInst::ICMP_EQ:
>>> +  case ICmpInst::ICMP_NE:
>>> +  case ICmpInst::ICMP_UGT:
>>> +  case ICmpInst::ICMP_UGE:
>>> +  case ICmpInst::ICMP_ULE:
>>> +  case ICmpInst::ICMP_SGT:
>>> +  case ICmpInst::ICMP_SGE:
>>> +  case ICmpInst::ICMP_SLE:
>>> +    // XXX: check for wrapping
>>> +    if (con_val == UINT64_MAX)
>>> +      return nullptr;
>> 
>> ISTM that the wrapping check assumes that "sizeof(size_t) == 8", which
>> is not always true.
> 
> I guess I was thinking only of overflow in the compiler itself.
> Specifically, overflow of that uint64_t value. Is the range of size_t
> vs. the size of str(n)len's argument not checked downstream?

I have to think about that :)

-- 
Mehdi
> 
> If not, we could also just check against UINT32_MAX. Then, it seems that
> the worst case would be a false positive when the constant is >= 2^31 -
> 1 and the target is 64-bit. In that case, it's probably ok to leave it
> as a strlen call anyway. :-)
> 
> Let me know if I'm misunderstanding.
> 
> Thanks,
> Michael


More information about the llvm-commits mailing list