[llvm-dev] [PATCH] strlen -> strnlen optimization

Michael McConville via llvm-dev llvm-dev at lists.llvm.org
Sat Feb 6 20:05:14 PST 2016


This addition converts strlen() calls to strnlen() when the result is
compared to a constant. For example, the following:

strlen(s) < 5

Becomes:

strnlen(s, 5) < 5

That way, we don't have to walk through the entire string. There is the
added overhead of maintaining a counter when using strnlen(), but I
thought I'd start with the general case. It may make sense to only use
this optimization with small constants. On the other hand, the impact of
the counter may be negligible in many or most cases due to
hardware-level optimizations.

I often notice the idiom of comparing a packet or buffer against a
minimum length as a sanity check. I was surprised to see that this isn't
optimized, so I thought I'd give it a try.

nlewycky on IRC seemed to think it was a good idea. I'm interested to
hear what others think.

I have little C++ experience, so there are likely improvements to be
made to my patch. I tried to adhere to the style and conventions of the
surrounding code.

This reintroduces emitStrNLen(), which was removed a couple months ago
in r253514. The only change is a getParent()->getParent() -->
getModule() conversion, which was done in emitStrLen() after
emitStrNLen() was removed.

This tests successfully for me on Ubuntu 14.04.3.

Thanks for your time,
Michael


Index: lib/Transforms/Utils/BuildLibCalls.cpp
===================================================================
--- lib/Transforms/Utils/BuildLibCalls.cpp	(revision 260011)
+++ lib/Transforms/Utils/BuildLibCalls.cpp	(working copy)
@@ -52,6 +52,28 @@
   return CI;
 }
 
+Value *llvm::emitStrNLen(Value *Ptr, Value *MaxLen, IRBuilder<> &B,
+                        const DataLayout &DL, const TargetLibraryInfo *TLI) {
+  if (!TLI->has(LibFunc::strlen))
+    return nullptr;
+
+  Module *M = B.GetInsertBlock()->getModule();
+  AttributeSet AS[2];
+  AS[0] = AttributeSet::get(M->getContext(), 1, Attribute::NoCapture);
+  Attribute::AttrKind AVs[2] = { Attribute::ReadOnly, Attribute::NoUnwind };
+  AS[1] = AttributeSet::get(M->getContext(), AttributeSet::FunctionIndex, AVs);
+
+  LLVMContext &Context = B.GetInsertBlock()->getContext();
+  Constant *StrNLen = M->getOrInsertFunction(
+      "strnlen", AttributeSet::get(M->getContext(), AS),
+      DL.getIntPtrType(Context), B.getInt8PtrTy(), DL.getIntPtrType(Context), nullptr);
+  CallInst *CI = B.CreateCall(StrNLen, {castToCStr(Ptr, B), MaxLen}, "strnlen");
+  if (const Function *F = dyn_cast<Function>(StrNLen->stripPointerCasts()))
+    CI->setCallingConv(F->getCallingConv());
+
+  return CI;
+}
+
 Value *llvm::emitStrChr(Value *Ptr, char C, IRBuilder<> &B,
                         const TargetLibraryInfo *TLI) {
   if (!TLI->has(LibFunc::strchr))
Index: lib/Transforms/Utils/SimplifyLibCalls.cpp
===================================================================
--- lib/Transforms/Utils/SimplifyLibCalls.cpp	(revision 260011)
+++ lib/Transforms/Utils/SimplifyLibCalls.cpp	(working copy)
@@ -555,6 +555,49 @@
   if (isOnlyUsedInZeroEqualityComparison(CI))
     return B.CreateZExt(B.CreateLoad(Src, "strlenfirst"), CI->getType());
 
+  // strlen(x) < y --> strnlen(x, y+1) < y
+  //
+  // We ensure that there is only one user to avoid interfering with
+  // CSE.
+  if (!CI->hasOneUse() || !TLI->has(LibFunc::strnlen))
+    return nullptr;
+  User *U = CI->user_back();
+  ICmpInst *IC;
+  if (!(IC = dyn_cast<ICmpInst>(U)))
+    return nullptr;
+  IntegerType *SizeType = DL.getIntPtrType(B.GetInsertBlock()->getContext());
+  Value *LHS = IC->getOperand(0), *RHS = IC->getOperand(1);
+
+  ConstantInt *Con;
+  if (!((Con = dyn_cast<ConstantInt>(LHS)) || (Con = dyn_cast<ConstantInt>(RHS))))
+    return nullptr;
+  uint64_t con_val = Con->getZExtValue();
+
+  if (RHS == CI)
+    IC->swapOperands();
+
+  switch (IC->getPredicate()) {
+  case ICmpInst::ICMP_EQ:
+  case ICmpInst::ICMP_NE:
+  case ICmpInst::ICMP_UGT:
+  case ICmpInst::ICMP_UGE:
+  case ICmpInst::ICMP_ULE:
+  case ICmpInst::ICMP_SGT:
+  case ICmpInst::ICMP_SGE:
+  case ICmpInst::ICMP_SLE:
+    // XXX: check for wrapping
+    if (con_val == UINT64_MAX)
+      return nullptr;
+    return emitStrNLen(Src, ConstantInt::get(SizeType, con_val + 1),
+		    B, DL, TLI);
+  case ICmpInst::ICMP_ULT:
+  case ICmpInst::ICMP_SLT:
+    return emitStrNLen(Src, Con,
+		    B, DL, TLI);
+  default:
+    return nullptr;
+  }
+
   return nullptr;
 }
 


More information about the llvm-dev mailing list