[PATCH] D122686: [InstCombine] Fold calls to strnlen (PR46455)

Martin Sebor via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 31 16:18:28 PDT 2022


msebor added a comment.

Thanks for the review!  Let me upload a new diff with the suggested changes.  I'm not sure what the process is for submitting the baseline tests: do I need to open a separate review for those?  Likewise for the `nonnull` attribute for `strnlen`?  I'm also not sure I have commit rights yet so if/when any of this is approved someone might need to commit the changes for me.

Finally, the build bot shows unit test failures that I'm not sure I understand yet and could use some guidance with.  Are ASan test timeouts expected?  Any idea about the BOLT failures?



================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:644
+      return B.CreateBinaryIntrinsic(Intrinsic::umin, Min, Bound);
+    };
+  }
----------------
nikic wrote:
> It would be fine to just emit the umin unconditionally here and let it fold away if it happens to be constant, something like:
> ```
> Value *Min = ConstantInt::get(CI->getType(), Len - 1);
> return [&]() {
>   return Bound ? B.CreateBinaryIntrinsic(Intrinsic::umin, Min, Bound) : Min;
> };
> ```
Done, thanks.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:672
         }
-        // If the string does not have '\0', leave it to strlen to compute
-        // its length.
-        if (NullTermIdx == ~((uint64_t)0))
+        if (NullTermIdx == ~((uint64_t)0) && MaxLen > Slice.Length)
+          // If the array does not have '\0', leave it to either strlen
----------------
nikic wrote:
> I had a bit of a hard time understanding this new special case. I'm wondering, if you have no null and use strnlen, can't we simply always return `Bound` as the result? Either the result will be Bound, or we're reading past the extent of the global and have UB. Or if we don't want to optimize the latter case we can keep the MaxLen < ArraySize check, but in either case I think mixing it with the code below is kind of confusing -- in particular the `sub` bit doesn't seem relevant?
That's a good question.  I was mostly trying to follow the `strlen` handling here.  But if there's no concern with the unterminated array being followed by another that does have a nul I'm okay with your suggestion.  The specific case I have in mind is an aggregate with two consecutive arrays only the second one of which is terminated, for example like so:
```
const char a[][4] = { "1234", "56" };

int f (void)
{
  return strnlen (a[0], sizeof a);
}
```
I see that aggregates aren't handled by the folder yet so this is only hypothetical at the moment.  But eventually I think there is an expectation to handle them and then the question of whether use cases like the above should be supported will need to be answered.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:710
+          // and let the caller simplify that better.
+          return nullptr;
+
----------------
nikic wrote:
> If we moved handling of the isOnlyUsedInZeroEqualityComparison() check in optimizeStringLength() before the minStringLength() handling, could we drop the special check here?
Yes, I think we can.  It doesn't seem like there are any cases when `minStringLength()` could fold away the memory load.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:768
     }
   }
 
----------------
nikic wrote:
> Might be more elegant with `m_ConstantInt`, something like...
> ```
> uint64_t MaxLen;
> if (!Bound || !match(Bound, m_ConstantInt(MaxLen))
>   MaxLen = UINT64_MAX;
> 
> if (MaxLen == 0)
>   return ConstantInt::get(CI->getType(), 0);
> ```
Done, thanks.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:777
+  if (!isOnlyUsedInZeroEqualityComparison(CI) ||
+      (Bound && !isKnownNonZero(Bound, DL) && MaxLen == UINT64_MAX))
+    // Avoid folding strnlen(x, N) == 0 unless N is known to be nonzero.
----------------
nikic wrote:
> Why does this need the `MaxLen == UINT64_MAX` check?
I've simplified/changed the conditional while moving it earlier.


================
Comment at: llvm/test/Transforms/InstCombine/strlen-4.ll:4
+;
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
----------------
nikic wrote:
> Please use llvm/utils/update_test_checks.py to generate InstCombine test checks.
I have run the script to update the assertions.


================
Comment at: llvm/test/Transforms/InstCombine/strnlen-6.ll:53
+; CHECK-NEXT:    %ptr = load i8*, i8** @ecp
+; CHECK-NEXT     %len = call i64 @strnlen(i8* oundef nonnull dereferenceable(1) %ptr, i64 3)
+; CHECK-NEXT     ret i64 %len
----------------
nikic wrote:
> This test output looks wrong?
Yes, there are a couple of typos here, one keeping the other from causing the test to fail.  Good catch!  There were a few others just like that in a couple of tests. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122686/new/

https://reviews.llvm.org/D122686



More information about the llvm-commits mailing list