[PATCH] D122686: [InstCombine] Fold calls to strnlen (PR46455)

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 1 08:26:58 PDT 2022


nikic added a comment.

In D122686#3420785 <https://reviews.llvm.org/D122686#3420785>, @msebor wrote:

> Thanks for the review!  Let me upload a new diff with the suggested changes.  I'm not sure what the process is for submitting the baseline tests: do I need to open a separate review for those?  Likewise for the `nonnull` attribute for `strnlen`?  I'm also not sure I have commit rights yet so if/when any of this is approved someone might need to commit the changes for me.

For baseline tests, you can just commit them directly. As you don't have commit access yet, I went ahead and committed them for you. But otherwise yes, the usual pattern is to create a patch stack (with "Depends on DNNN") where each patch adds one transform. That makes it easier to review and bisect problems later. (Though there may be some reasonable disagreement on how fine-grained changes should be...)

> Finally, the build bot shows unit test failures that I'm not sure I understand yet and could use some guidance with.  Are ASan test timeouts expected?  Any idea about the BOLT failures?

Unfortunately the pre-merge checks have been pretty flaky lately. If the test times out, you can generally assume it's a false positive. Generally, if it doesn't look obviously related to the area you changed, it's probably a false positive.



================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:747
+      return B.CreateBinaryIntrinsic(Intrinsic::umin, Res, Bound);
+    };
   }
----------------
This is redundant: You already computed the bounded string lengths for both branches above, so there should be no need to bound the result here again.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:672
         }
-        // If the string does not have '\0', leave it to strlen to compute
-        // its length.
-        if (NullTermIdx == ~((uint64_t)0))
+        if (NullTermIdx == ~((uint64_t)0) && MaxLen > Slice.Length)
+          // If the array does not have '\0', leave it to either strlen
----------------
msebor wrote:
> nikic wrote:
> > I had a bit of a hard time understanding this new special case. I'm wondering, if you have no null and use strnlen, can't we simply always return `Bound` as the result? Either the result will be Bound, or we're reading past the extent of the global and have UB. Or if we don't want to optimize the latter case we can keep the MaxLen < ArraySize check, but in either case I think mixing it with the code below is kind of confusing -- in particular the `sub` bit doesn't seem relevant?
> That's a good question.  I was mostly trying to follow the `strlen` handling here.  But if there's no concern with the unterminated array being followed by another that does have a nul I'm okay with your suggestion.  The specific case I have in mind is an aggregate with two consecutive arrays only the second one of which is terminated, for example like so:
> ```
> const char a[][4] = { "1234", "56" };
> 
> int f (void)
> {
>   return strnlen (a[0], sizeof a);
> }
> ```
> I see that aggregates aren't handled by the folder yet so this is only hypothetical at the moment.  But eventually I think there is an expectation to handle them and then the question of whether use cases like the above should be supported will need to be answered.
I don't think getConstantDataArrayInfo() could return true for such a case, it needs to return the string up to the end of the global. Though now that I look at the implementation, it does support an offset into the global, so if the GEP has a negative index it could move the start of the search before a null byte, and we'd get something less than the bound in that case.

I'm having a hard time convincing myself of the correctness of the code as implemented -- I'd suggest to put the entire dynamic GEP handling behind a `!Bound` check for the initial patch and revisit this afterwards.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:710
+          // and let the caller simplify that better.
+          return nullptr;
+
----------------
msebor wrote:
> nikic wrote:
> > If we moved handling of the isOnlyUsedInZeroEqualityComparison() check in optimizeStringLength() before the minStringLength() handling, could we drop the special check here?
> Yes, I think we can.  It doesn't seem like there are any cases when `minStringLength()` could fold away the memory load.
Drop the check now?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122686/new/

https://reviews.llvm.org/D122686



More information about the llvm-commits mailing list