[PATCH] D122686: [InstCombine] Fold calls to strnlen (PR46455)

Martin Sebor via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 6 09:30:26 PDT 2022


msebor added inline comments.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:747
+      return B.CreateBinaryIntrinsic(Intrinsic::umin, Res, Bound);
+    };
   }
----------------
nikic wrote:
> This is redundant: You already computed the bounded string lengths for both branches above, so there should be no need to bound the result here again.
I agree the `umin` is not necessary.  At the same time, removing it makes the code look like it's missing.  It should get folded away, but I don't have a strong preference one way or the other.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:770
+    // Fold strlen:
+    //   strlen(x) != 0 --> *x != 0
+    //   strlen(x) == 0 --> *x == 0
----------------
xbolva00 wrote:
> This is an opportunity to extend this opt more
> 
> see https://github.com/llvm/llvm-project/issues/48940
> 
Let me look into it, thanks for the pointer!


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:672
         }
-        // If the string does not have '\0', leave it to strlen to compute
-        // its length.
-        if (NullTermIdx == ~((uint64_t)0))
+        if (NullTermIdx == ~((uint64_t)0) && MaxLen > Slice.Length)
+          // If the array does not have '\0', leave it to either strlen
----------------
nikic wrote:
> msebor wrote:
> > nikic wrote:
> > > I had a bit of a hard time understanding this new special case. I'm wondering, if you have no null and use strnlen, can't we simply always return `Bound` as the result? Either the result will be Bound, or we're reading past the extent of the global and have UB. Or if we don't want to optimize the latter case we can keep the MaxLen < ArraySize check, but in either case I think mixing it with the code below is kind of confusing -- in particular the `sub` bit doesn't seem relevant?
> > That's a good question.  I was mostly trying to follow the `strlen` handling here.  But if there's no concern with the unterminated array being followed by another that does have a nul I'm okay with your suggestion.  The specific case I have in mind is an aggregate with two consecutive arrays only the second one of which is terminated, for example like so:
> > ```
> > const char a[][4] = { "1234", "56" };
> > 
> > int f (void)
> > {
> >   return strnlen (a[0], sizeof a);
> > }
> > ```
> > I see that aggregates aren't handled by the folder yet so this is only hypothetical at the moment.  But eventually I think there is an expectation to handle them and then the question of whether use cases like the above should be supported will need to be answered.
> I don't think getConstantDataArrayInfo() could return true for such a case, it needs to return the string up to the end of the global. Though now that I look at the implementation, it does support an offset into the global, so if the GEP has a negative index it could move the start of the search before a null byte, and we'd get something less than the bound in that case.
> 
> I'm having a hard time convincing myself of the correctness of the code as implemented -- I'd suggest to put the entire dynamic GEP handling behind a `!Bound` check for the initial patch and revisit this afterwards.
Okay, let me break up the patch then and resubmit.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:710
+          // and let the caller simplify that better.
+          return nullptr;
+
----------------
nikic wrote:
> msebor wrote:
> > nikic wrote:
> > > If we moved handling of the isOnlyUsedInZeroEqualityComparison() check in optimizeStringLength() before the minStringLength() handling, could we drop the special check here?
> > Yes, I think we can.  It doesn't seem like there are any cases when `minStringLength()` could fold away the memory load.
> Drop the check now?
Right, will do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122686/new/

https://reviews.llvm.org/D122686



More information about the llvm-commits mailing list