[PATCH] D126515: [InstCombine] Fold memchr of sequences of same characters

Martin Sebor via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 31 15:10:35 PDT 2022


msebor added inline comments.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:1055
+      EndOff = Str.size();
+  }
+
----------------
nikic wrote:
> As far as I can tell EndOff doesn't actually get used later, it's only used for the substr(), so I'm not sure why this code is changing rather than only being moved.
> 
> I think in https://reviews.llvm.org/D123472 the wrong version of the patch ended up being committed, I remember there being a static substr() function for this purpose, but it doesn't seem to be present on main.
Yes, I must have committed a prior version of the patch.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:1066
+    //   N != 0 ? (S[0] == C ? S : S[Pos] == C ? S + Pos : null) : null
+    // where       ^Sel2    and    ^Sel1 are denoted above.
+    // The latter makes it also possible to fold strchr() calls with
----------------
nikic wrote:
> I don't think this is correct for the case where `N` is variable. Let's say we have `S = "12"`, `C = 2` and `N = 1` (known only at runtime). The correct result for this is `null`. I believe your code would return `S + 1`.
> 
> There doesn't seem to be test coverage for this case (variable N with two runs -- there's only one with three runs).
Good catch!  I've added the missing conditional and a test case for it.

(For what it's worth, plain C runtime tests would help spot these mistakes sooner.)


================
Comment at: llvm/test/Transforms/InstCombine/memchr-6.ll:19
+; CHECK-NEXT:    [[RET:%.*]] = call i8* @memchr(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([5 x i8], [5 x i8]* @a00000, i64 0, i64 0), i32 [[C:%.*]], i64 5)
+; CHECK-NEXT:    ret i8* [[RET]]
+;
----------------
nikic wrote:
> Why does this one not fold?
Because `a00000` has no initializer `getConstantDataArrayInfo` returns a null `Slice.Array` that `getConstantStringInfo` then fails for.  I've added a comment to the test.

I thought I'd handled it D125114 but apparently it needs another enhancement.  I can take care of it in a followup.


================
Comment at: llvm/test/Transforms/InstCombine/memchr-6.ll:53
+; CHECK-NEXT:    [[DOTNOT:%.*]] = icmp eq i64 [[N:%.*]], 0
+; CHECK-NEXT:    [[MEMCHR_SEL3:%.*]] = select i1 [[DOTNOT]], i8* null, i8* [[MEMCHR_SEL2]]
+; CHECK-NEXT:    ret i8* [[MEMCHR_SEL3]]
----------------
nikic wrote:
> Unrelated: There's a canonicalization opportunity here to convert `C1 ? X : (C2 ? X : Y) to `(C1 ? true : C2) ? X : Y` (or vice versa, but I'd consider logical or more canonical here).
I've reworked the conditional a bit and I'm not sure I see this opportunity there anymore but let me know if I missed it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126515/new/

https://reviews.llvm.org/D126515



More information about the llvm-commits mailing list