[PATCH] D154725: [AggressiveInstCombine] Fold strcmp for short string literals

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 12 12:43:23 PDT 2023


nikic requested changes to this revision.
nikic added a comment.
This revision now requires changes to proceed.

Generally on board with the direction of doing this in AggressiveInstCombine.



================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:884
+  if (Func != LibFunc_sqrt && Func != LibFunc_sqrtf && Func != LibFunc_sqrtl)
+    return false;
+
----------------
Aren't these conditions already checked by the caller now?


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:920
+
+  /// Trivial cases are optimized during inst combine
+  if (Str1P == Str2P)
----------------
`//` instead of `///`.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:928
+
+  bool Match = !HasStr1 && HasStr2 && Str2.size() == 1;
+  if (!Match)
----------------
We should also handle constant string on the LHS.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:942
+  // dst = phi(v0, v1)
+  //
+
----------------
As mentioned on the previous review, I believe you need to require that the strcmp result is only used in a comparison with zero. The exact return value of strcmp is unspecified, and we should not optimize to a result that deviates from the used libc implementation.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:949
+  Value *Str1FirstCharacterValue =
+      B.CreateIntCast(B.CreateLoad(B.getInt8Ty(), Str1P), B.getInt32Ty(), true);
+  Value *Str2FirstCharacterValue = ConstantInt::get(
----------------
Why the use of sign extension? The C standard says the comparison is performed on unsigned char, not signed char:

> The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp
is determined by the sign of the difference between the values of the first pair of characters (both
interpreted as unsigned char) that differ in the objects being compared.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:992
+  if (!IsCallingConvC)
+    return false;
+
----------------
Why is this relevant?


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:1003
+  if (CI->isMustTailCall() || CI->isNoTailCall())
+    return false;
+
----------------
Why is this relevant?


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:1008
+
+  IRBuilder<> Builder(I.getParent());
+
----------------
This is only used by expandStrcmp(), I would sink it into there.


================
Comment at: llvm/test/Transforms/AggressiveInstCombine/strcmp.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; TODO: Test that ...
+; RUN: opt < %s -passes=aggressive-instcombine -S | FileCheck %s
----------------
TODO (If you write an extra comment, put it below RUN, not above.)


================
Comment at: llvm/test/Transforms/AggressiveInstCombine/strcmp.ll:51
+  ret i32 %call
+}
----------------
Should also test:
 * Comparison with empty string (not folded, because left to SLC).
 * Comparison with two known strings (not folded, because left to SLC).
 * Comparison with two-char string (not folded for now).
 * Comparison with known string on the LHS rather than RHS.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154725/new/

https://reviews.llvm.org/D154725



More information about the llvm-commits mailing list