[PATCH] D109289: [InstCombine] snprintf(NULL, 0, "%s", str) to strlen(str)

Bjorn Pettersson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 1 08:44:55 PDT 2023


bjope added inline comments.
Herald added a project: All.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:2597
+        if (auto *V = emitStrLen(CI->getArgOperand(3), B, DL, TLI)) {
+          if (CI->getType() != V->getType())
+            V = B.CreateTrunc(V, CI->getType());
----------------
xbolva00 wrote:
> bjope wrote:
> > bjope wrote:
> > > bjope wrote:
> > > > xbolva00 wrote:
> > > > > snprintf returns int, strlen returns size_t so...  trunc should be fine?
> > > > Isn't the transform only valid if the number of bits in CI is greater than the number of bits in V?
> > > > I.e. the unsigned value returned by strlen must fit into the non-negative part of the signed destination type. And given that constraint we always need the trunc.
> > > Well, given that constraint we always need a zext, not trunc.
> > If we consider the case when the value returned by strlen as size_t would be out-of-range for the non-negative range of int, then I guess snprintf should return a negative value? Or is the result undefined (I haven't seen any documentation saying anything about such cases for snprintf, while for strlen it says that the result in undefined if no null-termination is found)?
> > 
> > Either way it would be ok to do the transformation when sizeof(size_t)==sizeof(int), because that would give a negative value if the strlen result is out-of-range for the int.
> > 
> > But if sizeof(size_t)>sizeof(int), then maybe we need to saturate the value instead of doing a trunc, just to be safe?
> > 
> zext? We need change from i64 (from newly added strlen) to i32 (return type of snprintf)
I was looking through old phabricator reviews still open, and tried to understand this conversation :-)

(And I think that I might have mixed up which function that returned int and size_t in some comments.)

What I've found out since last time is that, as described here https://www.austingroupbugs.net/view.php?id=761, the C standard (C11?) seems to have clarified that behavior is undefined if "The number of characters or wide characters transmitted by a formatted output function (or written to an array, or that would have been written to an array) is greater than INT_MAX (7.21.6.1, 7.29.2.1).".

And IIUC the snprintf would never return a negative value indicating conversion error for %s.

So it seems like we can assume that the snprintf return a value in the range [0, INT_MAX] or else it would be undefined.

Then I think we either need to check that `sizeof(size_t) >= sizeof(int)`  (e.g. by comparing the type size for CI and V), to ensure that trunc is correct choice when the sizes aren't equal. If  `sizeof(size_t) < sizeof(int)` (which probably is more unlikely) we could just skip doing the optimization (or we could make sure there will be a zext, e.g. by using `CreateIntCast(V, CI->getType(), false)`).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109289/new/

https://reviews.llvm.org/D109289



More information about the llvm-commits mailing list