[PATCH] D129224: [InstCombine] Fold strtoul and strtoull and avoid PR #56293

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 12 10:26:14 PDT 2022


efriedma added inline comments.


================
Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:117
+      if (Str[0] == '0') {
+        if (toUpper((unsigned char)Str[1]) == 'X') {
+          Base = 16;
----------------
msebor wrote:
> efriedma wrote:
> > The standard requires that a "0x" prefix be followed by a hexadecimal digit.  Similarly, the standard requires that a "0" prefix be followed by an octal digit.  If there isn't an appropriate digit after, the subject sequence (the part that's actually parsed as a number) is just a decimal "0".  And the rest is treated as "unrecognized characters".
> > 
> > The standard allows a "0x" prefix if the base is explicitly specified to be 16.
> Whoops, yes, we need to consume the hex prefix even in base 16.  Thanks for pointing out this mistake!
> 
> I'm aware of what the C and POSIX standards require but I'm not sure I understand the first part of your comment, but just for clarity: 1) a lone zero is valid in any base (i.e., it doesn't need to be followed by other digits), and 2) the function fails if the whole subject sequence isn't consumed (i.e., has "unrecognized characters").
I think strictly interpreting the standard, if you have something like "0xg", or "09", they should also parse to "0".  But if we're rejecting them anyway, I guess that's fine... might want to add a comment somewhere to clarify why we're rejecting strings with unrecognized trailing characters.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129224/new/

https://reviews.llvm.org/D129224



More information about the llvm-commits mailing list