[libc-commits] [PATCH] D113036: [libc] refactor atof string parsing

Fri Nov 5 15:34:46 PDT 2021

michaelrj added inline comments.

================
Comment at: libc/src/__support/str_to_float.h:459
+static inline void
+binaryExpToFloat(typename fputil::FPBits<T>::UIntType mantissa, int32_t exp2,
+                 typename fputil::FPBits<T>::UIntType *outputMantissa,
----------------
sivachandra wrote:
> This function is mostly `NormalFloat::operator T`. It would be nice if we can extend it as required and avoid the mostly duplicated logic. AFAICT, setting of  `errno` is the difference? Input exponent needs to be corrected of course like on line 480, which can be done as a preprocessing step. You can do it as a cleanup later if it can be done.
I think it's possible. What I'd probably want to do for that is add a way to initialize a NormalFloat with a mantissa that uses all the bits of a `UIntType`. Then it would be a relatively trivial process.

================
Comment at: libc/src/__support/str_to_float.h:500
+    }
+    mantissa = shiftRightAndRound(mantissa, amountToShift);
+    if (mantissa == 0) {
----------------
sivachandra wrote:
> Can the shift and round operation here lead to an overflow? See this for why it can happen: https://github.com/llvm/llvm-project/blob/main/libc/src/__support/FPUtil/NormalFloat.h#L124
ah, yes it can. This is now fixed here and above, and a test has been added.

================
Comment at: libc/src/__support/str_to_float.h:579-586
+    if (*src == DECIMAL_POINT && afterDecimal) {
+      break; // this means that *src points to a second decimal point, ending
+             // the number.
+    } else if (*src == DECIMAL_POINT) {
+      afterDecimal = true;
+      ++src;
+      continue;
----------------
lntue wrote:
> What do you think about restructuring this part to:
> 
>   if (*src == DECIMAL_POINT) {
>     if (afterDecimal) {
>       // comments
>       break;
>     }
>     afterDecimal = true;
>     ++src;
>     continue;
>   }
Doing the change you suggested was a noticeable speed improvement.

================
Comment at: libc/src/__support/str_to_float.h:608
+
+      src += tempStrEnd - src;
+      exponent += add_to_exponent;
----------------
lntue wrote:
> Is this the same as src = tempStrEnd?
ah, I think I misunderstood how restrict worked. This is fixed now.

================
Comment at: libc/src/__support/str_to_float.h:666
+
+    mantissa = (mantissa * BASE) + digit;
+    seenDigit = true;
----------------
lntue wrote:
> Does it mean that we cannot parse the string longer than 16 hexadecimal digits?
correct, but that's because that's just how many bits there are space for in a double. The mantissa actually stores a little less, the extra bits are so that the rounding is more accurate.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113036/new/

https://reviews.llvm.org/D113036