[PATCH] D50839: [llvm] Optimize YAML::isNumeric

Kirill Bobyrev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 17 02:44:43 PDT 2018


kbobyrev added inline comments.


================
Comment at: llvm/include/llvm/Support/YAMLTraits.h:454
+inline bool isNumeric(StringRef S) {
+  if (S.empty())
+    return false;
----------------
zturner wrote:
> What would happen if we re-wrote this entire function as:
> 
> ```
> inline bool isNumeric(StringRef S) {
>   uint64_t N;
>   int64_t I;
>   APFloat F;
>   return S.getAsInteger(N) || S.getAsInteger(I) || (F.convertFromString(S) == opOK);
> }
> ```
> 
> Would this a) Be correct, and b) have similar performance characteristics to what you've got here?
Thank you for the suggestion!

I have tried the proposed approach, but there are several caveats:

First, `APInt` (which I believe should be used in this case since YAML numbers are of arbitrary length) parsing does not look simpler than the current approach (and it's also unnecessary overhead and potentially some cases which are invalid in YAML but are perfectly fine in `APInt` parser). An example would be the prefix of octal numbers: `APInt` accepts `0` while it should be `0o` in YAML, so the `Radix` should be manually inferred anyway.

The main problem, however, is with the `APFloat` parser, which accepts a huge number of items which are not valid in YAML numeric format. Examples are:

* `.`
* `.e+1`
* `.e+`
* `.e`

Even worse, the parser appears to have bugs. I was able to find several classes of inputs which cause global-buffer-overflow caught by AddressSanitizer (e.g. `.+`). This should be investigated independently. However, the above cases lead me to believe that:

1) The LLVM parser is likely to have a huge number of cases which are invalid in YAML numeric format but are valid `APFloat`s. Finding all of these cases is non-trivial and is probably not rewarding.
2) The parser is unreliable.

What do you think?


https://reviews.llvm.org/D50839





More information about the cfe-commits mailing list