[PATCH] D146771: [support] Provide overload to PrintNumber that use C++ types

Fri Mar 24 09:47:03 PDT 2023

paulkirth added inline comments.

================
Comment at: llvm/include/llvm/Support/JSON.h:350
   // Integers (except boolean and uint64_t).
   // Must be non-narrowing convertible to int64_t.
+  template <
----------------
jhenderson wrote:
> paulkirth wrote:
> > I didn't love that I had to change these, but the non-narrowing issue forced me to add `unsigned long long`. I looked into just pushing all of the unsigned integers into the `uint64_t` version, but ran into some strange behavior and test failures that I couldn't explain.
> > 
> > For example an MCA test failed, but the JSON appeared equivalent. The difference was that despite being equivalent "numbers", the values were not integers that the `getInteger()` API for json::Objects could resolve (i.e. it relies on them being `int64_t`, and some were now `uint64_t`).
> > 
> > Some of that likely is tied to the JSON rules for numeric types, but someone with more knowledge here should probably weigh in. Also template metaprogramming isn't something I do often, so there may be better ways to express these constraints.
> I don't think the changes you've made will work in a cross-platform manner for the same sort of reasoning as why you need the `unsigned long` overloads of printNumber: an unsigned long is a 64-bit value on some platforms, but `uint64_t` might be defined to `unsigned long long`. In such cases, unsigned long will be neither uin64_t, nor unsigned long long, and so the enable_ifs won't work.
> 
> A 64-bit integral will have the following traits: `std::is_integral`, and `sizeof(T) == sizeof(uint64_t)` (you could hard code the latter to 8). So the current behaviour of comparing the type against `uint64_t` isn't really correct. I think you want to change the unsigned case to:
> ```
>   template <
>       typename T,
>       typename = std::enable_if_t<std::is_integral<T>::value &&
>                                   std::is_unsigned<T>::value &&
>                                   !std::is_same<T, bool>::value &&
>                                   sizeof(T) == sizeof(uint64_t)>>
> ```
> The signed one should look like the following:
> ```
>   template <
>       typename T,
>       typename = std::enable_if_t<std::is_integral<T>::value &&
>                                   std::is_signed<T>::value &&
>                                   sizeof(T) == sizeof(int64_t)>>
> ```
> Honestly, I'm not sure why the second one is listed as multiple `enable_if_t` args: I would have that that would be unnecessary, but if it is necessary (my own template metaprogramming sometimes needs a little bit of trial and error!), you could split the above into three separate ones.
> 
> Note that the comparison against bool is moved because `is_signed` explicitly excludes `bool`, but `is_unsigned` includes it.
> 
> You could, if you wanted, defined your own type trait that folds together the is_integral and sizeof terms, to avoid the duplication, but I'm not sure it would add enough to be worth it.
> 
> Is there any form of testing that shows that the different fundamental types create the correct underlying type?
> I don't think the changes you've made will work in a cross-platform manner for the same sort of reasoning as why you need the `unsigned long` overloads of printNumber: an unsigned long is a 64-bit value on some platforms, but `uint64_t` might be defined to `unsigned long long`. In such cases, unsigned long will be neither uin64_t, nor unsigned long long, and so the enable_ifs won't work.
> 
> A 64-bit integral will have the following traits: `std::is_integral`, and `sizeof(T) == sizeof(uint64_t)` (you could hard code the latter to 8). So the current behaviour of comparing the type against `uint64_t` isn't really correct. I think you want to change the unsigned case to:
> ```
>   template <
>       typename T,
>       typename = std::enable_if_t<std::is_integral<T>::value &&
>                                   std::is_unsigned<T>::value &&
>                                   !std::is_same<T, bool>::value &&
>                                   sizeof(T) == sizeof(uint64_t)>>
> ```
> The signed one should look like the following:
> ```
>   template <
>       typename T,
>       typename = std::enable_if_t<std::is_integral<T>::value &&
>                                   std::is_signed<T>::value &&
>                                   sizeof(T) == sizeof(int64_t)>>
> ```

Huh, I forgot you could use `sizeof` in template expressions. That certainly cleans things up. I think for the signed case we will want: `sizeof(T) <= sizeof(int64_t)`, since this code assumes everything smaller than uint64 is signed. It also seems that some dependent code makes similar assumptions.

> Honestly, I'm not sure why the second one is listed as multiple `enable_if_t` args: I would have that that would be unnecessary, but if it is necessary (my own template metaprogramming sometimes needs a little bit of trial and error!), you could split the above into three separate ones.
> 

I just went with the existing convention. I think your suggestion should work, but I haven't tried it yet.

> Note that the comparison against bool is moved because `is_signed` explicitly excludes `bool`, but `is_unsigned` includes it.
> 
> You could, if you wanted, defined your own type trait that folds together the is_integral and sizeof terms, to avoid the duplication, but I'm not sure it would add enough to be worth it.
> 
> Is there any form of testing that shows that the different fundamental types create the correct underlying type?

The existing tests in  JSONTest.cpp all seem to use `int64_t`/`uint64_t`. That makes sense to an extent, since they more or less forced anything that isn't `uint64_t` into `int64_t`. But there isn't any testing of the underlying C++ types. 

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:201

-  virtual void printNumber(StringRef Label, uint64_t Value) {
-    startLine() << Label << ": " << Value << "\n";
+  virtual void printNumber(StringRef Label, signed char Value) {
+    startLine() << Label << ": " << int(Value) << "\n";
----------------
jhenderson wrote:
> paulkirth wrote:
> > Another potential approach would be to rely on conversion, and just pic a select few overloads for integer types. I usually try to avoid things like that, but 1) this is a lot of duplicated code, and 2) for many of these types the conversion is likely happening when writing to the stream.
> Looking at the `Value` class above, it only has explicit overloads for `uint64_t` and `int64_t` (effectively). Other integer types happen to work because these two overloads have ended up being instantiated and so the types can convert. I might be wrong, but I suspect if you tried to use that class with e.g. an `int` value, but without any usage with an `int64_t`, you'd get a missing overload error, or something stranger (like it converting to a pointer or something)! I'm a little unsure which one `char` will use, and I suspect it's platform dependent.
> 
> In your case, I think you could get away with `long long` and `unsigned long long`, but in your testing, I'd test with every fundamental integer type to make sure it works without a problem.
Yeah, I the more I think about this, the more I feel like this will be simpler to implement. let me give this a shot and see if that's actually the case. And you're right, char is _probably_ going to need some kind of special handling.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146771/new/

https://reviews.llvm.org/D146771