[PATCH] D134859: [clang][Interp] Implement basic support for floating point values

David Blaikie via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 5 10:54:18 PDT 2022


dblaikie added inline comments.


================
Comment at: clang/lib/AST/Interp/Floating.h:27-29
+  template <unsigned ReprBits> struct Repr;
+  template <> struct Repr<32> { using Type = float; };
+  template <> struct Repr<64> { using Type = double; };
----------------
tbaeder wrote:
> dblaikie wrote:
> > jcranmer-intel wrote:
> > > aaron.ballman wrote:
> > > > tbaeder wrote:
> > > > > aaron.ballman wrote:
> > > > > > tbaeder wrote:
> > > > > > > jcranmer-intel wrote:
> > > > > > > > tbaeder wrote:
> > > > > > > > > jcranmer-intel wrote:
> > > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > > Er, how will this extend to `long double` where the number of bits is rather more difficult?
> > > > > > > > > > Or `half` and `bfloat`, which are both 16-bit floating-point types?
> > > > > > > > > I have spent some time with this today and tried to simply always use `APFloat` instead of a primitive type. Unfortunately that doesn't work because what we put on the stack is not the `Floating` (or `Integral`), but the underlying primitive type. So even if we do the final math (in `::add`, etc) via `APFloat`, we need something we can serialize to `char[]` so we can put it on the stack. Do you think that would work?
> > > > > > > > I don't know enough about the structure of the bytecode interpreter here to say for sure, but this smells to me like you're baking in an assumption that every primitive target type has a corresponding primitive type on the host. This assumption just doesn't hold when it comes to floating point (only two of the seven types, `float` and `double`, are generally portable, and even then, there be dragons in some corner cases).
> > > > > > > > 
> > > > > > > > If you do need to continue down this route, there are two requirements that should be upheld:
> > > > > > > > * The representation shouldn't assume that the underlying primitive type exists on host (bfloat16 and float128 are better test cases here).
> > > > > > > > * Conversion to/from host primitive types shouldn't be easy to accidentally do.
> > > > > > > > 
> > > > > > > > (Worth repeating again that bit size is insufficient to distinguish floating point types: `bfloat` and `half` are both 16-bit, PPC `long double` and IEEE 754 quad precision are both 128-bit, and x86 `long double` is 80 bits stored as 96 bits on 32-bit and 128 bits on 64-bit.)
> > > > > > > Well, is there a way to convert an APFloat to a char[] that would work instead of going to float/double and storing that? The only thing I see in the docs is `convertToHexString()` (and the docs don't mention whether the conversion is lossy). If not, do you think adding such a conversion to `APFloat` and its various implementations is the better way forward?
> > > > > > Let's avoid serializing the floats to strings so that we can parse the string to turn it back into a float later; that's going to have poor performance even if we do get all the corner cases correct regarding things like rounding, etc.
> > > > > > 
> > > > > > `APFloat` does not have any sort of serialization functionality beyond through strings representing the value. I think you'd have to invent such an interface.
> > > > > Do you know who I might talk to regrading such an interface, both the implementation as well as general feasibility?
> > > > I think there may be at least two ways to do this: use an `APFloat` and put the serialization interfaces there, or use an `APValue` and put the serialization interfaces there.
> > > > 
> > > > Because `APFloat` is an ADT in LLVM, I think it should probably go up on Discourse for broader discussion. @chandlerc is still listed as the code owner for ADTs but he's not been active in quite some time. Instead, I would recommend talking to @dblaikie (he's got a good eye for ADT work in general) and @foad, @RKSimon, and @sepavloff as folks who have recently been touching `APFloat`.
> > > > 
> > > > `APValue` is a Clang-specific class, and it already has some amount of serialization support, it seems (https://github.com/llvm/llvm-project/blob/main/clang/include/clang/AST/APValue.h#L54). From a quick look, it seems we're already using `APValue` in a reasonable number of places in the interpreter, so it might make sense to use this object consistently to represent all values in the new interpreter?
> > > Going through `APInt` is already possible in `APFloat`, and that might have some methods to go to char arrays.
> > Yeah - IR gets serialized APFloats somehow (maybe through some layers of indirection) so I'd suggest looking at how, say, a global float named constant gets lowered to bitcode to see how that APFloat is serialized.
> > 
> > If that's not enough of a pointer for you/others to find something helpful, I can look further to see what I can find (I got some of the way - a small example with just `extern const float v; const float v = 3.14;` and breaking on APFloat's ctor immediately finds the APFLoat being constructed from that literal - then I looked at llvm::GlobalVariable's ctor, which does break on the creation of `v`, but didn't continue further to figure out how the parsed value gets into the IR constant)
> So, I think this is implemented in `emitGlobalConstantFP()`: https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#L3206. I now wonder how to implement this, since the number of chunks isn't fixed. And all this seems rather costly given that we'd have to do this every time we access a floating point value on the stack.
That looks like the lowering to assembly - I meant the lowering to LLVM IR bitcode, though perhaps it's similarly complicated.

As for variable length - I expect the strings encoding had variable length too, but either relied on null termination or a length prefix. The APFloat serialization could use a length prefix encoding to specify how many bytes to read/decode (I guess LLVM IR bitcode does something like that - though I think the bitcode records always encode their length, so it might be implicit).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134859/new/

https://reviews.llvm.org/D134859



More information about the cfe-commits mailing list