[PATCH] D113107: Support of expression granularity for _Float16.

John McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 22 12:52:25 PDT 2022


rjmccall added inline comments.


================
Comment at: clang/test/CodeGen/X86/Float16-arithmetic.c:207
+// CHECK-NEXT:    [[EXT:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:    store float [[EXT]], ptr [[RETVAL]], align 2
+// CHECK-NEXT:    [[TMP1:%.*]] = load half, ptr [[RETVAL]], align 2
----------------
zahiraam wrote:
> zahiraam wrote:
> > pengfei wrote:
> > > Not sure if we need a fptrunc and store the half value. The following tests have the same problem.
> > I think that's what we want?
> > // CHECK-LABEL: @RealOp(
> > // CHECK-NEXT:  entry:
> > // CHECK-NEXT:    [[A_ADDR:%.*]] = alloca half, align 2
> > // CHECK-NEXT:    store half [[A:%.*]], ptr [[A_ADDR]], align 2
> > // CHECK-NEXT:    [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
> > // CHECK-NEXT:    [[EXT:%.*]] = fpext half [[TMP0]] to float
> > // CHECK-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[EXT]] to half
> > // CHECK-NEXT:    ret half [[UNPROMOTION]]
> > 
> > Do you agree? If this is correct, I will make the change the other operators.
> But I feel like we should be returning a float no?  In which case it will be more tricky (need to calculate the Address with the promoted elementype)? @rmjccall?
The function is declared as returning `_Float16`, not `float`.  This is therefore a question about when we're allowed to return a value in greater precision than the declared return type, which raises three sub-questions: one about ABI, one about language semantics, and one about our handling in the implementation.

The first is about ABI.  This mode is not supposed to be ABI-breaking, so whenever the ABI is in doubt, and the target makes a `_Float16` return incompatible with the ABI of a `float` return, we must use the former.  That means, at the very least, returning from a function with unknown call sites or calling a function with an unknown implementation.  We could potentially eliminate extra truncations here when we fully understand a call; for example, we could change the return type to `float` when the function is internal to a TU and not address-taken, or we could eliminate a trunc+extend pair after inlining.  It is fair to ask whether that's a good idea, however.

Anyway, concretely we're talking about two ABIs here:
- On x86_64, `_Float16` and `float` are not returned compatibly: they're both returned in `xmm0`, but the bit patterns are different, and the caller and callee must agree in order to preserve the value.
- On i386, `_Float16` and `float` *are* returned compatibly: they're both returned in `%st0`, promoted to the 80-bit format.

Let's assume for a second that we're interested in avoiding truncations in situations where the ABI doesn't limit us.  Then we have a question of language semantics, which is principally about this: does C's authorization of excess precision in intermediate results allows return values to propagate the excess precision?  The answer that appears to be yes, it does, per the explicit footnote at the end of the standard's description of the `return` statement:

> The return statement is not an assignment. The overlap restriction of 6.5.16.1 does not apply to the case of function return. The representation of floating-point values can have wider range or precision than implied by the type; a cast can be used to remove this extra range and precision.

Okay.  So the third question is about implementation: how should we take advantage of this flexibility, assuming we actually want to?  A major part of the reason we're doing explicit promoted emission in the frontend in this patch is that only the frontend has the required knowledge of when to force truncation; and a big part of *that* is that explicit casts and assignments both force truncation, and neither has a persistent semantic representation in LLVM IR.  We cannot distinguish between a truncation that was only done to satisfy the ABI and a truncation that was required by the language semantics.  Once we have a value as an `fp16` in IR, it's entirely possible that it underwent an operation that required excess precision to be discarded, one which no longer has any representation in IR.  I think the only reasonable way to avoid this would be to allow functions to directly return `float` (and receive `float`s as returns) with some sort of annotation that it actually has to be returned as an `fp16` to satisfy the ABI.  And within Clang, we would handle that by making the call/return emission interact with the promoted emitters we're adding in this patch.

We can put off all of that until later, though.  For now, we should just continue to return `_Float16`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107



More information about the llvm-commits mailing list