[PATCH] D113107: Support of expression granularity for _Float16.

Phoebe Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 23 19:50:18 PDT 2022


pengfei added a comment.

> I'm not sure what optimization you mean. Because the ABI returns 16-bit and 32-bit FP values differently, there really isn't a way that we can return a value without going through a truncation/extension cycle.

I explained it to Zahira offline. I had forgotten that we have different expectations for the patch, so we were talking about different optimizations. I expected each backend to be able to lower half operations, so I emphasized not doing the promotion, or eliminating unnecessary promotion, at the beginning. I see your point, though: we can only eliminate an unpromotion or combine it with the promotion that follows it, so that we don't leave half operations to the backends.

> There's potential to eliminate those with IPO, but we should definitely leave that for a different patch, for two reasons:

I agree with you; IPO is the only chance to eliminate them.

> Somehow we've taken a huge step back on unpromotion, and I'm worried you're now doing the exact thing I didn't want us doing and forcing all the downstream clients to handle the possibility of a promoted result.

However, I am still not persuaded that we need to consider backends that do not support half operations (if I understand your "downstream clients" correctly).
There are two points I'd argue:

1. According to LanguageExtensions <https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point>, a target needs to define an ABI, and the backend needs (at least) to provide argument passing and returning for `_Float16`. As we discussed above, we cannot assume the ABI of `_Float16` is the same as that of `float` and promote to `float` in the front-end. So we cannot make it work without any backend change. And if we have to change the backend anyway, it's not a big deal to support lowering the operations at the same time. We have a target-independent framework to help with that;
2. Promotion followed by unpromotion for each expression is cumbersome and inefficient, and it makes further optimization rather complicated, as you have described above.
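
To make the two strategies concrete, here is a minimal numeric sketch, not Clang's implementation. It models binary16 rounding with Python's `struct` format `e`; the helper names (`to_half`, `to_float`, `sum_per_operation`, `sum_expression_granularity`) are hypothetical and exist only for this illustration. It compares truncating back to half after every operation with promoting once and truncating once at the end of the expression.

```python
import struct


def to_half(x):
    """Round a value to the nearest IEEE binary16 number (models unpromotion)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_float(x):
    """Round a value to the nearest IEEE binary32 number (models promotion to float)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_per_operation(values):
    """Promote, add, and truncate back to half after every single addition."""
    acc = to_half(values[0])
    for v in values[1:]:
        acc = to_half(to_float(acc + to_half(v)))
    return acc


def sum_expression_granularity(values):
    """Promote once, keep the intermediate in float, truncate once at the end."""
    acc = to_float(to_half(values[0]))
    for v in values[1:]:
        acc = to_float(acc + to_half(v))
    return to_half(acc)


# binary16 has a spacing of 2 between 2048 and 4096, so adding 1 to 2048
# is lost whenever the intermediate result is truncated to half.
values = [2048.0, 1.0, 1.0]
per_op = sum_per_operation(values)
whole = sum_expression_granularity(values)
print(per_op, whole)
```

The two results differ because the per-operation variant rounds the intermediate 2049 back to half twice, while the expression-granularity variant rounds only the final 2050. The sketch says nothing about cost; it only shows why where the truncation happens is observable.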

WDYT?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107


