[PATCH] D113107: Support of expression granularity for _Float16.
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 23 19:50:18 PDT 2022
pengfei added a comment.
> I'm not sure what optimization you mean. Because the ABI returns 16-bit and 32-bit FP values differently, there really isn't a way that we can return a value without going through a truncation/extension cycle.
I explained it to Zahira offline. I forgot we have different expectation for the patch, thus we were talking different optimization to each other. I expected each backend has the ability to lower half operations. So I emphasized not to do the promotion or eliminate unnecessary promotion at the begining. While I see your point, we can only eliminate or combine unpromotion to the following promotion, so that we don't leave half operations to backends.
> There's potential to eliminate those with IPO, but we should definitely leave that for a different patch, for two reasons:
I agree with you, IPO is the only chance to do elimination.
> Somehow we've taken a huge step back on unpromotion, and I'm worried you're now doing the exact thing I didn't want us doing and forcing all the downstream clients to handle the possibility of a promoted result.
However, I am still not persuaded we need to consider the backends not supporting half operations (if I understand your downstream clients correctly).
There are two aspects I'd argue:
1. According to LanguageExtensions <https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point>, a target needs to define ABI and the backend (at least) needs to provide the arguments passing/returning for `_Float16`. As we talked above, we cannot assume the ABI of `_Float16` is the same as `float` and promote to it in the front-end. So we cannot make it work without any backend change. And if we have to change backend, it's not a big deal to support lowering operations at the same time. We have target independent framework to help with that;
2. Promotion then unpromotion for each expression is cumbersome, inefficient and it makes it rather complicated for further optimization as you have described above.
WDYT?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113107/new/
https://reviews.llvm.org/D113107
More information about the llvm-commits
mailing list