[PATCH] D51042: [NVPTX] Remove ftz variants of cvt with rounding mode
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 21 11:00:05 PDT 2018
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
In https://reviews.llvm.org/D51042#1207920, @bkramer wrote:
> In https://reviews.llvm.org/D51042#1207769, @tra wrote:
>
> > This is a surprise. The PTX ISA does not mention that .ftz is not applicable to `cvt.*.f16.*` instructions.
> > Is it only `cvt` that does not support .ftz, or does it impact other instructions? The PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have a .ftz variant.
>
>
> It's only cvt with an explicit rounding mode. I actually ran the output of f16-instructions.ll with FTZ through ptxas and removed instructions until it compiled. This might even be a bug in ptxas.
It may be worth filing a bug with NVIDIA to either fix the problem or clarify the docs.
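
For readers who want to reproduce that kind of reduction, here is a minimal sketch of the "remove instructions until it compiles" loop described in the quoted reply. It is not the procedure that was actually used. `toy_first_error` is a hypothetical stand-in for running ptxas and parsing its first error, and the rule it encodes (reject f16 `cvt` that has both an explicit rounding mode and .ftz) is an assumption taken from this thread, not from the PTX documentation. The sample instructions are made up for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Floating-point rounding modifiers used by PTX cvt.
ROUNDING = {"rn", "rz", "rm", "rp"}


def toy_first_error(lines: List[str]) -> Optional[int]:
    """Hypothetical stand-in for ptxas: index of the first rejected line, or None.

    Assumption from this thread: an f16 cvt with both an explicit rounding
    mode and .ftz is rejected; everything else is accepted.
    """
    for i, line in enumerate(lines):
        fields = line.split()
        parts = fields[0].split(".") if fields else []
        if (
            parts
            and parts[0] == "cvt"
            and "ftz" in parts
            and "f16" in parts
            and any(p in ROUNDING for p in parts)
        ):
            return i
    return None


def reduce_until_accepted(
    lines: List[str],
    first_error: Callable[[List[str]], Optional[int]],
) -> Tuple[List[str], List[str]]:
    """Drop the first rejected line repeatedly until nothing is rejected."""
    kept = list(lines)
    removed: List[str] = []
    while True:
        idx = first_error(kept)
        if idx is None:
            return kept, removed
        removed.append(kept.pop(idx))


# Made-up sample input, not taken from f16-instructions.ll.
SAMPLE = [
    "cvt.rn.ftz.f16.f32 %h1, %f1;",
    "cvt.ftz.f32.f16 %f2, %h1;",
    "add.ftz.f16 %h2, %h1, %h1;",
    "cvt.rn.f16.f32 %h3, %f1;",
]

kept, removed = reduce_until_accepted(SAMPLE, toy_first_error)
```

With a real checker in place of `toy_first_error`, the `removed` list is what one would attach to a bug report.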
Repository:
rL LLVM
https://reviews.llvm.org/D51042