[PATCH] D51042: [NVPTX] Remove ftz variants of cvt with rounding mode

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 21 11:00:05 PDT 2018


tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

In https://reviews.llvm.org/D51042#1207920, @bkramer wrote:

> In https://reviews.llvm.org/D51042#1207769, @tra wrote:
>
> > This is a surprise. The PTX ISA does not mention that .ftz is not applicable to `cvt.*.f16.*` instructions.
> > Is it only `cvt` that does not support .ftz, or does this affect other instructions? The PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have a .ftz variant.
>
>
> It's only cvt with an explicit rounding mode. I actually ran the output of f16-instructions.ll with FTZ through ptxas and removed instructions until ptxas accepted it. This might even be a bug in ptxas.


It may be worth filing a bug with NVIDIA to either fix the problem or clarify the docs.


Repository:
  rL LLVM

https://reviews.llvm.org/D51042




