[PATCH] D69878: Consolidate internal denormal flushing controls

Justin Lebar via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Nov 8 11:48:25 PST 2019


jlebar added a comment.

> AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name).

I may be corrected, but I believe nvptx only supports ftz for f32.

> Double-precision instructions support subnormal inputs and results. Single-precision instructions support subnormal inputs and results by default for sm_20 and subsequent targets, and flush subnormal inputs and results to sign-preserving zero for sm_1x targets. The optional .ftz modifier on single-precision instructions provides backward compatibility with sm_1x targets by flushing subnormal inputs and results to sign-preserving zero regardless of the target architecture.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions
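To make "flush to sign-preserving zero" concrete, here is a rough host-side C++ sketch of what .ftz does to a single f32 value. The helper name flush_subnormal_f32 is made up for illustration and is not part of LLVM, clang, or PTX; it only models the value-level effect described in the quoted text, not how any backend implements it.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, for illustration only: models the value-level effect
// of the PTX .ftz modifier on one f32 value. Subnormal inputs become a zero
// with the same sign; every other value passes through unchanged.
// Assumes the host compiler is not itself flushing subnormals (for example,
// this is not built with -ffast-math).
float flush_subnormal_f32(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    return std::copysign(0.0f, x);  // keep the sign bit, drop the magnitude
  return x;
}
```

Under this model, -denorm_min maps to -0.0f rather than +0.0f, and the smallest normal float (std::numeric_limits<float>::min()) is left alone. There is deliberately no f64 overload, matching the quoted statement that double-precision instructions support subnormal inputs and results.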


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878




