[PATCH] D69878: Consolidate internal denormal flushing controls
Justin Lebar via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Nov 8 11:48:25 PST 2019
jlebar added a comment.
> AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name).
I may be wrong, but I believe NVPTX only supports ftz for f32. From the PTX documentation:
> Double-precision instructions support subnormal inputs and results. Single-precision instructions support subnormal inputs and results by default for sm_20 and subsequent targets, and flush subnormal inputs and results to sign-preserving zero for sm_1x targets. The optional .ftz modifier on single-precision instructions provides backward compatibility with sm_1x targets by flushing subnormal inputs and results to sign-preserving zero regardless of the target architecture.
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions
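To make "flush to sign-preserving zero" concrete, here is a minimal host-side sketch of what the quoted text describes for a single f32 value. The helper name flush_subnormal_to_zero is hypothetical; it is not an LLVM, Clang, or CUDA API, and it only emulates the result in software rather than showing how the hardware or the backend implements .ftz.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper for illustration only: emulates the effect that the
// PTX documentation describes for .ftz on one single-precision value.
// Subnormal inputs become zero with the original sign; everything else
// (normals, zeros, infinities, NaNs) passes through unchanged.
inline float flush_subnormal_to_zero(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    return std::copysign(0.0f, x);  // keep the sign bit, drop the magnitude
  return x;
}
```

For example, passing -std::numeric_limits<float>::denorm_min() should give -0.0f, while std::numeric_limits<float>::min() (the smallest normal float) should come back unchanged. There is deliberately no double overload here, since the quoted text says double-precision instructions support subnormal inputs and results.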
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69878/new/
https://reviews.llvm.org/D69878