[PATCH] D135428: [NVPTX] Support neg{.ftz} for f16 and f16x2
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 7 10:30:48 PDT 2022
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
Just curious -- what prompts this change?
Does it buy us anything performance-wise? AFAICT LLVM may be generating better code for GPUs without fp16 support: it does a single xor on the 32-bit value without splitting it into 16-bit halves. https://godbolt.org/z/Wjx7ceT75
Or is it needed to flush fp16 denormals consistently?
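To make the xor comparison concrete, here is a minimal sketch of negating a packed f16x2 value with one 32-bit xor, working on raw bit patterns. The helper names are hypothetical and this is not the code LLVM emits; it only illustrates the sign-bit flip, assuming IEEE binary16 lanes with the sign in bit 15 of each half.

```cpp
#include <cstdint>

// Hypothetical helper: negate one f16 given as its raw 16-bit pattern
// by flipping the sign bit (bit 15).
constexpr std::uint16_t negate_f16_bits(std::uint16_t h) {
    return static_cast<std::uint16_t>(h ^ 0x8000u);
}

// Hypothetical helper: negate both lanes of a packed f16x2 with a single
// 32-bit xor, without splitting the value into 16-bit halves.
constexpr std::uint32_t negate_f16x2_bits(std::uint32_t packed) {
    return packed ^ 0x80008000u;
}
```

Note that a plain xor leaves subnormal bit patterns untouched apart from the sign, so it does not flush denormals the way an .ftz variant would.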
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D135428/new/
https://reviews.llvm.org/D135428