[PATCH] D135428: [NVPTX] Support neg{.ftz} for f16 and f16x2

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 7 10:30:48 PDT 2022


tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

Just curious -- what prompts this change?

Does it buy us anything performance-wise? AFAICT LLVM may be generating better code for GPUs w/o fp16 support -- it does a single xor on the 32-bit value w/o splitting it into 16-bit halves. https://godbolt.org/z/Wjx7ceT75
Or is it needed to flush fp16 denormals consistently?
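
For reference, a rough host-side sketch of the two lowerings being compared, working on raw bit patterns only. The function names are made up for illustration and are not taken from the patch or from LLVM; the lane layout (one f16 in each 16-bit half of a 32-bit word) is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: negate both f16 lanes of a packed f16x2 value with
// one 32-bit xor that flips the sign bit (bit 15) of each 16-bit lane.
constexpr std::uint32_t neg_f16x2_packed(std::uint32_t v) {
  return v ^ 0x80008000u;
}

// Hypothetical helper: same result, but splitting the word into its two
// 16-bit halves first and flipping each sign bit separately.
constexpr std::uint32_t neg_f16x2_split(std::uint32_t v) {
  std::uint32_t lo = (v & 0xFFFFu) ^ 0x8000u;
  std::uint32_t hi = ((v >> 16) & 0xFFFFu) ^ 0x8000u;
  return (hi << 16) | lo;
}

// Small self-check on a few bit patterns.
inline void check_neg_f16x2() {
  // low lane = 0x3C00 (1.0), high lane = 0xC000 (-2.0)
  assert(neg_f16x2_packed(0xC0003C00u) == 0x4000BC00u);
  assert(neg_f16x2_split(0xC0003C00u) == 0x4000BC00u);
  // A subnormal lane (0x0001) only has its sign flipped; a pure xor does
  // not flush it to zero.
  assert(neg_f16x2_packed(0x00000001u) == 0x80008001u);
}
```

Both helpers produce the same bits; the packed form just needs one operation. Neither models flush-to-zero: a plain xor leaves subnormal inputs as subnormals, which is the part a neg.ftz instruction would be expected to handle differently.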


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135428/new/

https://reviews.llvm.org/D135428


