[all-commits] [llvm/llvm-project] cc13f3: Correctly round FP -> BF16 when SDAG expands such ...

Wed Feb 21 09:37:13 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: cc13f3ba45015254075434f0f94a2ea6ff4bc1b4
      https://github.com/llvm/llvm-project/commit/cc13f3ba45015254075434f0f94a2ea6ff4bc1b4
  Author: David Majnemer <david.majnemer at gmail.com>
  Date:   2024-02-21 (Wed, 21 Feb 2024)

  Changed paths:
    M llvm/include/llvm/CodeGen/TargetLowering.h
    M llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
    M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
    M llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
    M llvm/lib/Target/NVPTX/NVPTXISelLowering.h
    M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
    M llvm/test/CodeGen/AMDGPU/bf16.ll
    M llvm/test/CodeGen/AMDGPU/fmed3-cast-combine.ll
    M llvm/test/CodeGen/AMDGPU/fneg-modifier-casting.ll
    M llvm/test/CodeGen/AMDGPU/function-args.ll
    M llvm/test/CodeGen/AMDGPU/global-atomics-fp.ll
    M llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-preserve-cc.ll
    M llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.bf16.ll
    M llvm/test/CodeGen/AMDGPU/local-atomics-fp.ll
    M llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll
    M llvm/test/CodeGen/NVPTX/bf16-instructions.ll

  Log Message:
  -----------
  Correctly round FP -> BF16 when SDAG expands such nodes (#82399)

We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- taking the top 16 bits of a FP32 which will turn some NaNs into
infinities

Let's do this in a more principled way by rounding types with more
precision than FP32 to FP32 using round-inexact-to-odd which will negate
double rounding issues.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications