[all-commits] [llvm/llvm-project] 4b24ab: Reland "[NVPTX] Add folding for cvt.rn.bf16x2.f32"...

Fri Dec 6 13:30:31 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4b24ab4be9351ef822fd8fd546237eabd8c3ba57
      https://github.com/llvm/llvm-project/commit/4b24ab4be9351ef822fd8fd546237eabd8c3ba57
  Author: Alex MacLean <amaclean at nvidia.com>
  Date:   2024-12-06 (Fri, 06 Dec 2024)

  Changed paths:
    M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
    M llvm/test/CodeGen/NVPTX/bf16-instructions.ll
    M llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll
    M llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll
    M llvm/test/CodeGen/NVPTX/convert-sm80.ll
    M llvm/test/CodeGen/NVPTX/fma-relu-contract.ll
    M llvm/test/CodeGen/NVPTX/fma-relu-fma-intrinsic.ll
    M llvm/test/CodeGen/NVPTX/fma-relu-instruction-flag.ll

  Log Message:
  -----------
  Reland "[NVPTX] Add folding for cvt.rn.bf16x2.f32" (#116417)

Reland https://github.com/llvm/llvm-project/pull/116109.

Fixes issue where operands were flipped. 

Per the PTX spec, a mov instruction packs the first operand as low, and
the second operand as high:
> ```
> // pack two 16-bit elements into .b32
> d = a.x | (a.y << 16)
> ```
On the other hand cvt.rn.f16x2.f32 instructions take high, than low
operands:
> For .f16x2 and .bf16x2 instruction type, two inputs a and b of .f32
type are converted into .f16 or .bf16 type and the converted values are
packed in the destination register d, such that the value converted from
input a is stored in the upper half of d and the value converted from
input b is stored in the lower half of d

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications