[all-commits] [llvm/llvm-project] 4b24ab: Reland "[NVPTX] Add folding for cvt.rn.bf16x2.f32"...
Alex MacLean via All-commits
all-commits at lists.llvm.org
Fri Dec 6 13:30:31 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 4b24ab4be9351ef822fd8fd546237eabd8c3ba57
https://github.com/llvm/llvm-project/commit/4b24ab4be9351ef822fd8fd546237eabd8c3ba57
Author: Alex MacLean <amaclean at nvidia.com>
Date: 2024-12-06 (Fri, 06 Dec 2024)
Changed paths:
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
M llvm/test/CodeGen/NVPTX/bf16-instructions.ll
M llvm/test/CodeGen/NVPTX/bf16x2-instructions-approx.ll
M llvm/test/CodeGen/NVPTX/bf16x2-instructions.ll
M llvm/test/CodeGen/NVPTX/convert-sm80.ll
M llvm/test/CodeGen/NVPTX/fma-relu-contract.ll
M llvm/test/CodeGen/NVPTX/fma-relu-fma-intrinsic.ll
M llvm/test/CodeGen/NVPTX/fma-relu-instruction-flag.ll
Log Message:
-----------
Reland "[NVPTX] Add folding for cvt.rn.bf16x2.f32" (#116417)
Reland https://github.com/llvm/llvm-project/pull/116109.
Fixes issue where operands were flipped.
Per the PTX spec, a mov instruction packs the first operand as low, and
the second operand as high:
> ```
> // pack two 16-bit elements into .b32
> d = a.x | (a.y << 16)
> ```
On the other hand cvt.rn.f16x2.f32 instructions take high, than low
operands:
> For .f16x2 and .bf16x2 instruction type, two inputs a and b of .f32
type are converted into .f16 or .bf16 type and the converted values are
packed in the destination register d, such that the value converted from
input a is stored in the upper half of d and the value converted from
input b is stored in the lower half of d
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list