[PATCH] D158059: [AMDGPU/wmma] - Disable 3-address syntax for f16

Jessica Del via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 4 06:43:03 PDT 2023


OutOfCache added a comment.

Sorry for responding so late.

So far, the problem has only been observed with `zeroinitializer` matrices that were reused as input for multiple `wmma_f16` instructions. However, it could also happen with other, non-constant matrices, as long as the input and output accumulator registers are different.
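
To illustrate why differing input and output accumulator registers matter, here is a toy model in Python. It is not LLVM code and not a description of the real hardware. It rests on an assumption that this comment does not state: that the f16 WMMA writes only one 16-bit half of each 32-bit destination lane and leaves the other half of the destination register unchanged. The function name `wmma_f16_model` and all values are hypothetical.

```python
# Toy model only. Assumption: each 32-bit accumulator lane holds two 16-bit
# halves, and the instruction writes one half of the destination lane while
# the other half keeps whatever the *destination* register held before.

def wmma_f16_model(dst, acc_in, products, write_high=False):
    """Return the modeled contents of dst after one f16 WMMA.

    dst, acc_in: lists of (lo, hi) pairs, one pair per lane.
    products: per-lane value added to the selected half of acc_in.
    """
    out = []
    for (d_lo, d_hi), (c_lo, c_hi), p in zip(dst, acc_in, products):
        if write_high:
            out.append((d_lo, c_hi + p))  # low half comes from dst, not acc_in
        else:
            out.append((c_lo + p, d_hi))  # high half comes from dst, not acc_in
    return out

# Packed accumulator: low halves hold an earlier result, high halves are zero.
packed = [(1.0, 0.0), (2.0, 0.0)]
products = [10.0, 20.0]

# Tied (two-address) form: destination and accumulator input are one register,
# so the earlier low-half result survives.
tied = wmma_f16_model(dst=packed, acc_in=packed, products=products, write_high=True)

# Three-address form: the destination is a different register with stale data,
# so the low halves no longer match the accumulator input.
stale = [(99.0, 99.0), (99.0, 99.0)]
untied = wmma_f16_model(dst=stale, acc_in=packed, products=products, write_high=True)
```

Under this assumption, the tied call keeps the packed low halves, while the three-address call returns lanes whose low halves come from the unrelated destination register.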

After further discussion with other compiler engineers, we want to add a new pseudo instruction for the tied instruction. Then we can update the intrinsics in the packing patch to lower to these specific pseudos. That way, the original `wmma_f16` instruction can still be a three-address instruction in cases outside the packing patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158059/new/

https://reviews.llvm.org/D158059


