[PATCH] D158059: [AMDGPU/wmma] - Disable 3-address syntax for f16
Jessica Del via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 4 06:43:03 PDT 2023
OutOfCache added a comment.
Sorry for responding so late.
So far, the problem has only been observed with `zeroinitializer` matrices that were reused as input for multiple `wmma_f16` instructions. However, it could happen with other, non-constant matrices as well, whenever the input and output accumulator registers are different.
After further discussion with other compiler engineers, we want to add a new pseudo instruction for the tied form of the instruction. We can then update the intrinsics in the packing patch to lower to these specific pseudos. That way, the original `wmma_f16` instruction can remain a three-address instruction in cases unrelated to the packing patch.
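To illustrate why differing input and output accumulator registers matter, here is a small toy model in Python. It is not LLVM code and not taken from the patch. It assumes, as a simplification that this comment does not spell out, that the f16 result is written to only one 16-bit half of each destination register and that the other half keeps whatever the destination held before. The register names and the helpers `wmma_f16_toy`, `tied_result` and `untied_result` are hypothetical.

```python
# Toy model only: a register is a (lo, hi) pair of 16-bit halves.
# Assumption: the instruction writes the lo half of the destination and
# leaves the hi half of the destination untouched.

def wmma_f16_toy(regs, dst, acc, product):
    """Store acc.lo + product in dst.lo; dst.hi keeps its old contents."""
    acc_lo, _ = regs[acc]
    _, old_dst_hi = regs[dst]
    regs[dst] = (acc_lo + product, old_dst_hi)

def tied_result(acc_value, product):
    """Two-address form: destination and accumulator share one register."""
    regs = {"v_acc": acc_value}
    # The shared accumulator is copied first, so the original stays intact
    # for later users and the preserved half comes from the accumulator.
    regs["v_dst"] = regs["v_acc"]
    wmma_f16_toy(regs, "v_dst", "v_dst", product)
    return regs["v_dst"]

def untied_result(acc_value, stale_dst, product):
    """Three-address form: destination is a different register."""
    regs = {"v_acc": acc_value, "v_dst": stale_dst}
    # The preserved half now comes from whatever v_dst held before.
    wmma_f16_toy(regs, "v_dst", "v_acc", product)
    return regs["v_dst"]

zero = (0, 0)  # stands in for a reused zeroinitializer accumulator
print(tied_result(zero, 5))            # (5, 0)
print(untied_result(zero, (7, 9), 5))  # (5, 9)
```

In this model the tied form keeps the accumulator's own upper half, while the three-address form picks up stale contents of the destination register. The same difference shows up for a non-zero accumulator, which matches the concern that the issue is not limited to `zeroinitializer`.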
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158059/new/
https://reviews.llvm.org/D158059