[PATCH] D158059: [AMDGPU/wmma] - Disable 3-address syntax for f16
Piotr Sobczak via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 18 07:53:59 PDT 2023
piotr added a comment.
> Then, as you say, our register allocation needs to be intelligent enough to keep the matrices packed.
> How would you define the instructions for this to work?
Unfortunately, having looked at this a bit more, I don't think the scheme I proposed is feasible. Even if we add some extra copies to preserve the other half, the twoaddressinstruction pass will not be able to understand that.
The only alternative to adding new intrinsics that I can suggest is to implement the packing entirely in codegen (e.g. after the twoaddressinstruction pass).
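As a rough illustration of why the other half has to be preserved, here is a toy model in Python. It is not LLVM or AMDGPU code. It assumes, as the discussion implies, that an f16 result occupies one 16-bit half of a 32-bit register; the name `write_half` is hypothetical and exists only for this sketch.

```python
# Toy model (not LLVM code): a 32-bit register holding two packed 16-bit halves.
# Assumption for illustration: an f16 result is written to only one half.
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def write_half(old_reg: int, value16: int, high: bool) -> int:
    """Return the register contents after writing one 16-bit half.

    The untouched half is copied from old_reg, so the result depends on the
    previous contents of the destination. That dependency is what a tied
    (two-address) destination operand expresses.
    """
    value16 &= MASK16
    if high:
        # Keep the low half of the old value, replace the high half.
        return (old_reg & MASK16) | (value16 << 16)
    # Keep the high half of the old value, replace the low half.
    return (old_reg & ~MASK16 & MASK32) | value16


# Pack two independent 16-bit results into one register.
reg = 0
reg = write_half(reg, 0x1234, high=False)
reg = write_half(reg, 0xABCD, high=True)
```

Each write reads the old register value, so neither write can be treated as a plain three-address definition of a fresh register without losing the other half.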
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158059/new/
https://reviews.llvm.org/D158059