[PATCH] D138044: AMDGPU/GlobalISel: Fix crash after mad/fma_mix fails selection
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 16 09:04:50 PST 2022
arsenm added a comment.
In D138044#3930822 <https://reviews.llvm.org/D138044#3930822>, @Petar.Avramovic wrote:
> Previous version of this patch would not make a copy and it would use sgpr directly. Visible only in mir. sifoldoperands folds this copy later anyway.
>
> %11:vgpr_32 = V_FMA_MIX_F32 2, %0, 0, %1, 8, %8, 0, 0, 0, implicit $mode, implicit $exec
I do think we should lean more on post-selection operand folding for dealing with SGPR copies. We should write a new and better version. The current version is "backwards". Traditionally after selection we would have excess SGPRs and try to remove them to legalize instructions. With RegBankSelect making every operand a VGPR, we need one written from the perspective of everything starts as VGPRs and needs to make an optimal choice for SGPR operands
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4817
+ // to avoid creating dead copy instruction.
+ std::tie(Src, Mods) = selectVOP3ModsImpl(Root);
+ MachineInstr *MI = getDefIgnoringCopies(Src, *MRI);
----------------
Pierre-vh wrote:
> With this it looks like we will end up calling `getVOP3ModsImpl` twice. I guess this isn't called often enough (and the "get" function seems simple enough) to cause performance issues but it still feels suboptimal
> Could this be refactored to make it so selectVOP3ModsImpl doesn't need to call "get" again?
> If not then I suppose it's fine as is
>
I agree, you should know everything after the first round of selectVOP3PMadMixModsImpl and should just pass through what to do
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138044/new/
https://reviews.llvm.org/D138044
More information about the llvm-commits
mailing list