[llvm] [AMDGPU][True16][CodeGen] optimize codegen for mad-mix in true16 (PR #124995)
Brox Chen via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 21 10:24:41 PDT 2025
================
@@ -5878,6 +5878,14 @@ AMDGPUInstructionSelector::selectVOP3PMadMixModsImpl(MachineOperand &Root,
CheckAbsNeg();
}
+ // Since we looked through FPEXT and removed it, we must also remove
+ // G_TRUNC. G_TRUNC to 16-bits would have a destination in RC VGPR_16, which
+ // is not compatible with MadMix instructions
+ Register PeekSrc = Src;
+ if (mi_match(PeekSrc, *MRI, m_GTrunc(m_Reg(PeekSrc))) &&
+ MRI->getType(PeekSrc).getSizeInBits() == 32)
----------------
broxigarchen wrote:
Hi Matt. This code is triggered in the middle of the isel pass, and the non-32-bit truncate is also generated in the middle of isel, so we cannot use a MIR test to validate it.
For SDAG, I've converted frem.ll to true16 format; that is the case where I saw an 'i64->i16' truncate, in the v4f16 test.
For GISel, since it's not fully supported in true16 yet, I cannot fully verify it. In fact, I think it's better to do this separately.
https://github.com/llvm/llvm-project/pull/124995