[all-commits] [llvm/llvm-project] 7ca6f5: AMDGPU: Reland: Codegen for v_dual_dot2acc_f32_f16...
Petar Avramovic via All-commits
all-commits at lists.llvm.org
Wed Jun 10 05:07:16 PDT 2026
Branch: refs/heads/users/petar-avramovic/vopd-dot2-reland
Home: https://github.com/llvm/llvm-project
Commit: 7ca6f53f91595ee0781de75667c039d2a170a4a6
https://github.com/llvm/llvm-project/commit/7ca6f53f91595ee0781de75667c039d2a170a4a6
Author: Petar Avramovic <Petar.Avramovic at amd.com>
Date: 2026-06-10 (Wed, 10 Jun 2026)
Changed paths:
M llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
M llvm/lib/Target/AMDGPU/VOP3PInstructions.td
M llvm/lib/Target/AMDGPU/VOPInstructions.td
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.f32.bf16.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.ll
Log Message:
-----------
AMDGPU: Reland: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3
For V_DOT2_F32_F16 and V_DOT2_F32_BF16, add their VOPDName and mark
them with usesCustomInserter, which is used to add register allocation
hints before RA so that dst and src2 are preferably assigned to the same
physical register. When the hint is satisfied, canMapVOP3PToVOPD recognises
the instruction as eligible for VOPD pairing by checking that it is VOP2-like:
dst == src2, no source modifiers, no clamp, and src1 is a register.
Mark both instructions as commutable to allow a literal in src1 to be
moved to src0, since VOPD only permits a literal in src0.
The original patch had a bug: it did not check whether physical src
registers match the register class of the corresponding operand in fullVOPD
instructions. This check is now done via isValidVOPDSrc.
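
The VOP2-like eligibility test and the literal commutation described in the log message can be illustrated with a small standalone sketch. This is not the code from the commit: Operand, Dot2Inst, makeReg, makeLit, isVOP2Like and commuteLiteralToSrc0 are hypothetical names invented here, the operand model is heavily simplified compared to LLVM's MachineInstr, and the register-class check done by isValidVOPDSrc is not modelled at all.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical, simplified operand model (assumption; not LLVM's MachineOperand).
struct Operand {
  enum class Kind { Register, Literal };
  Kind kind = Kind::Register;
  unsigned reg = 0;       // physical register number when kind == Register
  uint32_t literal = 0;   // literal value when kind == Literal
  unsigned modifiers = 0; // nonzero means some source modifier is set
};

// Hypothetical model of a three-source dot2 instruction: dst = dot(src0, src1) + src2.
struct Dot2Inst {
  unsigned dst;
  Operand src0, src1, src2;
  bool clamp;
};

Operand makeReg(unsigned R) {
  Operand O;
  O.kind = Operand::Kind::Register;
  O.reg = R;
  return O;
}

Operand makeLit(uint32_t V) {
  Operand O;
  O.kind = Operand::Kind::Literal;
  O.literal = V;
  return O;
}

// The four conditions named in the log message:
// dst == src2, no source modifiers, no clamp, src1 is a register.
bool isVOP2Like(const Dot2Inst &I) {
  bool DstIsSrc2 =
      I.src2.kind == Operand::Kind::Register && I.src2.reg == I.dst;
  bool NoMods = I.src0.modifiers == 0 && I.src1.modifiers == 0 &&
                I.src2.modifiers == 0;
  bool Src1IsReg = I.src1.kind == Operand::Kind::Register;
  return DstIsSrc2 && NoMods && !I.clamp && Src1IsReg;
}

// Swap src0 and src1 when only src1 holds a literal, so the literal ends up
// in src0. The dot product does not depend on the order of its two inputs.
void commuteLiteralToSrc0(Dot2Inst &I) {
  if (I.src1.kind == Operand::Kind::Literal &&
      I.src0.kind == Operand::Kind::Register)
    std::swap(I.src0, I.src1);
}

// Usage example: a literal in src1 blocks eligibility until it is commuted.
void example() {
  Dot2Inst I{1, makeReg(2), makeLit(0x3c003c00u), makeReg(1), false};
  assert(!isVOP2Like(I));
  commuteLiteralToSrc0(I);
  assert(isVOP2Like(I));
}
```

In this sketch the allocation hint corresponds to the dst == src2 condition: if the allocator had placed dst and src2 in different registers, isVOP2Like would return false and the instruction would stay in its VOP3 form.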