[llvm] [AMDGPU] Do not count implicit VGPRs in SIInsertWaitcnts (PR #109049)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 20 01:34:59 PDT 2024
================
@@ -1752,6 +1752,15 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
const bool IsVGPR = TRI->isVectorRegister(*MRI, Op.getReg());
for (int RegNo = Interval.first; RegNo < Interval.second; ++RegNo) {
if (IsVGPR) {
+ // Implicit VGPR defs and uses are never a part of the memory
+ // instructions description and usually present to account for
+ // super-register liveness. Tied implicit sources on loads though
+ // are real uses.
+ // TODO: Most of the other instructions also have implicit uses
+ // for the liveness accounting only.
+ if (Op.isImplicit() && MI.mayLoadOrStore() && !Op.isTied())
----------------
jayfoad wrote:
> So it reads v0 and merges the load back. This may not be needed for a dword load, but what if we read 16 bits and preserve the other half? The pattern will be the same: a tied def.
All these cases are handled elsewhere as WAW hazards if necessary (usually it is not necessary because buffer loads complete in order anyway). You do not need to handle them here. The code here is only for RAW hazards.
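
To make the RAW-only point concrete, here is a minimal sketch of the check in the hunk above. It is not code from the PR: `Operand`, `Instr`, `needsWaitForRAW` and the `PendingLoadDefs` set are hypothetical, simplified stand-ins for LLVM's `MachineOperand`, `MachineInstr` and the pass's score brackets. It assumes only uses are examined at this point and that defs are dealt with separately as WAW hazards.

```cpp
#include <set>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only; these are not
// LLVM's MachineOperand / MachineInstr classes.
struct Operand {
  int Reg;         // VGPR number
  bool IsDef;      // true if the operand is written, false if it is read
  bool IsImplicit; // true if the operand is implicit
  bool IsTied;     // true if the operand is tied to another operand
};

struct Instr {
  bool MayLoadOrStore;
  std::vector<Operand> Ops;
};

// Returns true if the instruction reads a register whose value is still
// being produced by an outstanding load, i.e. a read-after-write hazard.
// PendingLoadDefs models the registers with loads in flight.
bool needsWaitForRAW(const Instr &I, const std::set<int> &PendingLoadDefs) {
  for (const Operand &Op : I.Ops) {
    // Assumption: writes are WAW hazards and are handled elsewhere.
    if (Op.IsDef)
      continue;
    // Mirrors the condition in the hunk under review: skip implicit,
    // non-tied operands of memory instructions.
    if (Op.IsImplicit && I.MayLoadOrStore && !Op.IsTied)
      continue;
    if (PendingLoadDefs.count(Op.Reg))
      return true;
  }
  return false;
}
```

In this model, an implicit non-tied use on a memory instruction never forces a wait, while an explicit use or a tied implicit use of a register with a load in flight does.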
https://github.com/llvm/llvm-project/pull/109049