[llvm] [AMDGPU] Fix image_msaa_load waitcnt insertion for pre-gfx12 (PR #90710)
David Stuttard via llvm-commits
llvm-commits at lists.llvm.org
Wed May 1 01:54:16 PDT 2024
https://github.com/dstutt created https://github.com/llvm/llvm-project/pull/90710
https://github.com/llvm/llvm-project/pull/90201 made some fixes for gfx12
image_msaa_load waitcnt insertion.
That fix might break in some situations for pre-gfx12 - this fixes that by
explicitly checking for VSAMPLE, which always requires an s_wait_samplecnt, and
leaves the previous logic intact for non-gfx12.
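
To illustrate the difference between the two classifications, here is a minimal
standalone sketch. It does not use the real LLVM types: FakeInst and its fields
are hypothetical stand-ins for the BaseInfo flags and for the result of
SIInstrInfo::isVSAMPLE(Inst), and classifyOld/classifyNew are illustrative names
for the logic before and after this patch. It also assumes that a pre-gfx12
image_msaa_load has the MSAA flag set but is not VSAMPLE-encoded.

```cpp
#include <cassert>

// Hypothetical stand-ins for this sketch; these are not the real LLVM types.
enum VmemType { VMEM_NOSAMPLER, VMEM_SAMPLER, VMEM_BVH };

struct FakeInst {
  bool BVH;       // models BaseInfo->BVH
  bool Sampler;   // models BaseInfo->Sampler
  bool MSAA;      // models BaseInfo->MSAA
  bool IsVSAMPLE; // models SIInstrInfo::isVSAMPLE(Inst)
};

// Logic removed by this patch: keyed on the MSAA flag.
constexpr VmemType classifyOld(const FakeInst &I) {
  return I.BVH                   ? VMEM_BVH
         : (I.Sampler || I.MSAA) ? VMEM_SAMPLER
                                 : VMEM_NOSAMPLER;
}

// Logic added by this patch: keyed on the VSAMPLE encoding.
constexpr VmemType classifyNew(const FakeInst &I) {
  return I.BVH                        ? VMEM_BVH
         : (I.Sampler || I.IsVSAMPLE) ? VMEM_SAMPLER
                                      : VMEM_NOSAMPLER;
}

// Assumed pre-gfx12 image_msaa_load: MSAA set, no sampler, not VSAMPLE.
constexpr FakeInst PreGfx12MsaaLoad = {false, false, true, false};
// Assumed gfx12 image_msaa_load: MSAA set, no sampler, VSAMPLE-encoded.
constexpr FakeInst Gfx12MsaaLoad = {false, false, true, true};

// The two versions differ only for the pre-gfx12 case.
static_assert(classifyOld(PreGfx12MsaaLoad) == VMEM_SAMPLER, "old: sampler");
static_assert(classifyNew(PreGfx12MsaaLoad) == VMEM_NOSAMPLER, "new: nosampler");
static_assert(classifyNew(Gfx12MsaaLoad) == VMEM_SAMPLER, "gfx12 unchanged");
```

Under these assumptions, only the pre-gfx12 classification changes, which
matches the stated intent of leaving the previous logic intact for non-gfx12.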
From 47b7c72abab4aa4116fc8fd27bfcd3120c804aa3 Mon Sep 17 00:00:00 2001
From: David Stuttard <david.stuttard at amd.com>
Date: Tue, 30 Apr 2024 14:52:00 +0100
Subject: [PATCH] [AMDGPU] Fix image_msaa_load waitcnt insertion for pre-gfx12
https://github.com/llvm/llvm-project/pull/90201 made some fixes for gfx12
image_msaa_load waitcnt insertion.
That fix might break in some situations for pre-gfx12 - this fixes that by
explicitly checking for VSAMPLE, which always requires an s_wait_samplecnt, and
leaves the previous logic intact for non-gfx12.
---
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 15a1db51c6d78b..ebb5a1af7f2d17 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -187,12 +187,12 @@ VmemType getVmemType(const MachineInstr &Inst) {
   const AMDGPU::MIMGInfo *Info = AMDGPU::getMIMGInfo(Inst.getOpcode());
   const AMDGPU::MIMGBaseOpcodeInfo *BaseInfo =
       AMDGPU::getMIMGBaseOpcodeInfo(Info->BaseOpcode);
-  // The test for MSAA here is because gfx12+ image_msaa_load is actually
-  // encoded as VSAMPLE and requires the appropriate s_waitcnt variant for that.
-  // Pre-gfx12 doesn't care since all vmem types result in the same s_waitcnt.
-  return BaseInfo->BVH ? VMEM_BVH
-         : BaseInfo->Sampler || BaseInfo->MSAA ? VMEM_SAMPLER
-                                               : VMEM_NOSAMPLER;
+  // We have to make an additional check for isVSAMPLE here since some
+  // instructions don't have a sampler, but are still classified as sampler
+  // instructions for the purposes of e.g. waitcnt.
+  return BaseInfo->BVH ? VMEM_BVH
+         : (BaseInfo->Sampler || SIInstrInfo::isVSAMPLE(Inst)) ? VMEM_SAMPLER
+                                                               : VMEM_NOSAMPLER;
 }
 unsigned &getCounterRef(AMDGPU::Waitcnt &Wait, InstCounterType T) {