[llvm] [AMDGPU][GFX12] Restrict scalar subword loads to PAL (PR #117576)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 25 08:44:47 PST 2024
Juan Manuel Martinez Caamaño <juamarti at amd.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/117576 at github.com>
================
@@ -6803,8 +6803,36 @@ bool AMDGPULegalizerInfo::legalizeSBufferLoad(LegalizerHelper &Helper,
   unsigned Size = Ty.getSizeInBits();
   MachineFunction &MF = B.getMF();
   unsigned Opc = 0;
+
+  const unsigned MemSize = (Size + 7) / 8;
+  const Align MemAlign = B.getDataLayout().getABITypeAlign(
+      getTypeForLLT(Ty, MF.getFunction().getContext()));
+
+  // FIXME: When intrinsic definition is fixed, this should have an MMO already.
+  MachineMemOperand *MMO = MF.getMachineMemOperand(
+      MachinePointerInfo(),
+      MachineMemOperand::MOLoad | MachineMemOperand::MODereferenceable |
+          MachineMemOperand::MOInvariant,
+      MemSize, MemAlign);
+
   if (Size < 32 && ST.hasScalarSubwordLoads()) {
     assert(Size == 8 || Size == 16);
+    if (!ST.hasScalarSubwordBufferLoads()) {
+      // fallback to S_BUFFER_LOAD_UBYTE/USHORT
+      MI.getOperand(1).setIntrinsicID(Intrinsic::amdgcn_raw_buffer_load);
+
+      Register ZeroReg =
+          B.getMRI()->createGenericVirtualRegister(LLT::scalar(32));
+      B.buildConstant(ZeroReg, 0);
----------------
arsenm wrote:
```suggestion
auto Zero = B.buildConstant(S32, 0);
```
https://github.com/llvm/llvm-project/pull/117576
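
For readers following the hunk: the `MemSize` computation `(Size + 7) / 8` rounds a bit width up to whole bytes, so the memory operand covers the full load even when the width is not a multiple of 8. The sketch below checks that arithmetic in isolation. The helper name `bytesForBits` is hypothetical and not part of the patch; the 1-bit and 96-bit widths are illustrative inputs only.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the patch): round a bit width up to the
// number of whole bytes needed to hold it, mirroring `(Size + 7) / 8`.
constexpr std::uint64_t bytesForBits(std::uint64_t SizeInBits) {
  return (SizeInBits + 7) / 8;
}

// The two sub-dword sizes accepted by the assert in the hunk.
static_assert(bytesForBits(8) == 1, "8-bit load covers one byte");
static_assert(bytesForBits(16) == 2, "16-bit load covers two bytes");

// Illustrative widths: a partial byte rounds up instead of truncating to 0,
// and a multiple of 8 is left unchanged by the +7 bias.
static_assert(bytesForBits(1) == 1, "partial byte rounds up");
static_assert(bytesForBits(96) == 12, "96 bits cover exactly 12 bytes");
```

Plain integer division (`Size / 8`) would give the same result for the 8- and 16-bit cases, so the `+ 7` bias only matters for widths that are not byte multiples.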