[llvm] [AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (PR #77633)

Fri Jan 12 04:15:18 PST 2024

jayfoad wrote:

> > > We have some combine logic specifically avoiding extload formation on the basis of sub-dword scalar loads not being supported. I assume those are updated in a later patch?
> > 
> > 
> > I have follow-up patches to update SITargetLowering::isLegalAddressingMode and to disable AMDGPULateCodeGenPrepare.
> > The only other vaguely related thing I could find is SITargetLowering::widenLoad, which should perhaps be disabled.
> > Are you thinking of something else?
> 
> I was thinking TLI.shouldReduceLoadWidth and the LateCodeGenPrepare handling. Plus some of the kernel argument load logic (which we have 3 copies of).

I've added TODO comments at all the relevant places I could find.

`AMDGPUCallLowering::lowerFormalArgumentsKernel` does not appear to widen subword loads yet (there is a TODO comment for that).

I'm not sure whether `SITargetLowering::widenLoad` needs any update. What optimization is it trying to perform?

https://github.com/llvm/llvm-project/pull/77633