[PATCH] D82788: AMDGPU: Fix alignment requirements for 96bit and 128bit local loads and stores

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 14 15:29:16 PDT 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1377
+      // ds_read/write_b96 require 16-byte alignment. gfx9 and onward has
+      // unaligned access support but windows likes to keep it dword aligned.
+      bool IsGFX9Plus = Subtarget->getGeneration() >= AMDGPUSubtarget::GFX9;
----------------
mbrkusanin wrote:
> nhaehnle wrote:
> > mbrkusanin wrote:
> > > nhaehnle wrote:
> > > > arsenm wrote:
> > > > > nhaehnle wrote:
> > > > > > arsenm wrote:
> > > > > > > This is a windows driver bug and doesn't deserve mentioning here. We do not know what the host OS is
> > > > > > I agree that it's a bug, but I find it reasonable to mention it here. I'd change the comment though to specifically call out that this should be considered a bug in the Windows KMD.
> > > > > It's not specific to this instance though, this belongs with the place where we assume unaligned access for amdhsa
> > > > Okay, that's fair.
> > > Is the new wording ok?
> > No, it makes an incorrect statement. The alignment requirement only exists up to gfx8, so it should say "... require 16-byte alignment on gfx8 and older".
> > 
> > I also don't think we enforce dword alignment on gfx9+ anywhere for any other kind of memory access, so we shouldn't do it here. The mess on Windows is a problem, but if we really need to work around it, it should be a target "feature" (misfeature, really, but oh well).
> So you're recommending two new features? One to allow misaligned DS instructions and one that requires dword alignment (currently hasUnalignedDSAccess but should be renamed in this case) alongside default requirement of 16 byte?
I think the features are "supports DS unaligned access" and "unaligned access is enabled". We have unaligned-buffer-access already, which should be moved into the target CPU definitions on targets where it works. We should add a new equivalent feature for DS, also added to the target devices. We also need another feature to enable the unaligned access corresponding to the driver setting.

I'm not sure the unaligned DS alignment helps with the b64/b96/b128 cases (I think they still require alignment even with unaligned support? but this is the kind of thing I always have a hard time figuring out from the documentation)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82788/new/

https://reviews.llvm.org/D82788





More information about the llvm-commits mailing list