[PATCH] D81638: AMDGPU/GlobalISel: Fix 96-bit local loads

Mirko Brkusanin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 12 06:58:59 PDT 2020


mbrkusanin added a comment.

Yes, it basically avoids the problem of not being able to select 3x32 loads for the local address space. SDag was breaking these down into a ds_read_b64 and a ds_read_b32, so I did the same thing for GlobalISel.
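
As a rough illustration of the split described above, here is a minimal standalone sketch. It is not the LLVM implementation: LoadPiece and splitLocalLoad are hypothetical names, and the low-64-bits-first layout at byte offsets 0 and 8 is an assumption made for the example.

```cpp
#include <vector>

// Hypothetical description of one piece of a split load (not an LLVM type).
struct LoadPiece {
  unsigned OffsetBytes; // byte offset from the original address
  unsigned SizeBits;    // width of this piece in bits
};

// Split a 96-bit local load into a 64-bit piece (ds_read_b64) followed by
// a 32-bit piece (ds_read_b32). Any other size is returned unsplit.
// Assumption: the 64-bit piece comes first, at offset 0.
std::vector<LoadPiece> splitLocalLoad(unsigned SizeBits) {
  if (SizeBits != 96)
    return {LoadPiece{0, SizeBits}};
  return {LoadPiece{0, 64}, LoadPiece{8, 32}};
}
```

Calling splitLocalLoad(96) yields two pieces covering all 12 bytes, while sizes such as 64 or 128 pass through as a single piece.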

I've looked at the .td files and it seems that the following pattern can be added so that ds_read_b96 can be selected:

  foreach vt = VReg_96.RegTypes in {
  defm : DSReadPat_mc <DS_READ_B96, vt, "load_alignX_local">;
  }

but I'm not sure what the minimal alignment (X) should be for this specific instruction. Any idea? With an alignment of 4 every test will pass, but otherwise we'll need to break some cases into b64, b32 pairs.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:275
   case 96:
-    if (!ST.hasDwordx3LoadStores())
+    if (!ST.hasDwordx3LoadStores() || AS == AMDGPUAS::LOCAL_ADDRESS)
       return false;
----------------
arsenm wrote:
> The address space doesn't make this special? This will break SI?
SI will be fine because hasDwordx3LoadStores will be false. Local address space uses ds_read and ds_write so the name is slightly confusing.
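
To make the reasoning concrete, here is a small standalone sketch of the 96-bit check from the diff. Subtarget, AddrSpace and isLegal96BitAccess are hypothetical stand-ins for the real subtarget class and AMDGPUAS constants, and it assumes that `return false` means "do not treat the 96-bit access as legal as a single operation".

```cpp
// Hypothetical stand-ins; the real code queries the subtarget and AMDGPUAS.
enum class AddrSpace { Global, Local };

struct Subtarget {
  bool HasDwordx3LoadStores; // assumed false on SI, per the reply above
};

// Mirrors the condition in the diff: a 96-bit access is kept whole only if
// the subtarget has dwordx3 load/stores and the address space is not local.
bool isLegal96BitAccess(const Subtarget &ST, AddrSpace AS) {
  if (!ST.HasDwordx3LoadStores || AS == AddrSpace::Local)
    return false;
  return true;
}
```

With HasDwordx3LoadStores set to false the first condition already rejects the access, so the added address-space check changes nothing for such a subtarget.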



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81638/new/

https://reviews.llvm.org/D81638




