[PATCH] D81638: AMDGPU/GlobalISel: Fix 96-bit local loads
Mirko Brkusanin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 12 06:58:59 PDT 2020
mbrkusanin added a comment.
Yes, it basically avoids problems of not being able to select 3x32 for local address space. SDag was breaking these down to a ds_read_b64 and ds_read_b32 so I did the same thing for GlobalISel.
I've looked at .td files and it seems that the following pattern can be added so ds_read_b96 can be selected
foreach vt = VReg_64.RegTypes in {
defm : DSReadPat_mc <DS_READ_B64, vt, "load_alignX_local">;
}
but I'm not sure what should the minimal alignment (X) be for this specific instruction. Any idea? For alignment of 4 every test will pass but, otherwise we'll need to break some cases to b64, b32 pairs.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:275
case 96:
- if (!ST.hasDwordx3LoadStores())
+ if (!ST.hasDwordx3LoadStores() || AS == AMDGPUAS::LOCAL_ADDRESS)
return false;
----------------
arsenm wrote:
> The address space doesn't make this special? This willl break SI?
SI will be fine because hasDwordx3LoadStores will be false. Local address space uses ds_read and ds_write so the name is slightly confusing.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81638/new/
https://reviews.llvm.org/D81638
More information about the llvm-commits
mailing list