[PATCH] D103225: [AMDGPU] Replace non-kernel function uses of LDS globals by pointers.
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 14 12:11:11 PDT 2021
rampitec added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/replace-lds-by-ptr-call-diamond-shape.ll:15
+; Pointer should be created.
+; CHECK: @lds_used_within_func.ptr = internal unnamed_addr addrspace(3) global i16 undef, align 2
+
----------------
hsmhsm wrote:
> rampitec wrote:
> > hsmhsm wrote:
> > > rampitec wrote:
> > > > This likely should not be i16. You could do it this way:
> > > > ```
> > > > @lds_used_within_func.ptr = internal unnamed_addr addrspace(3) global i8 addrspace(3)* undef, align 2
> > > > ```
> > > > Then use bitcast to i8 addrspace(3)* instead of ptrtoint on store.
> > > >
> > > > But I still think that creating module lds right here is better.
> > > The main goal of using i16 as pointer (with ptrtoint and GEP) is to save memory as much as possible. Using "i8 addrspace(3)*" means we are going to allocate memory size for pointer (not sure if it is 4 or 8 bytes, but definitely greater than 2 bytes). Hence the use of i16 type here.
> > >
> > > Also, I do not think, it is a good idea to create module lds right here. If that was the case, we could have put this code within module LDS pass itself instead of having separate pass (I did it initially, later we decided that pointer-replacement code should go as separate pass).
> > >
> > > This pass is already too big, let's not change the design again and further complicate the stuff which will further delay code submit, and which leads to other serious problems.
> > > The main goal of using i16 as pointer (with ptrtoint and GEP) is to save memory as much as possible. Using "i8 addrspace(3)*" means we are going to allocate memory size for pointer (not sure if it is 4 or 8 bytes, but definitely greater than 2 bytes). Hence the use of i16 type here.
> >
> > Should not sizeof local pointer be 2? Checked the DL: p3:32:32. Sigh!
> >
> > > Also, I do not think, it is a good idea to create module lds right here. If that was the case, we could have put this code within module LDS pass itself instead of having separate pass (I did it initially, later we decided that pointer-replacement code should go as separate pass).
> > >
> > > This pass is already too big, let's not change the design again and further complicate the stuff which will further delay code submit, and which leads to other serious problems.
> >
> > My concern is that you assume these i16 pointers will start a null, but that might not be the case if you rely on the module LDS pass to pack them into the struct along with other LDS and let it sort it.
> >
> > Can you add a test where you have these pointers and also another module LDS which will have higher alignment and size, so module LDS will pack them together? We could then inspect output of 2 passes and the ISA.
> No, what you concern here is not at all arise. Please have a look the new test case - replace-lds-by-ptr-lds-offsets.ll. I have given detailed description within this test, and I hope, it will clarify your above doubt.
Thanks, that helps. Tracked pointers and stores there, it seems to be consistent.
Matt raised a concern that IR passes may refuse to produce a 'noalias' answer in presence of null, but that is another concern which we may be able to solve later. In particular any operation on such pointer is invariant and noalias in a given function, essentially it is just a single load, so we may just annotate it accordingly. Not required for this patch though.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103225/new/
https://reviews.llvm.org/D103225
More information about the llvm-commits
mailing list