[PATCH] D20297: AMDGPU/SI: Make kernarg.segment.ptr point to implicit arguments for non HSA
Jan Vesely via llvm-commits
llvm-commits at lists.llvm.org
Fri May 20 10:50:27 PDT 2016
jvesely added a comment.
In http://reviews.llvm.org/D20297#435687, @tstellarAMD wrote:
> In http://reviews.llvm.org/D20297#431404, @arsenm wrote:
>
> > In http://reviews.llvm.org/D20297#431388, @jvesely wrote:
> >
> > > In http://reviews.llvm.org/D20297#431315, @arsenm wrote:
> > >
> > > > In http://reviews.llvm.org/D20297#431298, @jvesely wrote:
> > > >
> > > > > In http://reviews.llvm.org/D20297#431268, @arsenm wrote:
> > > > >
> > > > > > We don't actually want these. We now have the kernarg.segment.ptr intrinsic, so the library should just directly read the offsets from there
> > > > >
> > > > >
> > > > > I assume it points to the beginning of the kernel args.
> > > > > Is the idea to have another intrinsic to read from the end (workdim, global offset), or is it OK to have intrinsics for those?
> > > >
> > > >
> > > > We should fix clover to not read from the end of the arguments (I thought this is what it already did)?
> > >
> > >
> > > some are before the kernel args (global size, local size, ngroups -- this is done by radeonsi/r600 mesa driver). others are after the kernel arguments (work_dim, global-offset -- this is done by clover).
> > > The intention was to move all implicit arguments after the explicit ones, so new implicit arguments can be added without breaking ABI (moving explicit arguments).
> >
> >
> > I think clover should move towards matching the HSA ABI closer. Most of the implicit arguments would then be user SGPR inputs like HSA uses, and the number of implicit args would be reduced.
>
This is a bit confusing. If the implicit args should be in SGPRs, wouldn't we need one intrinsic per implicit arg? and how would the number of implicit args be reduced? the only redundant one is global size (libclc computes it as num_groups * local_size).
> I agree. I actually started working on this last week. I think all implicit args that aren't passed in SGPRs should be at the end of the kernarg segment.
My idea was to switch work_dim and newly implemented global_offset, as those are already appended (the rest can be switched by patches to libclc and clover). However, doesn't this contradict Matt's suggestion to pass implicit arguments in SGPRs?
Repository:
rL LLVM
http://reviews.llvm.org/D20297
More information about the llvm-commits
mailing list