[PATCH] D103225: [AMDGPU] Replace non-kernel function uses of LDS globals by pointers.

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 9 20:44:32 PDT 2021


hsmhsm added a comment.

In D103225#2808734 <https://reviews.llvm.org/D103225#2808734>, @rampitec wrote:

> In D103225#2808671 <https://reviews.llvm.org/D103225#2808671>, @rampitec wrote:
>
>> In D103225#2808121 <https://reviews.llvm.org/D103225#2808121>, @hsmhsm wrote:
>>
>>> Implemented approach (2). Here we actually do not need __builtin_amdgcn_mbcnt_hi(~0u, __builtin_amdgcn_mbcnt_lo(~0u, 0u)). Irrespective of wave64 or wave32, __builtin_amdgcn_mbcnt_lo(~0u, 0u) is enough. The reason is that we only want to identify lane 0. On the other hand, for wave64, if we wanted to identify any lane greater than 31, then we would need __builtin_amdgcn_mbcnt_hi(~0u, __builtin_amdgcn_mbcnt_lo(~0u, 0u)).
>>
>> As far as I understand mbcnt_lo will return 0 for any thread >= 32, so you still need to use mbcnt_hi.

Actually, in wave64 mode, for any thread >= 32, mbcnt_lo returns 32, not 0. So, practically speaking, we can do without mbcnt_hi. But maybe it is good and safe to have mbcnt_hi for wave64; I will make changes accordingly.
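
To illustrate the point above, here is a minimal host-side sketch in plain C that models what mbcnt_lo computes for a given lane. The names popcount32, model_mbcnt_lo and is_lane0_lo_only are hypothetical helpers made up for this illustration; they are not the builtins themselves, and the model assumes the documented semantics (count the set mask bits belonging to lanes below the current lane, restricted to lanes 0..31, then add the base operand).

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count (number of set bits). */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Hypothetical host-side model of v_mbcnt_lo_u32_b32 for lane in 0..63.
 * Assumption: the result is base + popcount(mask & bits of lanes below
 * `lane`), considering only lanes 0..31. */
static uint32_t model_mbcnt_lo(uint32_t mask, uint32_t base, unsigned lane) {
    assert(lane < 64);
    uint64_t below = (1ULL << lane) - 1; /* one bit per lane below `lane` */
    return base + popcount32(mask & (uint32_t)below);
}

/* Lane-0 test using mbcnt_lo alone, with an all-ones mask. */
static int is_lane0_lo_only(unsigned lane) {
    return model_mbcnt_lo(~0u, 0u, lane) == 0;
}
```

Under this model, lanes 32..63 all get 32 from mbcnt_lo, so the comparison against 0 is true only for lane 0.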

>> @b-sumner why do you suggest to nest the hi and lo calls? I think it shall be (__builtin_amdgcn_mbcnt_lo(~0u, 0u) + __builtin_amdgcn_mbcnt_hi(~0u, 0u)) == 0.
>
> Anyhow, manual says:
>
>   Example to compute each thread's position in 0..63:
>   v_mbcnt_lo_u32_b32 v0, -1, 0
>   v_mbcnt_hi_u32_b32 v0, -1, v0
>   // v0 now contains ThreadPosition
>
> So it says to nest them. Plus, it will fail on lane 32 as far as I understand.

No, it does not fail on lane 32. For lane 32, v_mbcnt_lo returns 32, and you pass this return value from v_mbcnt_lo to v_mbcnt_hi. v_mbcnt_hi will not add any additional count (on top of 32), hence it returns 32 again, which is the lane position for lane 32.
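
The lane 32 case can be checked with a small host-side sketch in plain C. This is only a model under an assumption about the semantics (each instruction adds, to its second operand, the number of set mask bits for lanes below the current one, with mbcnt_lo covering lanes 0..31 and mbcnt_hi covering lanes 32..63). The helper names model_mbcnt_lo, model_mbcnt_hi, lane_pos_nested and lane_pos_sum are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count (number of set bits). */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Bit i is set for every lane i strictly below `lane` (lane in 0..63). */
static uint64_t lanes_below(unsigned lane) {
    assert(lane < 64);
    return (1ULL << lane) - 1;
}

/* Hypothetical model of v_mbcnt_lo_u32_b32: counts within lanes 0..31. */
static uint32_t model_mbcnt_lo(uint32_t mask, uint32_t base, unsigned lane) {
    return base + popcount32(mask & (uint32_t)lanes_below(lane));
}

/* Hypothetical model of v_mbcnt_hi_u32_b32: counts within lanes 32..63. */
static uint32_t model_mbcnt_hi(uint32_t mask, uint32_t base, unsigned lane) {
    return base + popcount32(mask & (uint32_t)(lanes_below(lane) >> 32));
}

/* Nested form, as in the manual's example: hi(~0, lo(~0, 0)). */
static uint32_t lane_pos_nested(unsigned lane) {
    return model_mbcnt_hi(~0u, model_mbcnt_lo(~0u, 0u, lane), lane);
}

/* Un-nested form: lo(~0, 0) + hi(~0, 0). */
static uint32_t lane_pos_sum(unsigned lane) {
    return model_mbcnt_lo(~0u, 0u, lane) + model_mbcnt_hi(~0u, 0u, lane);
}
```

In this model, lane 32 gets 32 from mbcnt_lo and mbcnt_hi adds nothing on top, so the nested form yields 32. The nested and un-nested forms also agree for every lane in 0..63, because mbcnt_hi only adds its count to whatever base it is given.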


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103225/new/

https://reviews.llvm.org/D103225


