[PATCH] D120129: [NVPTX] Enhance vectorization of ld.param & st.param

Yaxun Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 18 09:53:30 PDT 2022


yaxunl added a comment.

In D120129#3390959 <https://reviews.llvm.org/D120129#3390959>, @tra wrote:

> In D120129#3390733 <https://reviews.llvm.org/D120129#3390733>, @kovdan01 wrote:
>
>> @tra Thanks for your comments! Updated the patch according to the discussion about forcing alignment 16.
>>
>>> I think we should be able to do that to all non-kernel functions if we're compiling without -fgpu-rdc. I think we do reduce visibility of non-kernels in that case, but it would be good to make sure.
>>
>> Checked whether we reduce visibility in such cases, and it looks like we do not.
>
> I was indeed mistaken. AFAICT, we only internalize some device-side variables.
> @yaxunl - Sam, do we change visibility for anything else? I know we must keep kernels visible, but the question is whether we ever internalize non-kernel functions and if not, whether we want to. In this case it would allow us to bump argument and return value alignment.

For HIP, we mark non-kernel device functions with hidden visibility and internalize them in an LLVM pass for -fno-gpu-rdc.
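
To make the two steps concrete, here is a toy model in plain C++. It is only a sketch of the idea, not the actual HIP front-end or LLVM pass: `Fn`, `markDeviceFunctions`, `internalizeHidden` and `canRaiseParamAlign` are hypothetical names made up for illustration, and the last helper merely encodes the assumption from the question above that a function with internal linkage has all of its callers in the same module.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for IR properties. These are NOT LLVM types.
enum class Linkage { External, Internal };
enum class Visibility { Default, Hidden };

struct Fn {
  std::string name;
  bool isKernel;
  Linkage linkage;
  Visibility visibility;
};

// Hypothetical step 1: non-kernel device functions get hidden visibility.
// Kernels are left untouched because they must stay visible.
void markDeviceFunctions(std::vector<Fn> &fns) {
  for (Fn &f : fns)
    if (!f.isKernel)
      f.visibility = Visibility::Hidden;
}

// Hypothetical step 2: when relocatable device code is off (rdc == false),
// hidden non-kernel functions are given internal linkage.
void internalizeHidden(std::vector<Fn> &fns, bool rdc) {
  if (rdc)
    return; // other translation units may still reference these functions
  for (Fn &f : fns)
    if (!f.isKernel && f.visibility == Visibility::Hidden)
      f.linkage = Linkage::Internal;
}

// Assumption for illustration: only an internalized non-kernel function has
// all of its call sites in this module, so only its parameter and return
// value alignment could be raised safely.
bool canRaiseParamAlign(const Fn &f) {
  return !f.isKernel && f.linkage == Linkage::Internal;
}
```

In this model a helper function ends up internal only in the -fno-gpu-rdc case, while a kernel keeps external linkage in both modes.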


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120129/new/

https://reviews.llvm.org/D120129


