[PATCH] D120129: [NVPTX] Enhance vectorization of ld.param & st.param
Yaxun Liu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 18 09:53:30 PDT 2022
yaxunl added a comment.
In D120129#3390959 <https://reviews.llvm.org/D120129#3390959>, @tra wrote:
> In D120129#3390733 <https://reviews.llvm.org/D120129#3390733>, @kovdan01 wrote:
>
>> @tra Thanks for your comments! Updated the patch according to the discussion about forcing alignment 16.
>>
>>> I think we should be able to do that to all non-kernel functions if we're compiling without -fgpu-rdc. I think we do reduce visibility of non-kernels in that case, but it would be good to make sure.
>>
>> Checked whether we reduce visibility in such cases, and it looks like we do not.
>
> I was indeed mistaken. AFAICT, we only internalize some device-side variables.
> @yaxunl - Sam, do we change visibility for anything else? I know we must keep kernels visible, but the question is whether we ever internalize non-kernel functions and if not, whether we want to. In this case it would allow us to bump argument and return value alignment.
For HIP, we mark non-kernel device functions with hidden visibility and internalize them in an LLVM pass when compiling with -fno-gpu-rdc.
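
As a rough illustration of the distinction being discussed, here is a host-side C++ analogy, not what clang actually emits for HIP. It assumes a GCC- or Clang-compatible compiler for the visibility attribute; the macro names HIDDEN and EXPORTED and the functions scale and kernel_like are hypothetical and exist only for this sketch.

```cpp
// Hypothetical stand-in macros; assumes a GCC/Clang-compatible compiler.
#define HIDDEN __attribute__((visibility("hidden")))
#define EXPORTED __attribute__((visibility("default")))

// Plays the role of a non-kernel device function. Hidden visibility means it
// is not exported from the final linked image, so once a whole-program pass
// has internalized it, every caller is known and the compiler is free to
// adjust how its arguments and return value are passed.
HIDDEN int scale(int x) { return 2 * x; }

// Plays the role of a kernel. It has to stay externally visible because it is
// looked up by name from outside this translation unit.
EXPORTED void kernel_like(const int *in, int *out, int n) {
  for (int i = 0; i < n; ++i)
    out[i] = scale(in[i]);
}
```

In this analogy, internalizing scale would be similar in effect to declaring it static: only then can its calling convention details, such as the parameter alignment mentioned above, be changed without affecting code in other modules.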
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120129/new/
https://reviews.llvm.org/D120129