[PATCH] D98607: [NVPTX] CUDA provides a memcpy and memset
Johannes Doerfert via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 15 12:17:15 PDT 2021
jdoerfert added a comment.
In D98607#2626634 <https://reviews.llvm.org/D98607#2626634>, @tra wrote:
> It would be good to add a test.
>
> Both NVCC and clang currently lower memcpy to an explicit loop. I'm not sure what effect (if any) allowing memcpy/memset libcalls would have on performance. We may want to benchmark it before landing.
I doubt I have the proper setup to do such benchmarking. I care about malloc/free; this was just a follow-up, because the same CUDA documentation paragraph says memcpy and memset are available as well.
I'm fine with dropping this for now.
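For context, the two lowering strategies mentioned in the quoted comment can be sketched roughly as follows. This is a host-side C++ illustration only, not the NVPTX backend's actual output; the helper names copy_with_loop and copy_with_libcall are hypothetical and do not come from the patch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: roughly what "lowering memcpy to an explicit loop"
// amounts to. The copy is expanded inline, so no external symbol is needed.
void copy_with_loop(unsigned char *dst, const unsigned char *src,
                    std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}

// Hypothetical helper: the alternative, where the copy is emitted as a call
// to a library routine (the "libcall") that the runtime must provide.
void copy_with_libcall(unsigned char *dst, const unsigned char *src,
                       std::size_t n) {
  std::memcpy(dst, src, n);
}
```

Both produce the same bytes in the destination; the open question in the review is only which form performs better on the device, which this sketch does not attempt to measure.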
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D98607/new/
https://reviews.llvm.org/D98607