[PATCH] D98607: [NVPTX] CUDA provides a memcpy and memset

Johannes Doerfert via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 15 12:17:15 PDT 2021


jdoerfert added a comment.

In D98607#2626634 <https://reviews.llvm.org/D98607#2626634>, @tra wrote:

> It would be good to add a test.
>
> Both NVCC and clang currently lower memcpy to an explicit loop. I'm not sure what effect (if any) allowing memcpy/memset libcall would have on performance. We may want to benchmark it before landing.

I doubt I have the proper setup to do such benchmarking. I care about malloc/free; this was just a follow-up because the same paragraph of the CUDA documentation says memcpy and memset are available too.
I'm fine with dropping this for now.
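
For context, here is a minimal sketch of the two forms being compared above: a copy done through the library routine (the "libcall" form) and the same copy written as an explicit byte loop. It is plain host C++ rather than device code, the helper names are hypothetical, and it is not what NVCC or clang actually emit; it only shows that the two forms are semantically equivalent, so the open question is performance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy through the library routine (the "libcall" form).
void copy_with_libcall(void *dst, const void *src, std::size_t n) {
  std::memcpy(dst, src, n);
}

// Hypothetical helper: the same copy expressed as an explicit byte loop,
// roughly the shape a compiler produces when it expands memcpy inline.
void copy_with_loop(void *dst, const void *src, std::size_t n) {
  assert((dst != nullptr && src != nullptr) || n == 0);
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
}
```

Both helpers leave the destination buffer with identical contents. Which one runs faster on a GPU would have to be measured, as noted in the quoted comment.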


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98607/new/

https://reviews.llvm.org/D98607


