jlebar added a comment.

How does this affect e.g. calling memcpy()? There isn't a standard library implementation of this on nvptx, but we do want calls to memcpy() to be lowered to llvm.memcpy so that they can be optimized.

https://reviews.llvm.org/D42319
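
To make the memcpy() concern concrete, here is a minimal sketch of the kind of call in question. It is not taken from the patch: `float_bits` and `self_check` are hypothetical names, the code is plain host C++ rather than device code, and the expected bit pattern assumes an IEEE-754 binary32 `float`. With builtins enabled, clang normally emits a small fixed-size memcpy() like this as the llvm.memcpy intrinsic, which the optimizer can then usually fold into a simple load or bitcast instead of leaving a library call behind.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: reinterpret the bits of a float via memcpy().
// The copy has a small, compile-time-constant size, so once the call is
// lowered to llvm.memcpy the optimizer is typically able to remove it.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Hypothetical sanity check; assumes IEEE-754 binary32, where 1.0f is
// encoded as 0x3F800000.
void self_check() {
    assert(float_bits(1.0f) == 0x3F800000u);
}
```

If the call were instead left as an ordinary external function call, there would be nothing on nvptx to resolve it against, which is why the lowering matters here.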