ruiu added a comment.

Though this should work as a local patch, I think this is too specific to the NVidia-supplied binary blob that gets linked to your program. Is there any way to generalize this or change your binary?

https://reviews.llvm.org/D47396