[Parallel_libs-commits] [PATCH] D24619: [SE] Cache CUDA modules
Justin Lebar via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Thu Sep 15 13:22:26 PDT 2016
jlebar added inline comments.
================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:130
@@ +129,3 @@
+ return CUresultToError(Result, "cuModuleGetFunction");
+ LoadedModules.emplace(Code, std::make_pair(Module, Function));
+ } else
----------------
Hm. This makes a copy of "Code" in the map. Also, every time we do a lookup, we're going to have to compare the whole PTX strings, which are potentially very long.
Is there no other identifier we could use as the map key?
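For illustration only, one possible alternative is to key the map on the address of the PTX text instead of its contents, so that lookups hash and compare a single pointer and nothing is copied. This is a rough sketch, not what the patch does: ModuleCache, FakeModule, getOrLoad and loadCount are hypothetical names, FakeModule stands in for the CUmodule/CUfunction pair, and it assumes each PTX blob stays at a stable address for as long as the cache lives.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the (CUmodule, CUfunction) pair, so the sketch
// compiles without the CUDA driver headers.
struct FakeModule {
  int Id;
};

// Hypothetical cache keyed by the address of the PTX text, not its contents.
// Assumption: every PTX blob lives at a stable address while the cache exists.
class ModuleCache {
public:
  // Returns the cached entry for Code, "loading" it on first use.
  FakeModule getOrLoad(const char *Code) {
    auto It = Loaded.find(Code); // hashes and compares the pointer only
    if (It != Loaded.end())
      return It->second;
    FakeModule M{NextId++}; // stands in for the real module load
    ++LoadCount;
    Loaded.emplace(Code, M); // stores the pointer, no string copy
    return M;
  }

  // Number of times a load actually happened (cache misses).
  std::size_t loadCount() const { return LoadCount; }

private:
  std::unordered_map<const char *, FakeModule> Loaded;
  int NextId = 0;
  std::size_t LoadCount = 0;
};
```

The trade-off is that two separate buffers holding identical PTX would be treated as different keys and loaded twice, and a key must never outlive the buffer it points to.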
https://reviews.llvm.org/D24619