[Parallel_libs-commits] [PATCH] D24619: [SE] Cache CUDA modules

Justin Lebar via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Thu Sep 15 13:22:26 PDT 2016


jlebar added inline comments.

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:130
@@ +129,3 @@
+        return CUresultToError(Result, "cuModuleGetFunction");
+      LoadedModules.emplace(Code, std::make_pair(Module, Function));
+    } else
----------------
Hm.  This makes a copy of "Code" in the map.  Also, every time we do a lookup, we're going to have to compare whole PTX strings, which are potentially very long.

Is there no other identifier we could use as the map key?
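
One possible direction, sketched here only as an illustration: key the cache on the address of the PTX string rather than its contents. This assumes the PTX lives in storage that outlives the cache (e.g. a static array) and that the address is never reused for different code; neither is confirmed by the patch. `ModuleCache`, `LoadedKernel`, and `getOrLoad` are hypothetical names, and the integer handles stand in for the real `CUmodule`/`CUfunction` pair.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the (CUmodule, CUfunction) pair in the patch.
struct LoadedKernel {
  int ModuleHandle;
  int FunctionHandle;
};

// Hypothetical cache keyed on the PTX pointer, not the PTX text.
class ModuleCache {
public:
  // Returns the cached entry for Code, "loading" it on first use.
  // Assumption: Code points to storage that outlives this cache.
  const LoadedKernel &getOrLoad(const char *Code) {
    assert(Code && "PTX pointer must not be null");
    auto It = Loaded.find(Code); // hashes and compares the pointer only
    if (It == Loaded.end()) {
      ++LoadCount; // a real version would load the module and look up the function here
      It = Loaded.emplace(Code, LoadedKernel{NextHandle, NextHandle + 1}).first;
      NextHandle += 2;
    }
    return It->second;
  }

  std::size_t loadCount() const { return LoadCount; }

private:
  std::unordered_map<const char *, LoadedKernel> Loaded;
  int NextHandle = 1;
  std::size_t LoadCount = 0;
};
```

With this key, nothing is copied into the map and a lookup costs one pointer hash plus a pointer comparison. The trade-off is that two byte-identical PTX strings at different addresses would be loaded twice, and a dangling or reused pointer would return a stale entry.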


https://reviews.llvm.org/D24619




