[libc-commits] [PATCH] D147054: [libc] Begin implementing a library for the RPC server
Joseph Huber via Phabricator via libc-commits
libc-commits at lists.llvm.org
Tue Jun 13 12:33:53 PDT 2023
jhuber6 added inline comments.
================
Comment at: libc/utils/gpu/loader/nvptx/Loader.cpp:207
+ },
+ &memory_stream);
+ rpc_register_callback(
----------------
JonChesterfield wrote:
> Does this work? It looks like the same stream running the kernel is being used to provide malloc/free, and I'd expect that to deadlock
There's a test for this that's been running on https://lab.llvm.org/buildbot/#/builders/46 for a few weeks now and it hasn't deadlocked as far as I can tell. It's a completely separate stream called `memory_stream` that's just created here. The one running the kernel is just called `stream`. This requires CUDA 11.2 IIRC.
================
Comment at: libc/utils/gpu/server/Server.cpp:35
+ : num_devices(num_devices),
+ devices(std::unique_ptr<Device[]>(new Device[num_devices])),
+ reference_count(0u) {}
----------------
JonChesterfield wrote:
> Could we go with a vector of Device instead of the new+array construct?
It's a static size so a constant sized array should be more correct.
================
Comment at: libc/utils/gpu/server/Server.cpp:44
+
+static State *state;
+
----------------
JonChesterfield wrote:
> Why is this a heap allocated thing, as opposed to `static State state;` ?
It's just easier to check if it's been initialized because the pointer is nullable. We coiuld probably make it a static thing and have a flag instead if you'd like.
================
Comment at: libc/utils/gpu/server/Server.cpp:51
+
+ if (state->reference_count == std::numeric_limits<uint32_t>::max())
+ return RPC_STATUS_ERROR;
----------------
JonChesterfield wrote:
> Could make the counter 64 bit and delete the test against max as a counter >= address space size can't overflow
>
> In general the DIY reference counting is a bit odd - is there a reason this isn't a shared_ptr?
Would a shared pointer give us the same semantics? We would be allocating it multiple times and not copying it.
================
Comment at: libc/utils/gpu/server/Server.cpp:107
+ case rpc::Opcode::WRITE_TO_STDOUT: {
+ uint64_t sizes[rpc::MAX_LANE_SIZE] = {0};
+ void *strs[rpc::MAX_LANE_SIZE] = {nullptr};
----------------
JonChesterfield wrote:
> I'm still hopeful that we'll come up with a better idea than rpc::MAX_LANE_SIZE
We could use a vector and push back into it instead, or preallocate according to the size above, but this was the easiest solution.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147054/new/
https://reviews.llvm.org/D147054
More information about the libc-commits
mailing list