[llvm] [Offload] Implement the remaining initial Offload API (PR #122106)
Callum Fare via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 7 09:26:10 PST 2025
================
@@ -0,0 +1,48 @@
+//===-- Memory.td - Memory definitions for Offload ---------*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains Offload API definitions related to memory allocations
+//
+//===----------------------------------------------------------------------===//
+
+def : Enum {
+ let name = "ol_alloc_type_t";
+ let desc = "Represents the type of allocation made with olMemAlloc.";
+ let etors = [
+ Etor<"HOST", "Host allocation">,
+ Etor<"DEVICE", "Device allocation">,
+ Etor<"SHARED", "Shared allocation">
+ ];
+}
+
+def : Function {
+ let name = "olMemAlloc";
+ let desc = "Creates a memory allocation on the specified device.";
+ let params = [
+ Param<"ol_device_handle_t", "Device", "handle of the device to allocate on", PARAM_IN>,
+ Param<"ol_alloc_type_t", "Type", "type of the allocation", PARAM_IN>,
+ Param<"size_t", "Size", "size of the allocation in bytes", PARAM_IN>,
+ Param<"void**", "AllocationOut", "output for the allocated pointer", PARAM_OUT>
+ ];
+ let returns = [
+ Return<"OL_ERRC_INVALID_SIZE", [
+ "`Size == 0`"
+ ]>
+ ];
+}
+
+def : Function {
+ let name = "olMemFree";
+ let desc = "Frees a memory allocation previously made by olMemAlloc.";
+ let params = [
+ Param<"ol_device_handle_t", "Device", "handle of the device to allocate on", PARAM_IN>,
----------------
callumfare wrote:
Yeah, that's a good point. We do need some way of dispatching to the correct plugin, and as you said it's probably not ideal to track this for all pointers.
The UR equivalent ([`urUSMFree`](https://oneapi-src.github.io/unified-runtime/core/api.html#urusmfree)) takes a context parameter, where the context is a group of one or more devices from the same platform. Host allocations belong to a context rather than any specific device.
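For reference, the UR declaration (as given in the linked spec; treat the spec's exact spelling as authoritative):

```c
// Unified Runtime frees USM through a context handle, not a device handle.
ur_result_t urUSMFree(ur_context_handle_t hContext, void *pMem);
```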
I guess the closest thing we have right now is the platform. The user will always know the platform of the device that made the allocation and can keep hold of it for the `olMemFree` call. But by extension that means they could also just track the device itself anyway.
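To make that concrete, here's a minimal sketch of the user-side tracking, assuming the `olMemAlloc` signature from this PR, an `olMemFree(Device, Ptr)` shape for the free call, and the usual `ol_result_t`/`OL_SUCCESS` return conventions; the `tracked_alloc_t` wrapper and the header name are purely illustrative:

```c
#include <OffloadAPI.h> // assumed name for the Offload API header

// Hypothetical user-side wrapper: remember the device at allocation time
// so the matching handle can be passed back to olMemFree later.
typedef struct {
  void *Ptr;
  ol_device_handle_t Device;
} tracked_alloc_t;

ol_result_t allocTracked(ol_device_handle_t Device, ol_alloc_type_t Type,
                         size_t Size, tracked_alloc_t *Out) {
  ol_result_t Res = olMemAlloc(Device, Type, Size, &Out->Ptr);
  if (Res == OL_SUCCESS)
    Out->Device = Device; // Keep the owning device alongside the pointer.
  return Res;
}

ol_result_t freeTracked(tracked_alloc_t *Alloc) {
  // The caller never has to re-derive which platform owns the memory.
  return olMemFree(Alloc->Device, Alloc->Ptr);
}
```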
In terms of ownership, I think it always makes sense for the GPU to own the allocation, because pinned memory is managed by a specific device driver and (presumably) won't be compatible between GPUs from different plugins. Likewise, shared memory can't migrate between arbitrary devices.
The way `olMemAlloc` is designed, I'm not sure it's meaningful to do, for example, an `OL_ALLOC_TYPE_SHARED` allocation on the special `host` device. I'm not sure how to reconcile these different allocation kinds with the existence of the `host` device as a valid allocation target.
Maybe we need to add the concept of a context: a group of one or more devices that has ownership of the allocations, restricted to containing the `host` device plus one or more regular devices from a single plugin/platform. The host device could be implicit. Shared memory should always be migratable between the devices in a context, and pinned memory should be accessible on every device in it. Memcpy could take context parameters instead of device parameters, and it would still be able to handle copies across different contexts to support the use case of copies between different GPUs, etc.
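As a strawman, the context-based API might look something like this; none of these entry points exist today, and all of the `olContext*` names are made up for illustration:

```c
// Strawman only: hypothetical context-based entry points, following the
// naming style of the existing ol* functions. A context owns its
// allocations and groups the implicit host device with one or more
// regular devices from a single platform.
ol_result_t olContextCreate(ol_device_handle_t *Devices, size_t NumDevices,
                            ol_context_handle_t *ContextOut);
ol_result_t olContextMemAlloc(ol_context_handle_t Context,
                              ol_alloc_type_t Type, size_t Size,
                              void **AllocationOut);
ol_result_t olContextMemFree(ol_context_handle_t Context, void *Allocation);
// Memcpy would take contexts rather than devices, and copies across two
// different contexts would still be supported (e.g. between GPUs from
// different plugins).
ol_result_t olContextMemcpy(ol_context_handle_t DstContext, void *Dst,
                            ol_context_handle_t SrcContext, const void *Src,
                            size_t Size);
```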
Fundamentally, we have to track additional information that the underlying APIs don't need to worry about: which platform is managing a given allocation.
It might be possible to allow contexts to contain any devices, but you'd need to be clever about handling pinned memory, and you run into problems like device pointers possibly not being unique. I don't know if that would be worth the effort, or even solvable.
I'm about to be away for a couple of weeks, but I think this direction is worth discussing, even if we kick the can down the road a bit and leave it for a future PR.
https://github.com/llvm/llvm-project/pull/122106
More information about the llvm-commits mailing list