[clang-tools-extra] [MLIR] Enabling Intel GPU Integration. (PR #65539)
Guray Ozen via cfe-commits
cfe-commits at lists.llvm.org
Thu Sep 7 00:15:53 PDT 2023
================
@@ -811,8 +812,13 @@ LogicalResult ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
// descriptor.
Type elementPtrType = this->getElementPtrType(memRefType);
auto stream = adaptor.getAsyncDependencies().front();
+
+ auto isHostShared = rewriter.create<mlir::LLVM::ConstantOp>(
+ loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
Value allocatedPtr =
- allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+ allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+ .getResult();
----------------
grypp wrote:
Regarding `host_shared`, I noticed this code in the examples:
```
%memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64>
```
Can SYCL's runtime allocate `host_shared` data asynchronously? If not, it might be a good idea to reject `host_shared` and `async` when they are used together. FWIW, CUDA and HIP cannot allocate host-shared memory asynchronously. As far as I can see from the PR, the queue is not used when allocating `host_shared`.
Nonetheless, having `async` on `gpu.alloc` is perfectly acceptable. CUDA does support asynchronous device memory allocation.
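
To make the suggestion concrete, here is a minimal, self-contained sketch of the kind of check I have in mind. It is plain C++ rather than MLIR code: `AllocOpModel` and `verifyAlloc` are hypothetical names invented for illustration, and the diagnostic text is an assumption. The real check would presumably live in the op verifier or in the lowering pattern.

```cpp
#include <string>

// Hypothetical stand-in for the two properties of `gpu.alloc` that matter
// here. This is not an MLIR class.
struct AllocOpModel {
  bool hostShared;    // the `host_shared` keyword is present
  bool hasAsyncToken; // the `async` keyword is present (op yields a token)
};

// Returns an empty string when the combination is accepted, otherwise a
// diagnostic message. Only `host_shared` combined with `async` is rejected;
// `async` on a plain device allocation stays valid.
std::string verifyAlloc(const AllocOpModel &op) {
  if (op.hostShared && op.hasAsyncToken)
    return "host_shared allocation cannot be asynchronous";
  return "";
}
```

With this model, `verifyAlloc(AllocOpModel{true, true})` yields the diagnostic, while the other three combinations return an empty string.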
https://github.com/llvm/llvm-project/pull/65539