[libc-commits] [libc] [MLIR] Enabling Intel GPU Integration. (PR #65539)
    Guray Ozen via libc-commits 
    libc-commits at lists.llvm.org
       
    Sat Sep  9 04:41:00 PDT 2023
    
    
  
================
@@ -811,8 +812,13 @@ LogicalResult ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create<mlir::LLVM::ConstantOp>(
+      loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+          .getResult();
----------------
grypp wrote:
> Technically, SYCL provides a more abstract memory management with sycl::buffer and sycl::accessor defining an implicit asynchronous task graph. The allocation details are left to the implementation, asynchronous or synchronous allocation is left to the implementers.
I haven't touched SYCL much, thanks for the explanation. Creating a task graph implicitly sounds interesting. In that case, SYCL users are aware of asynchrony while writing their program. In CUDA (or HIP), users explicitly choose sync or async execution.
> Here the lower-level synchronous USM memory management API of SYCL is used instead, similar to CUDA/HIP memory management.
Yes, that's correct. I don't think there is a USM API that can allocate asynchronously.
> So, should the async allocation in the example be synchronous instead?
Yes, I think this is the correct behaviour. We can disallow combining `host_shared` and `async` on the Op.
Here are the possible IRs:
```
// Valid
%memref = gpu.alloc host_shared (): memref<3x3xi64>
// Valid
%memref = gpu.alloc (): memref<3x3xi64>
// Invalid, USM managers don't allocate async  
%memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64>
// Valid, only for CUDA. AFAIK, SYCL or HIP cannot do that
%memref, %asyncToken = gpu.alloc async [%0] (): memref<3x3xi64>
```
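The rule above could be captured by a small check along these lines. This is only a rough standalone sketch, not the actual verifier in the PR: `AllocFlags` and `isValidAllocCombination` are hypothetical names that model the two keywords on `gpu.alloc`, and the sketch does not model the target-specific restriction that plain `async` allocation is only expected to work on CUDA.

```cpp
// Hypothetical stand-in for the two keywords that may appear on gpu.alloc.
struct AllocFlags {
  bool isAsync;     // op was written with `async [...]`
  bool hostShared;  // op was written with `host_shared`
};

// Returns false only for the combination proposed to be rejected:
// an allocation that is both `async` and `host_shared`.
bool isValidAllocCombination(const AllocFlags &flags) {
  return !(flags.isAsync && flags.hostShared);
}
```

In a real op verifier the same condition would presumably emit a diagnostic rather than return a bool.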
https://github.com/llvm/llvm-project/pull/65539
    
    