[Mlir-commits] [mlir] [mlir][gpu] Deprecate gpu::Serialization* passes. (PR #65857)
Aart Bik
llvmlistbot at llvm.org
Mon Sep 11 16:16:41 PDT 2023
================
@@ -78,11 +78,13 @@ void mlir::sparse_tensor::buildSparseCompiler(
// Finalize GPU code generation.
if (gpuCodegen) {
-#if MLIR_GPU_TO_CUBIN_PASS_ENABLE
- pm.addNestedPass<gpu::GPUModuleOp>(createGpuSerializeToCubinPass(
- options.gpuTriple, options.gpuChip, options.gpuFeatures));
-#endif
+ GpuNVVMAttachTargetOptions nvvmTargetOptions;
+ nvvmTargetOptions.triple = options.gpuTriple;
+ nvvmTargetOptions.chip = options.gpuChip;
+ nvvmTargetOptions.features = options.gpuFeatures;
+ pm.addPass(createGpuNVVMAttachTarget(nvvmTargetOptions));
pm.addPass(createGpuToLLVMConversionPass());
+ pm.addPass(createGpuModuleToBinaryPass());
----------------
aartbik wrote:
I looked into the test. The
'cuMemAlloc(&ptr, sizeBytes)' failed with 'CUDA_ERROR_INVALID_VALUE'
messages are indeed printed, but they seem harmless, since they just indicate that the method is called with sizeBytes == 0. I have several places in the GPU libgen path of the sparse compiler where I bump a zero size to a size of 4 just to avoid the message.
Other than that, the GPU path continues calling cuSPARSE and produces the correct result:
( ( 1, 39, 52, 0, 0, 0, 45, 51 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 16, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 25, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 36, 0, 0, 0 ), ( 0, 117, 158, 0, 0, 0, 135, 144 ), ( 0, 156, 318, 0, 0, 0, 301, 324 ), ( 0, 208, 430, 0, 0, 0, 405, 436 ) )
20
So, is the issue with cuMemAlloc(&ptr, sizeBytes) for zero size known? And should we just do something about that in the library wrapper?
@grypp WDYT?
https://github.com/llvm/llvm-project/pull/65857