[flang-commits] [flang] [flang][cuda] Don't allocate managed descriptors for OpenACC data-clause boxes (PR #201957)

Sun Jun 7 06:43:34 PDT 2026

================
@@ -2099,11 +2099,37 @@ struct EmboxOpConversion : public EmboxCommonConversion<fir::EmboxOp> {
   }
 };
 
-/// Return true if any user of \p val is a gpu.launch_func operation,
-/// indicating the descriptor must be in device-accessible memory.
-static bool isUsedByGPULaunchFunc(mlir::Value val) {
+/// Return true if any user of \p val is a CUDA Fortran kernel launch
+/// (a gpu.launch_func carrying the `cuf.proc_attr` attribute set by
+/// CUFLaunchAttachAttr). For such launches the descriptor is passed by
+/// reference to the device kernel, so its storage must be device-accessible
+/// (managed) memory.
+///
+/// OpenACC kernel launches are also represented as gpu.launch_func at this
+/// point, but their box arguments are handled through the OpenACC data-clause
+/// lowering and do not require a device-accessible descriptor; forcing those
+/// into managed memory is unnecessary and leaks the descriptor (it is never
+/// freed). Hence only CUDA Fortran launches are considered here.
+static bool isUsedByCUFKernelLaunch(mlir::Value val) {
----------------
khaki3 wrote:

You are right. The `isDeviceAllocation` and `isUsedByOpenACCDataClause` checks, applied after `isUsedByGPULaunchFunc`, are enough to decide whether a managed descriptor should be used. `isUsedByCUFKernelLaunch` would tell us whether it's a CUF case, but that's not necessary because we pass any descriptor to `gpu.launch_func`. I removed it.

https://github.com/llvm/llvm-project/pull/201957