[flang-commits] [flang] [flang][cuda][openacc] Reject UseDevice actual against managed/unified dummy (PR #196428)

Zhen Wang via flang-commits flang-commits at lists.llvm.org
Thu May 7 14:48:40 PDT 2026


https://github.com/wangzpgi created https://github.com/llvm/llvm-project/pull/196428

After #195182 introduced the `UseDevice` attribute, a `use_device(...)` actual was treated as compatible with **any** dummy attribute. Because the matching distance for `UseDevice → managed/unified` is ∞ (`cudaInfMatchingValue`), generic resolution then misreported a clean "no match" as an **ambiguity** when only managed/unified specifics existed.

This PR tightens `AreCompatibleCUDADataAttrs`: a `UseDevice` actual is only compatible with a `Device` dummy or a host (no-attribute) dummy. Other attributes (`Managed`, `Unified`, `Pinned`, ...) require their actual to live in that specific kind of memory.
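The rule described above can be sketched in isolation as follows. This is a minimal standalone model, not the flang sources: `CudaAttr`, `kInfDistance`, `UseDeviceActualIsCompatible`, and `UseDeviceMatchingDistance` are hypothetical names introduced for illustration, standing in for `common::CUDADataAttr`, `cudaInfMatchingValue`, and the `UseDevice` branches of `AreCompatibleCUDADataAttrs` and `GetMatchingDistance`. An empty `std::optional` models a host dummy with no CUDA attribute.

```cpp
#include <optional>

// Hypothetical stand-in for flang's common::CUDADataAttr; only the
// attributes named in this PR are listed.
enum class CudaAttr { Device, Managed, Unified, Pinned, UseDevice };

// Hypothetical stand-in for cudaInfMatchingValue ("no match").
constexpr int kInfDistance = 1000000;

// Compatibility of a use_device(...) actual with a dummy's attribute.
// Mirrors the patched expression `!x || *x == CUDADataAttr::Device`.
constexpr bool UseDeviceActualIsCompatible(std::optional<CudaAttr> dummy) {
  return !dummy || *dummy == CudaAttr::Device;
}

// Matching distance of a use_device(...) actual against a dummy:
// 0 for a Device dummy, 3 for a host dummy, "infinite" otherwise.
constexpr int UseDeviceMatchingDistance(std::optional<CudaAttr> dummy) {
  if (!dummy)
    return 3;
  if (*dummy == CudaAttr::Device)
    return 0;
  return kInfDistance;
}

// Compile-time checks of the cases discussed in the PR description.
static_assert(UseDeviceActualIsCompatible(std::nullopt));
static_assert(UseDeviceActualIsCompatible(CudaAttr::Device));
static_assert(!UseDeviceActualIsCompatible(CudaAttr::Managed));
static_assert(!UseDeviceActualIsCompatible(CudaAttr::Unified));
static_assert(UseDeviceMatchingDistance(CudaAttr::Device) == 0);
static_assert(UseDeviceMatchingDistance(CudaAttr::Managed) == kInfDistance);
```

In this model a generic with only managed and unified specifics has no compatible candidate for a `use_device` actual, so the expected outcome is "no match" rather than a tie between two candidates.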

From ac57b1f540e1c1520191d1650e318336b16b7d5c Mon Sep 17 00:00:00 2001
From: Zhen Wang <zhenw at nvidia.com>
Date: Thu, 7 May 2026 13:21:58 -0700
Subject: [PATCH 1/2] Reject UseDevice actual against managed/unified dummy

---
 flang/lib/Semantics/expression.cpp | 11 ++---------
 flang/test/Semantics/cuf27.cuf     | 31 ++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp
index 8e09e6440a0b7..066ead7fc28e8 100644
--- a/flang/lib/Semantics/expression.cpp
+++ b/flang/lib/Semantics/expression.cpp
@@ -2943,20 +2943,13 @@ static int GetMatchingDistance(const common::LanguageFeatureControl &features,
   // An actual argument with the UseDevice attribute comes from an OpenACC
   // host_data use_device clause: the variable itself is host-resident, but
   // inside the host_data region it is referenced via its device address.
-  // It can therefore match either a host dummy or a device dummy in generic
-  // resolution. The matching distance disambiguates when both kinds of
-  // specifics exist:
-  //   - device dummy:           0 (best match: actual carries a device address)
-  //   - managed/unified dummy:  2 (acceptable: dummy is reachable from device)
-  //   - host dummy (no attr):   3 (acceptable: underlying variable is host)
+  // It matches a Device dummy with distance 0, a host dummy (no attribute)
+  // with distance 3, and is incompatible with any other dummy attribute.
   if (actualDataAttr && *actualDataAttr == common::CUDADataAttr::UseDevice) {
     if (!dummyDataAttr)
       return 3;
     if (*dummyDataAttr == common::CUDADataAttr::Device)
       return 0;
-    if (*dummyDataAttr == common::CUDADataAttr::Managed ||
-        *dummyDataAttr == common::CUDADataAttr::Unified)
-      return 2;
   }
   return cudaInfMatchingValue;
 }
diff --git a/flang/test/Semantics/cuf27.cuf b/flang/test/Semantics/cuf27.cuf
index a3312f86247ab..62c2ace292997 100644
--- a/flang/test/Semantics/cuf27.cuf
+++ b/flang/test/Semantics/cuf27.cuf
@@ -83,3 +83,34 @@ subroutine test_use_device_pinned_use_assoc()
   !$acc exit data delete(vkb)
   deallocate(vkb, c_d, v_d)
 end
+
+! When the generic offers only managed/unified specifics (no device or host
+! specific), a use_device actual must not match either: managed and unified
+! dummies require their actual to live in managed/unified memory, which a
+! use_device actual does not. The call should be rejected.
+module m4
+  interface overl_mu
+    module procedure overl_managed
+    module procedure overl_unified
+  end interface
+contains
+  subroutine overl_managed(x)
+    integer, managed :: x(:)
+  end subroutine
+  subroutine overl_unified(x)
+    integer, unified :: x(:)
+  end subroutine
+end module m4
+
+subroutine test_use_device_managed_unified_only()
+  use m4
+  integer, allocatable :: fx(:)
+  allocate(fx(100))
+  !$acc data copy(fx)
+  !$acc host_data use_device(fx)
+  !ERROR: No specific subroutine of generic 'overl_mu' matches the actual arguments
+  call overl_mu(fx)
+  !$acc end host_data
+  !$acc end data
+  deallocate(fx)
+end

From 717335241f49466833095761c43ed36bd926b08c Mon Sep 17 00:00:00 2001
From: Zhen Wang <zhenw at nvidia.com>
Date: Thu, 7 May 2026 14:45:20 -0700
Subject: [PATCH 2/2] Tighten compatibility: UseDevice is only compatible with
 Device or no-attr (host) dummies.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 flang/lib/Support/Fortran.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/flang/lib/Support/Fortran.cpp b/flang/lib/Support/Fortran.cpp
index 267e227529332..d38e7dc051562 100644
--- a/flang/lib/Support/Fortran.cpp
+++ b/flang/lib/Support/Fortran.cpp
@@ -117,9 +117,11 @@ bool AreCompatibleCUDADataAttrs(std::optional<CUDADataAttr> x,
   if (ignoreTKR.test(common::IgnoreTKR::Device)) {
     return true;
   }
-  // A use_device(...) actual is compatible with any dummy.
+  // A use_device(...) actual is compatible only with a Device dummy or a
+  // host dummy (no CUDA attribute); other attributes (Managed, Unified,
+  // Pinned, ...) require the actual to live in that specific kind of memory.
   if (y && *y == CUDADataAttr::UseDevice)
-    return true;
+    return !x || *x == CUDADataAttr::Device;
   if (!y && isHostDeviceProcedure) {
     return true;
   }


