[flang-commits] [flang] [flang][AA] Relax TARGET handling in getCallModRef for local variables (PR #199869)
via flang-commits
flang-commits at lists.llvm.org
Wed May 27 08:46:08 PDT 2026
================
@@ -806,6 +806,73 @@ static bool isSavedLocal(const fir::AliasAnalysis::Source &src) {
return false;
}
+bool AliasAnalysis::mayBeCapturedBefore(mlir::Operation *declareOp,
+ mlir::Operation *op) {
+ if (!declareOp || !op)
+ return true;
+ auto funcOp = op->getParentOfType<mlir::FunctionOpInterface>();
+ if (!funcOp)
+ return true;
+ mlir::Operation *callAnchor = op;
+ while (callAnchor->getParentOp() && callAnchor->getParentOp() != funcOp)
+ callAnchor = callAnchor->getParentOp();
+
+ llvm::SmallVector<mlir::Value, 8> worklist;
+ llvm::SmallPtrSet<mlir::Value, 8> seen;
+ for (mlir::Value res : declareOp->getResults())
+ if (seen.insert(res).second)
+ worklist.push_back(res);
+
+ while (!worklist.empty()) {
+ mlir::Value v = worklist.pop_back_val();
+ for (mlir::OpOperand &use : v.getUses()) {
+ mlir::Operation *userOp = use.getOwner();
+ if (userOp == op || userOp == callAnchor)
+ continue;
+ if (userOp->getBlock() == callAnchor->getBlock() &&
+ callAnchor->isBeforeInBlock(userOp))
+ continue;
+ // OpenACC / OpenMP data-clause ops only describe device staging of the
+ // host variable; they do not bind the host address into Fortran-visible
+ // pointer state. Check before the generic view-propagation path because
+ // some acc data-clause ops also implement FortranObjectViewOpInterface
+ // (their device-pointer result is not a host address-aliasing view).
+ if (mlir::isa<ACC_DATA_CLAUSE_OPS, mlir::acc::ReductionInitOp,
+ mlir::omp::MapInfoOp>(userOp))
+ continue;
----------------
jeanPerier wrote:
To me, it is a bit tricky to say that it is not captured: the acc runtime calls emitted for the data mapping will capture the address in the host data map and use it. Also, on systems with unified memory, the input and output of a copyin are technically the same address, so any capture of the device address is a capture of the host address too.
Here is an example showing why I do not think it is OK to assume that captures of a TARGET made on the device do not count as host captures on unified memory systems:
```
module mymod
contains
subroutine modifies_a_indirectly(p, n)
integer, pointer :: p(:)
integer :: n
! Update values of the target indirectly via the pointer
!$acc parallel loop present(p)
do i = 1, n
p(i) = i
end do
!$acc end parallel loop
end subroutine
end module
program device_assoc_test
use mymod, only: modifies_a_indirectly
implicit none
integer, parameter :: n = 64
integer, target :: a(n)
integer, pointer :: p(:)
integer :: i, errors
a = -1
nullify(p)
! Associate TARGET to POINTER in device code.
!$acc enter data copyin(a)
!$acc enter data create(p)
! Region 1: associate on the device only
!$acc serial present(a, p)
p => a
!$acc end serial
! It cannot be considered that this call does not affect "a" on unified memory systems.
call modifies_a_indirectly(p, n)
errors = count(a /= [(i, i=1,n)])
if (errors == 0) then
print *, "PASS"
else
print *, "FAIL:", errors, " mismatches"
end if
!$acc exit data delete(a, p)
end program
```
It would be invalid for optimizations to assume that `modifies_a_indirectly` does not affect `a` when compiled with `-gpu=unified`.
Also, if we are analyzing code inside the acc region and have walked back to a declare/allocation outside of it for the capture analysis, we should make sure we are not completely ignoring what happens to the device data, which may be captured by device code before the device call we are looking at (that is, we should check that `op` does not belong to code inside the related acc region).
All that to say, I think you have to analyze captures of the device address too in order to cover unified memory systems.
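To make the concern concrete, here is a minimal sketch of the difference between skipping data-clause ops and following their results. It is a toy model only: `ToyOp`, `Kind`, `mayBeCaptured`, and `exampleOps` are hypothetical names invented for illustration, not MLIR or flang APIs, and the real analysis would need to work on the actual op interfaces.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Toy stand-ins for IR concepts; none of these are MLIR or flang types.
enum class Kind { DataClause, PointerAssign, Other };

struct ToyOp {
  Kind kind;
  std::vector<int> operands; // ids of the values this op uses
  int result;                // id of the value it produces, or -1 if none
};

// Returns true if a value reachable from `root` is bound by a pointer
// assignment. With followDevice == false, data-clause ops are skipped
// outright, like the `continue` in the hunk above. With followDevice == true,
// the data-clause result (the device view) is also tracked.
bool mayBeCaptured(const std::vector<ToyOp> &ops, int root, bool followDevice) {
  assert(root >= 0 && "root must be a valid value id");
  std::vector<int> worklist{root};
  std::set<int> seen{root};
  while (!worklist.empty()) {
    int v = worklist.back();
    worklist.pop_back();
    for (const ToyOp &op : ops) {
      bool usesV = std::find(op.operands.begin(), op.operands.end(), v) !=
                   op.operands.end();
      if (!usesV)
        continue;
      if (op.kind == Kind::PointerAssign)
        return true; // the address is now visible through a POINTER
      if (op.kind == Kind::DataClause && followDevice && op.result >= 0 &&
          seen.insert(op.result).second)
        worklist.push_back(op.result); // keep tracking the device view
      // Kind::Other uses are ignored in this toy model.
    }
  }
  return false;
}

// Models the Fortran example: value 0 is the host `a`, value 1 is the device
// view produced for `copyin(a)`, and `p => a` in the serial region uses
// value 1.
std::vector<ToyOp> exampleOps() {
  return {
      ToyOp{Kind::DataClause, {0}, 1},     // copyin(a) -> device view
      ToyOp{Kind::PointerAssign, {1}, -1}, // p => a on the device
  };
}
```

In this model, `mayBeCaptured(exampleOps(), 0, false)` misses the device-side association, while `mayBeCaptured(exampleOps(), 0, true)` reports it, which is the conservative answer needed when host and device addresses coincide.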
https://github.com/llvm/llvm-project/pull/199869