[llvm-branch-commits] [llvm] [OpenMPOpt] Make parallel regions reachable from new DeviceRTL loop functions (PR #150927)

Sergio Afonso via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Fri Oct 3 08:04:00 PDT 2025


skatrak wrote:

Moving this to draft because I've noticed it doesn't currently work when there are different calls to the new DeviceRTL loop functions. The state machine rewrite optimization in OpenMPOpt causes the following code to produce an incorrect result, whereas the same code without the `unused_problematic` subroutine (or compiled with `-mllvm -openmp-opt-disable-state-machine-rewrite`) works:
```f90
! flang -fopenmp -fopenmp-version=52 --offload-arch=gfx1030 test.f90 && OMP_TARGET_OFFLOAD=MANDATORY ./a.out
subroutine test_subroutine(counter)
  implicit none
  integer, intent(out) :: counter
  integer :: i1, i2, n1, n2

  n1 = 100
  n2 = 50

  counter = 0
  !$omp target teams distribute reduction(+:counter)
  do i1=1, n1
    !$omp parallel do reduction(+:counter)
    do i2=1, n2
      counter = counter + 1
    end do
  end do
end subroutine

program main
  implicit none
  integer :: counter
  call test_subroutine(counter)

  ! Should print: 5000
  print '(I0)', counter
end program

subroutine foo(i)
  integer, intent(inout) :: i
end subroutine

! The presence of this unreachable function in the compilation unit causes
! the result of `test_subroutine` to be incorrect. Removing the `distribute`
! OpenMP directive avoids the problem.
subroutine unused_problematic()
  implicit none
  integer :: i

  !$omp target teams
  !$omp distribute
  do i=1, 100
    call foo(i)
  end do
  !$omp end target teams
end subroutine
```
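
The comparison described above could be scripted roughly as follows. This is only a sketch: the compile line and the workaround flag are taken from the report, while the `FLAGS` and `WORKAROUND` variables and the `repro_default` / `repro_workaround` output names are made up for illustration. It assumes the reproducer is saved as `test.f90` and that an offload-capable `flang` is on the `PATH`; otherwise it does nothing.

```shell
#!/bin/sh
# Flags copied from the compile line in the reproducer's first comment.
FLAGS="-fopenmp -fopenmp-version=52 --offload-arch=gfx1030"
# Workaround flag quoted in the report (disables the state machine rewrite).
WORKAROUND="-mllvm -openmp-opt-disable-state-machine-rewrite"

if command -v flang >/dev/null 2>&1 && [ -f test.f90 ]; then
  # Default pipeline: reported to give an incorrect count.
  flang $FLAGS test.f90 -o repro_default &&
    OMP_TARGET_OFFLOAD=MANDATORY ./repro_default

  # State machine rewrite disabled: reported to print 5000.
  flang $FLAGS $WORKAROUND test.f90 -o repro_workaround &&
    OMP_TARGET_OFFLOAD=MANDATORY ./repro_workaround
else
  # Hypothetical fallback so the sketch is harmless on machines without flang.
  echo "flang or test.f90 not found; skipping comparison" >&2
fi
```

The only intended difference between the two builds is the extra `-mllvm` option, so a differing count between the two binaries would point at the state machine rewrite.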

https://github.com/llvm/llvm-project/pull/150927

