[llvm-dev] [AA] Question on the full restrict patch

Mon Nov 30 08:01:48 PST 2020

Hi Kelvin, Tarique,

I finally found some time to look into your example in more depth.

As Johannes mentioned in one of the LLVM Alias Analysis Technical calls, mapping a 'generic table' that
tracks and provides alias information for any two pointers would not be an efficient way of keeping track of the aliasing
in llvm-ir.

Reconstructing the (minimal) necessary scopes and intrinsics for a such a 'generic table' is also not an easy problem
(and not fast to solve as well).

I have been wondering on how to tackle the example that you provided. Reconstructing the optimal or a sensible 'noalias' set
by hand did not seem easy.  So I went for a different approach: I created a c++ program that should represent as close as possible
what you are describing in fortran (and the resulting llvm-ir code). There is space for improvement here, Fortran seems to be able
to provide extra alias information. But it should be a good start to show how a (hierarchical based) mapping can be done.

-------  test03.cpp
// COMPILE: ../bin/clang++ -c test03.cpp -S -O3 -ffull-restrict -emit-llvm -mllvm -unroll-max-count=0 -mllvm --interleave-loops=false

// Roughly equivalent c++ structure:

#define restrict __restrict
namespace m {
  struct N {
  int* restrict tgtA01;
  int* restrict tgtA02;
  int* restrict tgtA03;

  int* p_1_SetA;
  int* p_2_SetA;
  int* p_3_SetA;

  // invisible to others ! But not easy to represent in c++, so there is room for improvement here
  int tgtB01;
  int tgtB02;
  int tgtB03;
  } N;

  extern void init_var(int&);

  void init_ptr() {
    N.p_1_SetA = nullptr;
    N.p_2_SetA = nullptr;
    N.p_3_SetA = nullptr;
  }

  void compute(int & __restrict n) { // the 'n' parameters seems to be passed as a 'i32* noalias'
    int idx;

    init_var(N.tgtB01);
    init_var(N.tgtB02);
    init_var(N.tgtB03);

    {
      auto &assoc_tgtA01 = *N.tgtA01;
      auto &assoc_tgtA02 = *N.tgtA02;
      auto &assoc_tgtB02 = N.tgtB02;

      for (idx = 1; idx <= n; ++idx) {
        assoc_tgtA01 = assoc_tgtA01 * N.tgtB03 + assoc_tgtA02 * (*N.tgtA03) + N.tgtB01 * assoc_tgtB02;
      }
    }
  }
}
------

Compiling this code with the full restrict patches  results in similar code as you provided (no loads and stores inside the loop).
Notes:
- only two scopes are used:
-- one for 'n' (a local scope)
-- one for 'tgtA01',..,'tgtA03' (aka, the 'unknown scope', as they reside outside the function).
    Because those are three different pointers, they are treated as pointing to separate objects,
   although they do share the same scope.

Greetings,

Jeroen Dobbelaere
From: Kelvin Li <kli at ca.ibm.com>
Sent: Monday, November 23, 2020 17:32
To: Jeroen Dobbelaere <dobbel at synopsys.com>; llvm-dev at lists.llvm.org
Cc: Tarique Islam <tislam at ca.ibm.com>
Subject: [AA] Question on the full restrict patch

Hi Jeroen,

In working on a scheme to represent Fortran alias information precisely,
we encounter a situation in which two alias sets partially overlap.  We would
like to seek any idea of representing such alias sets, particular, in the scheme of
the full restrict alias approach in D68484.

Consider the following Fortran code,

module m
  implicit none
!set A: variables *with* the TARGET attribute
  integer, allocatable, target :: tgtA01, tgtA02, tgtA03
  integer, pointer :: p_1_SetA, p_2_SetA, p_3_SetA

!set B: variables *without* the TARGET attribute
  integer :: tgtB01, tgtB02, tgtB03

  contains
  subroutine init_ptr()
    p_1_SetA = 0 ; p_2_SetA = 0 ; p_3_SetA = 0
  end subroutine init_ptr

  subroutine compute(n)
    integer :: idx, n

    call init_var(tgtB01)
    call init_var(tgtB02)
    call init_var(tgtB03)

    associate(assoc_tgtA01 => tgtA01, assoc_tgtA02 => tgtA02, assoc_tgtB02 => tgtB02)
      do idx = 1, n
        assoc_tgtA01 = assoc_tgtA01 * tgtB03 + assoc_tgtA02 * tgtA03 + tgtB01 * assoc_tgtB02
      end do
    end associate
  end subroutine compute
end module

The alias analysis in frontend can generate the following alias sets for
"tgtA01" and "tgtA02".

 -------------------------------
 "tgtA01" is aliased to:
    "__m_NMOD__&&_m" "__m_NMOD_init_ptr" "__m_NMOD_compute" "init_var"
    "p_1_SetA" "p_2_SetA" "p_3_SetA"
    "assoc_tgtA01" "tgtA01"
 -------------------------------
 "tgtA02" is aliased to:
    "__m_NMOD__&&_m" "__m_NMOD_init_ptr" "__m_NMOD_compute" "init_var"
    "p_1_SetA" "p_2_SetA" "p_3_SetA"
    "assoc_tgtA02" "tgtA02"
 -------------------------------

If we express this alias info with the scheme that we presented in the last LLVM
Developers' Meeting (https://youtu.be/Vp4GklTXDys<https://urldefense.com/v3/__https:/youtu.be/Vp4GklTXDys__;!!A4F2R9G_pg!M47wpuFvIB8nYHqyIHGWMVh8KC2Qey52i3II-OppwxkfGZroge9Sd3Dsm27DZ_TCRtx4hTyK$>), the optimizer manages to
move the loop invariant code out of the loop body.  We think that the optimizer
can make use of the alias information even though there is an overlap in the two
alias sets.  With the D69542 patch (https://reviews.llvm.org/D69542<https://urldefense.com/v3/__https:/reviews.llvm.org/D69542__;!!A4F2R9G_pg!M47wpuFvIB8nYHqyIHGWMVh8KC2Qey52i3II-OppwxkfGZroge9Sd3Dsm27DZ_TCRh8Eigua$>), we would
like to know if there is any way to express that with the intrinsics proposed in
the full restrict patch.

We attach the ll files for reference.

Note: The options --unroll-max-count=0 --interleave-loops=false
-passes=default<O3> -aa-pipeline=default are used in opt.

Attachments
0. alias-sets-from-fe.txt - full alias sets generated from Fortran frontend
1. before-opt-without-alias-from-fe.ll - IR before opt (NO alias info is emitted)
2. after-opt-without-alias-from-fe.ll - IR after opt (NO alias info is emitted)
3. before-opt-with-alias-from-fe.ll - IR before opt (noalias/alias.scope metadata is emitted)
4. after-opt-with-alias-from-fe.ll - IR after opt (noalias/alias.scope metadata is emitted)

Thanks,
Kelvin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201130/84016d22/attachment.html>