[llvm-dev] Delinearization validity checks in DependenceAnalysis

Wed May 15 12:21:17 PDT 2019

On 05/15, Bardia Mahjour via llvm-dev wrote:
> Regarding your example, I'm not sure I fully understand the concern and I
> would appreciate it if you could clarify that for me. My understanding is
> that if the intended shape of 'a' is in fact 'a[n][m]' then the example, as
> provided, has an out-of-bound access to start with. To avoid this out-bound
> access one would need to change the upper bound of the j-loop to be 'm-2'.
> Interestingly, if we do that, the current validity checks start to pass and
> we get [0 2] as the dependence vector. Here's a slightly modified version
> of your example that I tried:
> 
> ```
> typedef unsigned long long TT;
> void foo(TT n, TT m, int *a) {
>   for (TT i = 0; i < n; ++i)
>     for (TT j = 0; j < m-2; ++j) {
>       a[i*m + j] = a[i*m + j+2];
>     }
> }
> ```
> 
> If the concern is that the actual intended shape of the array may have been
> something other than a[n][m], and that we are indexing it such that the
> accesses are in-bound with respect to the memory of the whole array but not
> with respect to individual dimensions, then I'm not sure we can reason
> about *any* delinearization statically (except for the limited cases where
> the bounds are compile-time known).

In the example above, it works statically, doesn't it? Generally, if
loop bounds and access functions are "in-sync", it should work statically.
It does not if you do use different/multiple unknown values
(parameters) or confuse Scalar Evolution (different types).

> Am I misunderstanding the issue?
> 
> Bardia Mahjour
> Compiler Optimizations
> Hydra Squad Lead
> IBM Toronto Software Lab
> bmahjour at ca.ibm.com (905) 413-2336
> 
> 
> 
> 
> 
> From:	David Green <David.Green at arm.com>
> To:	"llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>, Bardia
>             Mahjour <bmahjour at ca.ibm.com>
> Cc:	nd <nd at arm.com>
> Date:	2019/05/14 02:50 PM
> Subject:	[EXTERNAL] Re: [llvm-dev] Delinearization validity checks in
>             DependenceAnalysis
> 
> 
> 
> Hello
> 
> Interestingly, the example you provide works for me so long as either it's
> a 32bit target, or the array bounds (n and m) are changed to unsigned.
> 
> For a bit of history, DA used to have a different delinearisation method
> based on geps, but it was known to be invalid in same cases and was
> eventually removed. There was no delinearisation for a while until these
> checks were fixed, enough to be correct, but not as powerful as they could
> be. I believe Pollys delinearisation is much more powerful, and can version
> loops with runtime conditions.
> 
> IIRC, the checks are there for cases like this:
> void foo(unsigned n, unsigned m, int *a) {
>   for (int i = 0; i < n; ++i)
>     for (int j = 0; j < m; ++j) {
>       a[i*m + j] = a[i*m + j+2];
>     }
> }
> 
> The "j-2" can underflow into the previous i iterations space. So the
> distance vector isn't [0 2] (and isn't even [= <]). Unfortunately this is
> perfectly valid in C (I think for the int a[][m] case too).
> 
> See this comment for why they were needed and perhaps a better way to fix
> it:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_llvm_llvm-2Dproject_commit_d143c65de3c884d09197da279d2f04f094efaf15-23diff-2D57473b376037dd3637516348b9b02556L3274&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=aihobyOnVzXW7OPSK1-NiSYQkq7oP3ZSUVc4BemvrVo&m=46eKxI_sFjjeBzn7X-OLXSEUwHN-HVCD16TF9OuyIqc&s=adPvJDhPtFMlaTWihZmvWjXqFUFHDnzcV84oaDGlryM&e=
> 
> 
> Any improvements to the delinearisation code would be most welcome.
> Dave
> 
> 
> 
> 
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Bardia
> Mahjour via llvm-dev <llvm-dev at lists.llvm.org>
> Sent: 13 May 2019 14:48
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] Delinearization validity checks in DependenceAnalysis
> 
> Hi all,
> 
> I have been looking at the `DependenceAnalysis` pass in
> `llvm/include/llvm/Analysis/DependenceAnalysis.h`.
> In order for this analysis to produce accurate dependence vectors for
> multi-dimensional arrays in nested loops, it needs to "delinearize" array
> element accesses to recover the subscripts in each dimension of the array.
> I believe that the current implementation of delinearization is based on
> this paper:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__web.cse.ohio-2Dstate.edu_-7Epouchet.2_doc_ics-2Darticle.15a.pdf&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=aihobyOnVzXW7OPSK1-NiSYQkq7oP3ZSUVc4BemvrVo&m=46eKxI_sFjjeBzn7X-OLXSEUwHN-HVCD16TF9OuyIqc&s=nraVp9R56UlYi0We27kmGSjZGnM296r0HFpbRR62Fzs&e=
> .
> 
> This paper describes how to delinearize the subscripts, and as a last step
> it requires certain conditions to be met in order to validate that the
> delinearized indexes are correct. The `tryDelinearize` function in
> `DependenceAnalysis.cpp` appears to be checking for cases where these
> conditions can be validated *at compile time*:
> 
> ```
> // Statically check that the array bounds are in-range. The first subscript
> we
> // don't have a size for and it cannot overflow into another subscript, so
> is
> // always safe. The others need to be 0 <= subscript[i] < bound, for both
> src
> // and dst.
> // FIXME: It may be better to record these sizes and add them as
> constraints
> // to the dependency checks.
> for (int i = 1; i < size; ++i) {
> if (!isKnownNonNegative(SrcSubscripts[i], SrcPtr))
> return false;
> 
> if (!isKnownLessThan(SrcSubscripts[i], Sizes[i - 1]))
> return false;
> 
> if (!isKnownNonNegative(DstSubscripts[i], DstPtr))
> return false;
> 
> if (!isKnownLessThan(DstSubscripts[i], Sizes[i - 1]))
> return false;
> }
> ```
> 
> The problem is that in a lot of cases these conditions cannot be proven
> statically, even though the delinearized indexes are in fact correct. For
> example consider this simple loop:
> 
> ```
> void foo(int n, int m, int a[][m]) {
> for (int i = 0; i < n-1; ++i)
> for (int j = 2; j < m; ++j) {
> a[i][j] = a[i+1][j-2];
> }
> }
> 
> clang test.c -c -O3 -S -Xclang -disable-llvm-passes -emit-llvm
> opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
> -simplifycfg test.ll -S -o test.simp.ll
> opt test.simp.ll -analyze -da
> ```
> 
> will produce:
> 
> ```
> da analyze - none!
> da analyze - anti [* *|<]!
> da analyze - output [* *]!
> ```
> 
> If the validity checks were not present, the result would be much more
> accurate dependence vector with accurate dependence distances:
> 
> ```
> da analyze - none!
> da analyze - consistent anti [1 -2]!
> da analyze - none!
> ```
> 
> In my experience the static validity checks tend to fail in many common
> cases (involving loops with symbolic bounds). Sometimes this happens
> because SCEV is not able to simplify the expressions due to presence of
> type casts and sign/zero extensions, but other times the conditions are
> just not computable at compile-time.
> 
> So far I haven't been able to come up with an example where the validity
> checks in the current implementation legitimately catch a case of invalid
> delinearization. I've also disabled these checks and run some tests without
> finding any failures (other than improvements in the dependence analysis
> LIT tests).
> 
> I also had a quick look at Polly which benefits from delinearization. From
> what I saw, a much more optimistic approach is taken where the validity
> checks seem to be avoided.
> 
> My questions are:
> 1. Does anyone have more information about why these validity checks are
> necessary in the current implementation, perhaps with some examples showing
> an incorrect delinearization that's possible without those checks?
> 2. Are there any concerns with putting these validity checks under an
> option so that we can more easily disable them and experiment? This could
> also help us improve LIT tests, since some of them have been pessimised to
> compensate for DA's inability to delinearize, and could fail to catch
> regressions as a result of bad changes to the data dependence analysis.
> 
> Looking forward to your help on this.
> 
> Thank you,
> 
> Bardia Mahjour
> Compiler Optimizations
> Hydra Squad Lead
> IBM Toronto Software Lab
> bmahjour at ca.ibm.com (905) 413-2336
> 
> 
> 
> 
> 

> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190515/8180536e/attachment.sig>