[llvm-dev] Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements?

Mon Dec 7 12:10:05 PST 2020

Haven't implemented msan for our backend/architecture yet (not
sure really if it will be feasible considering size of the DSP
etc). Maybe something that could have been caught in our simulator.

> -----Original Message-----
> From: Philip Reames <listmail at philipreames.com>
> Sent: den 7 december 2020 19:59
> To: Björn Pettersson A <bjorn.a.pettersson at ericsson.com>; llvm-dev <llvm-
> dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Is it feasible for LV to vectorize a loop
> accessing A[i] using VF>2 when A has only 2 elements?
> 
> This sounds like a case where using msan would have caught your
> problem?  Did you try that approach instead of static analysis?
> 
> Philip
> 
> On 12/7/20 9:10 AM, Björn Pettersson A wrote:
> > Thanks for the answer Philip.
> >
> > Just to explain the background for the question:
> >
> > My original problem I had was that DeadStoreElimination eliminated
> > stores that was intended to be used by a memcpy. It turned out that
> > the input program was faulty as the memcpy was reading out-of-bounds
> > (although not being detected by clang, nor clang static analyzer).
> > In order to do some code washing, I wanted to try to find more files
> > that might suffer from similar out-of-bounds accesses. I did that by
> > hacking into BasicAliasAnalysis where it compares access sizes with
> > size of underlying objects to catch the problem when alias analysis
> > decided that use in the memcpy was consider as NoAlias due to
> > reading outside the underlying object (later I realized that the
> > same kind of check already exist in Lint.cpp in visitMemoryReference
> > so I could probably just have use -lint to find suspicious files).
> >
> > The annoying thing with the LoopVectorizer was that I got lots of
> > false-positives since LV actually introduce UB in some code paths.
> > I decided to disable LV when doing the code washing to get rid of
> > such false-positives.
> >
> > While it isn't a big problem for me any longer, I still think it is
> > a bit unfortunate that the vectorizer is creating a vector loop
> > with UB. No idea if that would be easy to avoid.
> >
> > /Björn
> >
> >
> >> -----Original Message-----
> >> From: Philip Reames <listmail at philipreames.com>
> >> Sent: den 5 december 2020 00:17
> >> To: Björn Pettersson A <bjorn.a.pettersson at ericsson.com>; llvm-dev
> <llvm-
> >> dev at lists.llvm.org>
> >> Subject: Re: [llvm-dev] Is it feasible for LV to vectorize a loop
> >> accessing A[i] using VF>2 when A has only 2 elements?
> >>
> >> The code the vectorizer emits should be correct for any well defined
> >> value of N.  (A program which runs this with N > 2 is full UB.)
> >>
> >> The existing code doesn't (yet) reason about out of bounds accesses
> from
> >> known object sizes as a means to imply bounds on loop induction
> >> variables.  It would be feasible to do so, it's just not something
> >> anyone has bothered with.
> >>
> >> Philip
> >>
> >> On 12/4/20 2:55 PM, Björn Pettersson A via llvm-dev wrote:
> >>> Hi!
> >>>
> >>> Consider an example like this
> >>>
> >>> int G[1000];
> >>>
> >>> void gazonk(int N) {
> >>>       int A[2] = {0};
> >>>       for (int i = 0; i < N; ++i)
> >>>         G[i] = A[i] + i;
> >>> }
> >>>
> >>>
> >>> When compiling with "-O3 -emit-llvm" ( see
> >> https://protect2.fireeye.com/v1/url?k=ac1ada1d-f3dd08b5-ac1a9a86-
> >> 86827b6aebbf-48fe48a3db5edb61&q=1&e=bbfa90b2-917e-4967-ab8e-
> >> 91a9b322c814&u=https%3A%2F%2Fgodbolt.org%2Fz%2Ff6jWfM )
> >>> the loop is vectorized with VF=4 and we get
> >>>
> >>>     %2 = alloca i64, align 8
> >>>     %3 = bitcast i64* %2 to [2 x i32]*
> >>>     ...
> >>>    vector.body:
> >>>     ...
> >>>     %24 = getelementptr inbounds [2 x i32], [2 x i32]* %3, i64 0, i64
> >> %21, !dbg !35
> >>>     %25 = bitcast i32* %24 to <4 x i32>*, !dbg !35
> >>>     %26 = load <4 x i32>, <4 x i32>* %25, align 8, !dbg !35, !tbaa
> !36
> >>>     ...
> >>>
> >>>
> >>> Loading <4 x i32> from something pointing into [2 x i32] seems like a
> >> bad thing (UB?).
> >>> And I believe that for example BasicAliasAnalysis will assume that
> the
> >> load won't alias with anything else since the size of the access is
> >> greater than the underlying object, so the code in the vector body is
> >> just crap afaict.
> >>> There are some loop guards that perhaps (hopefully) protects from
> >> running the vector body here,
> >>> but isn't it a bit weird thing to introduce such code anyway?
> >>>
> >>> BR,
> >>> Björn
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> https://protect2.fireeye.com/v1/url?k=ec235fdb-b3b866de-ec231f40-
> 86959e472243-831610f68efd4d67&q=1&e=45cdf673-909d-4763-817e-
> 87a251821515&u=https%3A%2F%2Flists.llvm.org%2Fcgi-
> bin%2Fmailman%2Flistinfo%2Fllvm-dev