[llvm-dev] Possible bug of Alias Analysis?
Song, Ruiling via llvm-dev
llvm-dev at lists.llvm.org
Tue Oct 17 23:10:52 PDT 2017
> -----Original Message-----
> From: meinersbur at googlemail.com [mailto:meinersbur at googlemail.com] On
> Behalf Of Michael Kruse
> Sent: Wednesday, October 18, 2017 1:18 PM
> To: Song, Ruiling <ruiling.song at intel.com>
> Cc: Michael Kruse <llvm at meinersbur.de>; llvm-dev at lists.llvm.org
> Subject: Re: Possible bug of Alias Analysis?
>
> 2017-10-18 4:48 GMT+02:00 Song, Ruiling <ruiling.song at intel.com>:
> >> -----Original Message-----
> >> From: meinersbur at googlemail.com [mailto:meinersbur at googlemail.com]
> On
> >> Behalf Of Michael Kruse
> >> Sent: Tuesday, October 17, 2017 3:26 PM
> >> To: Song, Ruiling <ruiling.song at intel.com>
> >> Cc: llvm at meinersbur.de; llvm-dev at lists.llvm.org
> >> Subject: Re: Possible bug of Alias Analysis?
> >>
> >> 2017-10-17 8:45 GMT+02:00 Song, Ruiling <ruiling.song at intel.com>:
> >> > Hi,
> >> >
> >> > I am an out-of-tree user of llvm. I am running into an regression issue
> against
> >> llvm 5.0.
> >> > The issue was introduced by "[BasicAA] Use MayAlias instead of PartialAlias
> for
> >> fallback."( https://reviews.llvm.org/D34318)
> >> > I have attached a very simple program to reproduce the issue. The
> symptom is
> >> alias analysis report NoAlias to GVN which cause GVN do wrong optimization.
> >> > The BasicAA reports MayAlias while TBAA reports NoAlias, when query the
> >> pointers of below two instructions in attached sample:
> >> > %5 = load float, float addrspace(4)* %add.ptr.i5, align 4, !tbaa !13
> >> > store i32 %3, i32* %4, align 4, !tbaa !3
> >> > but in fact, they should be aliased as they are writing to/reading from the
> >> same buffer.
> >> >
> >> > you can run 'opt -S -aa -basicaa -tbaa -gvn aa-bug.ll -o -' to see what
> happens.
> >> > I am not sure if we use llvm wrong or is it a bug that we should fix in llvm?
> >>
> >> Your tbaa metadata suggests that the two locations cannot alias. Since
> >> they still alias, the metadata is wrong. It looks like you are using a
> >> plain cast between int/float pointers, which is illegal in C (6.5/7)
> >> and in OpenCL (6.1.8). Try using -fno-strict-aliasing to avoid
> >> emitting tbaa metadata.
> >
> > Thanks for the explanation. I think I get your point.
> > In fact the plain cast was introduced during expanding intrinsic llvm.memcpy().
> > As on our platform, 4byte copy is more efficient. So if we can, we will try to
> cast arguments to llvm.memcpy to 'pointer of int' type no matter what the type
> of the pointers passed in.
> > if we have to disable TBAA to solve the issue, then looks like we have to
> completely disable TBAA through the whole compilation of any OpenCL program.
> > I am not sure if you can share more insight on expanding the llvm.memcpy() and
> keeping the TBAA information correct?
>
> clang should not have emitted tbaa metadata for the arguments of a
> memcpy call. Besides using a union, memcpy is a supported way to copy
> bits between unrelated types (since its arguments are void* and can
> point to anything). When converting to load/store, these should point
> to void*'s metadata ("omnipotent char").
>
> That is, I'd conclude that either the load/store do not originate from
> a memcpy, or there is a serious bug.
Maybe I need to explain in detail how we handle @llvm.memcpy, so that the reason behind the issue is clear.
We provide a pre-built library function declared as __custom_memcpy(char *dst, char *src). This function is compiled into library.bc.
Inside __custom_memcpy(char *dst, char *src), we cast the pointers to int* to get optimal I/O performance, which breaks TBAA.
The user-provided OpenCL program is compiled into another module named opencl.bc. Then we link opencl.bc and library.bc together.
Then a custom lowering pass directly replaces the @llvm.memcpy intrinsic with a call to the library function __custom_memcpy().
After that, a function-inlining pass runs over the module. In this way, @llvm.memcpy is lowered into load/store instructions.
I am not sure whether I have described the process clearly.
So, based on your message: "When converting to load/store, these should point to void*'s metadata ("omnipotent char")."
We use 'int *' to get optimal performance when lowering llvm.memcpy to load/store.
Do you mean that the TBAA metadata for those 'int *' loads/stores should point to "omnipotent char"? Am I right?
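To check my understanding, here is a minimal C sketch of what I think this would look like on the library side. It is an illustration only, not our actual code: the name custom_memcpy_words and the size parameter n are hypothetical, the buffers are assumed to be 4-byte aligned and non-overlapping, and it relies on the GCC/Clang may_alias type attribute, which as far as I understand makes accesses through that type carry the same TBAA information as char accesses.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang extension. Accesses through this type are
 * treated like char accesses for type-based alias analysis, so they
 * may alias objects of any type (float, int, ...). */
typedef uint32_t __attribute__((may_alias)) word_alias_t;

/* Hypothetical stand-in for __custom_memcpy. The size parameter n is
 * added for illustration. Assumes dst and src are 4-byte aligned and
 * do not overlap. */
void custom_memcpy_words(char *dst, const char *src, size_t n)
{
    size_t words = n / sizeof(uint32_t);
    word_alias_t *d = (word_alias_t *)dst;
    const word_alias_t *s = (const word_alias_t *)src;

    /* Copy 4 bytes at a time through the may_alias type. */
    for (size_t i = 0; i < words; ++i)
        d[i] = s[i];

    /* Copy any remaining tail bytes one at a time. */
    for (size_t i = words * sizeof(uint32_t); i < n; ++i)
        dst[i] = src[i];
}
```

If this is right, the 4-byte loads/stores produced after inlining would no longer be tagged as plain 'int' accesses, so TBAA could not report NoAlias against the float accesses in the user program.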
Ruiling
>
> Michael