[llvm-dev] GVN removing loads that are affected by call
Ryan Taylor via llvm-dev
llvm-dev at lists.llvm.org
Mon Jan 18 14:55:05 PST 2021
I'm still a bit fuzzy on the definition but that doesn't seem to completely
agree with pointer capture?
" A pointer value *is captured* if the function makes a copy of any part of
the pointer that outlives the call. "
I can replace memcpy in our example and see no issues. memcpy is no
capture.
#define SIZE 5
#include<string>
int foo(short a[SIZE], short b[SIZE], short *res) {
short temp[SIZE] = {0, 0, 0, 0, 0};
short *ptr;
ptr = temp;
memcpy(ptr, b, SIZE*2);
short t = temp[0];
for (int i = 1; i < 4; i++) {
if (t > temp[i]) t = temp[i];
}
*res = t;
memcpy(ptr, b, SIZE*2);
t = temp[0];
for (int i = 1; i < 4; i++) {
if (t > temp[i]) t = temp[i];
}
return t;
}
So you're saying that if the memcpy intrinsic, in this case, returned a
pointer (even if not directly used, ie "ptr" in this case) and had
NoCapture attribute, the second set of loads would be removed?
-Ryan
On Mon, Jan 18, 2021 at 4:30 PM Johannes Doerfert <
johannesdoerfert at gmail.com> wrote:
> A function that returns a pointer argument "captures" it for now.
>
> The Attributor uses "nocapute_maybe_returned" internally but it is not
> an enum attribute (yet).
>
>
> On 1/15/21 7:32 PM, Ryan Taylor wrote:
> > So this seems to be an issue with the intrinsic setting NoCapture<0> (it
> > returns the pointer it takes). We have a similar intrinsic that does the
> > exact same thing but doesn't return the pointer. It also has NoCapture<1>
> > and doesn't cause this issue. Also, I managed to substitute memcpy and
> > there were no issues, it also has nocapture.
> >
> > Are there some papers/citations which talk more about Pointer
> > Capture/Capture Tracking?
> >
> > I saw this blog:
> > https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/
> > And I read the CaptureTracking.h short description but I'd like a clearer
> > grasp.
> >
> > Thanks,
> >
> > Ryan
> >
> >
> >
> > On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> wrote:
> >
> >> Ok, thanks. I'll try to come up with a more generic test case.
> >>
> >> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert <
> >> johannesdoerfert at gmail.com> wrote:
> >>
> >>> Nothing comes to mind given the information.
> >>>
> >>> ~ Johannes
> >>>
> >>>
> >>> On 1/14/21 12:56 PM, Ryan Taylor wrote:
> >>>> argmemonly, nounwind, writeonly
> >>>>
> >>>> -fno-strict-aliasing did not help.
> >>>>
> >>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert <
> >>>> johannesdoerfert at gmail.com> wrote:
> >>>>
> >>>>> Can you share the attributes of the intrinsic declaration,
> >>>>> assuming removing `!tbaa` didn't help.
> >>>>>
> >>>>>
> >>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote:
> >>>>>> Yes, this is for downstream/out of tree target so I'm not sure how
> you
> >>>>>> could reproduce it either but I thought the IR might help a bit.
> >>>>>>
> >>>>>> My guess is it's not a bug in GVN as much as an issue with the
> >>> intrinsic
> >>>>>> properties, or lack thereof. I put this in the first post but the
> >>> Alias
> >>>>>> Sets show:
> >>>>>>
> >>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8),
> >>> (i16*
> >>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16*
> %19,
> >>> 2),
> >>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2)
> >>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x
> i16>*
> >>>>> %22,
> >>>>>> void <badref>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert <
> >>>>>> johannesdoerfert at gmail.com> wrote:
> >>>>>>
> >>>>>>> There is still not enough information here.
> >>>>>>>
> >>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and
> the
> >>>>>>> `load`
> >>>>>>> basically encode there is no alias. Easy to verify, remove the ones
> >>> on
> >>>>> the
> >>>>>>> intrinsic.
> >>>>>>>
> >>>>>>> ~ Johannes
> >>>>>>>
> >>>>>>>
> >>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer
> >>>>>>> would help
> >>>>>>> a lot. So a small IR sample that shows the problem and
> which
> >>> we
> >>>>>>> can run.
> >>>>>>> This is a "redacted?" IR fragment in which I don't know
> what
> >>>>>>> transformation
> >>>>>>> is problematic. I also can not run it through GVN, which
> >>> makes it
> >>>>>>> impossible
> >>>>>>> to reproduce.
> >>>>>>>
> >>>>>>>
> >>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote:
> >>>>>>>> This is right before GVN:
> >>>>>>>>
> >>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res)
> >>>>>>>> local_unnamed_addr #0 !dbg !6 {
> >>>>>>>> entry:
> >>>>>>>> %temp = alloca i64, align 8
> >>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]*
> >>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8
> >>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0)
> #3,
> >>>>> !dbg !8
> >>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9
> >>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10
> >>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull
> >>> %1, <4
> >>>>> x
> >>>>>>> i16>
> >>>>>>>> %p, i32 0), !dbg !11, !tbaa !12
> >>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16
> >>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17
> >>>>>>>> br label %for.body, !dbg !19
> >>>>>>>>
> >>>>>>>> for.body: ; preds = %entry
> >>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]*
> >>> %tmpcast,
> >>>>>>> i32
> >>>>>>>> 0, i32 1, !dbg !20
> >>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17
> >>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21
> >>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22
> >>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]*
> >>>>> %tmpcast,
> >>>>>>> i32
> >>>>>>>> 0, i32 2, !dbg !20
> >>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa
> !17
> >>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21
> >>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16
> %spec.select,
> >>> !dbg
> >>>>> !22
> >>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]*
> >>>>> %tmpcast,
> >>>>>>> i32
> >>>>>>>> 0, i32 3, !dbg !20
> >>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa
> !17
> >>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21
> >>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16
> %spec.select.1,
> >>>>> !dbg
> >>>>>>> !22
> >>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23,
> !tbaa
> >>> !17
> >>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2,
> >>> <4 x
> >>>>> i16>
> >>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12
> >>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17
> >>>>>>>> br label %for.body12, !dbg !26
> >>>>>>>>
> >>>>>>>> for.body12: ; preds =
> >>> %for.body
> >>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]*
> >>>>> %tmpcast,
> >>>>>>> i32
> >>>>>>>> 0, i32 1, !dbg !27
> >>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa
> !17
> >>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28
> >>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29
> >>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]*
> >>>>> %tmpcast,
> >>>>>>>> i32 0, i32 2, !dbg !27
> >>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa
> >>> !17
> >>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28
> >>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16
> >>> %spec.select39,
> >>>>>>> !dbg
> >>>>>>>> !29
> >>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]*
> >>>>> %tmpcast,
> >>>>>>>> i32 0, i32 3, !dbg !27
> >>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa
> >>> !17
> >>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28
> >>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16
> >>>>> %spec.select39.1,
> >>>>>>>> !dbg !29
> >>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30
> >>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3,
> >>> !dbg
> >>>>> !31
> >>>>>>>> ret i32 %conv24, !dbg !32
> >>>>>>>>
> >>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev <
> >>> lebedev.ri at gmail.com>
> >>>>>>> wrote:
> >>>>>>>>> It would be good to have an actual IR reproducer here.
> >>>>>>>>>
> >>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev
> >>>>>>>>> <llvm-dev at lists.llvm.org> wrote:
> >>>>>>>>>> So given an intrinsic that has a pointer as in/out and
> >>> IntrWriteMem
> >>>>>>>>> property.
> >>>>>>>>>> call intrinsic(address a, ....);
> >>>>>>>>>> loop over address a
> >>>>>>>>>> load from address a + offset
> >>>>>>>>>> call intrinsic (address a, ...);
> >>>>>>>>>> loop over address a
> >>>>>>>>>> load from address a + offset
> >>>>>>>>>>
> >>>>>>>>>> GVN is removing the second loads, despite the second call
> >>> overwriting
> >>>>>>>>> the memory starting at address a. AA has the intrinsics marked as
> >>>>>>> unknown
> >>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not
> >>> seeing
> >>>>>>> this
> >>>>>>>>> issue with -fno-unroll-loops.
> >>>>>>>>>> Thanks.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> LLVM Developers mailing list
> >>>>>>>>>> llvm-dev at lists.llvm.org
> >>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>>>>>> _______________________________________________
> >>>>>>>> LLVM Developers mailing list
> >>>>>>>> llvm-dev at lists.llvm.org
> >>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210118/508c7f14/attachment.html>
More information about the llvm-dev
mailing list