<div dir="auto"><div>Yep, let me work on adding the new attribute.</div><div dir="auto"><br></div><div dir="auto">Juneyoung<br><div class="gmail_quote" dir="auto"><div dir="ltr" class="gmail_attr"><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
On 1/18/21 5:38 PM, Ryan Taylor wrote:<br>
> Johannes,<br>
><br>
> Thanks, this makes it a bit more clear, yes. I was a bit confused if it<br>
> was copying the pointer or the data being pointed to. It looks like you are<br>
> proposing the new attribute to fill this hole that the compiler is not<br>
> currently able to tell that this pointer, though "escaped" is not used<br>
> again and is therefore functionally the same as not captured?<br>
<br>
The pointer is not "captured" in the general sense but it is potentially<br>
returned. That allows you to keep the "nocapture" property if you follow<br>
the return value uses as if they might be the pointer itself.<br>
<br>
<br>
> Do you have a phab for the recently proposed no-capture-maybe-returned<br>
> attribute?<br>
<br>
No and I don't have code that makes use of it outside the Attributor.<br>
<br>
@Juneyoung expressed interest in the attribute recently. He, or maybe<br>
you, could write the RFC, add the enum attribute pluming, and then uses<br>
in transformations. We could use the Attributor to do some preliminary<br>
testing but it would be best to play around with the usage of this new<br>
idea in other transformations as well.<br>
<br>
~ Johannes<br>
<br>
<br>
> Thanks.<br>
><br>
><br>
><br>
> On Mon, Jan 18, 2021 at 6:13 PM Johannes Doerfert <<br>
> <a href="mailto:johannesdoerfert@gmail.com" target="_blank" rel="noreferrer">johannesdoerfert@gmail.com</a>> wrote:<br>
><br>
>> Your example/trouble starts with an "identified function local" (for<br>
>> short, IFL) pointer<br>
>> (see llvm::isIdentifiedFunctionLocal), i.a., as it falls out of<br>
>> `malloc`, `alloca`, or<br>
>> a `__restrict` qualified argument.<br>
>><br>
>> Since no access on a pointer not derived from the IFL one can cause the<br>
>> underlying memory to be altered,<br>
>> we can ignore other accesses *iff* we can show they don't use a pointer<br>
>> that is "based-on" the IFL one.<br>
>> The easiest way to prove it is not based on the IFL one is to show the<br>
>> IFL pointer does not escape since<br>
>> then every based-on use can be identified by def-use chains. A call<br>
>> argument that is nocapture guarantees<br>
>> no copy of the pointer is made that outlives the call. If you return the<br>
>> pointer, a copy is made and you<br>
>> broke the contract. We consequently "ignore" accesses through that<br>
>> pointer when we argue about the IFL one<br>
>> as they cannot alias.<br>
>><br>
>> Since we often return a pointer argument, the Attributor uses<br>
>> "no-capture-maybe-returned"<br>
>> which we recently discussed to propose as a new enum attribute. You<br>
>> could then mark the argument as such<br>
>> and capture + use traversals could continue with the intrinsic value to<br>
>> determine all accesses to the IFL.<br>
>><br>
>> I hope this makes it somewhat clearer.<br>
>><br>
>> WRT. memcpy, it does not copy the pointers you put it but the memory<br>
>> they point to. The address of the memory<br>
>> did not escape/was not captured.<br>
>><br>
>> ~ Johannes<br>
>><br>
>><br>
>> On 1/18/21 4:55 PM, Ryan Taylor wrote:<br>
>>> I'm still a bit fuzzy on the definition but that doesn't seem to<br>
>> completely<br>
>>> agree with pointer capture?<br>
>>><br>
>>> " A pointer value *is captured* if the function makes a copy of any part<br>
>> of<br>
>>> the pointer that outlives the call. "<br>
>>><br>
>>> I can replace memcpy in our example and see no issues. memcpy is no<br>
>>> capture.<br>
>>><br>
>>> #define SIZE 5<br>
>>> #include<string><br>
>>> int foo(short a[SIZE], short b[SIZE], short *res) {<br>
>>> short temp[SIZE] = {0, 0, 0, 0, 0};<br>
>>> short *ptr;<br>
>>> ptr = temp;<br>
>>> memcpy(ptr, b, SIZE*2);<br>
>>> short t = temp[0];<br>
>>> for (int i = 1; i < 4; i++) {<br>
>>> if (t > temp[i]) t = temp[i];<br>
>>> }<br>
>>> *res = t;<br>
>>> memcpy(ptr, b, SIZE*2);<br>
>>> t = temp[0];<br>
>>> for (int i = 1; i < 4; i++) {<br>
>>> if (t > temp[i]) t = temp[i];<br>
>>> }<br>
>>> return t;<br>
>>> }<br>
>>><br>
>>> So you're saying that if the memcpy intrinsic, in this case, returned a<br>
>>> pointer (even if not directly used, ie "ptr" in this case) and had<br>
>>> NoCapture attribute, the second set of loads would be removed?<br>
>>><br>
>>> -Ryan<br>
>>><br>
>>> On Mon, Jan 18, 2021 at 4:30 PM Johannes Doerfert <<br>
>>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank" rel="noreferrer">johannesdoerfert@gmail.com</a>> wrote:<br>
>>><br>
>>>> A function that returns a pointer argument "captures" it for now.<br>
>>>><br>
>>>> The Attributor uses "nocapute_maybe_returned" internally but it is not<br>
>>>> an enum attribute (yet).<br>
>>>><br>
>>>><br>
>>>> On 1/15/21 7:32 PM, Ryan Taylor wrote:<br>
>>>>> So this seems to be an issue with the intrinsic setting NoCapture<0><br>
>> (it<br>
>>>>> returns the pointer it takes). We have a similar intrinsic that does<br>
>> the<br>
>>>>> exact same thing but doesn't return the pointer. It also has<br>
>> NoCapture<1><br>
>>>>> and doesn't cause this issue. Also, I managed to substitute memcpy and<br>
>>>>> there were no issues, it also has nocapture.<br>
>>>>><br>
>>>>> Are there some papers/citations which talk more about Pointer<br>
>>>>> Capture/Capture Tracking?<br>
>>>>><br>
>>>>> I saw this blog:<br>
>>>>> <a href="https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/" rel="noreferrer noreferrer" target="_blank">https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/</a><br>
>>>>> And I read the CaptureTracking.h short description but I'd like a<br>
>> clearer<br>
>>>>> grasp.<br>
>>>>><br>
>>>>> Thanks,<br>
>>>>><br>
>>>>> Ryan<br>
>>>>><br>
>>>>><br>
>>>>><br>
>>>>> On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <<a href="mailto:ryta1203@gmail.com" target="_blank" rel="noreferrer">ryta1203@gmail.com</a>><br>
>> wrote:<br>
>>>>>> Ok, thanks. I'll try to come up with a more generic test case.<br>
>>>>>><br>
>>>>>> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert <<br>
>>>>>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank" rel="noreferrer">johannesdoerfert@gmail.com</a>> wrote:<br>
>>>>>><br>
>>>>>>> Nothing comes to mind given the information.<br>
>>>>>>><br>
>>>>>>> ~ Johannes<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> On 1/14/21 12:56 PM, Ryan Taylor wrote:<br>
>>>>>>>> argmemonly, nounwind, writeonly<br>
>>>>>>>><br>
>>>>>>>> -fno-strict-aliasing did not help.<br>
>>>>>>>><br>
>>>>>>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert <<br>
>>>>>>>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank" rel="noreferrer">johannesdoerfert@gmail.com</a>> wrote:<br>
>>>>>>>><br>
>>>>>>>>> Can you share the attributes of the intrinsic declaration,<br>
>>>>>>>>> assuming removing `!tbaa` didn't help.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote:<br>
>>>>>>>>>> Yes, this is for downstream/out of tree target so I'm not sure how<br>
>>>> you<br>
>>>>>>>>>> could reproduce it either but I thought the IR might help a bit.<br>
>>>>>>>>>><br>
>>>>>>>>>> My guess is it's not a bug in GVN as much as an issue with the<br>
>>>>>>> intrinsic<br>
>>>>>>>>>> properties, or lack thereof. I put this in the first post but the<br>
>>>>>>> Alias<br>
>>>>>>>>>> Sets show:<br>
>>>>>>>>>><br>
>>>>>>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8),<br>
>>>>>>> (i16*<br>
>>>>>>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16*<br>
>>>> %19,<br>
>>>>>>> 2),<br>
>>>>>>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2)<br>
>>>>>>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x<br>
>>>> i16>*<br>
>>>>>>>>> %22,<br>
>>>>>>>>>> void <badref><br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert <<br>
>>>>>>>>>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank" rel="noreferrer">johannesdoerfert@gmail.com</a>> wrote:<br>
>>>>>>>>>><br>
>>>>>>>>>>> There is still not enough information here.<br>
>>>>>>>>>>><br>
>>>>>>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and<br>
>>>> the<br>
>>>>>>>>>>> `load`<br>
>>>>>>>>>>> basically encode there is no alias. Easy to verify, remove the<br>
>> ones<br>
>>>>>>> on<br>
>>>>>>>>> the<br>
>>>>>>>>>>> intrinsic.<br>
>>>>>>>>>>><br>
>>>>>>>>>>> ~ Johannes<br>
>>>>>>>>>>><br>
>>>>>>>>>>><br>
>>>>>>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a<br>
>> reproducer<br>
>>>>>>>>>>> would help<br>
>>>>>>>>>>> a lot. So a small IR sample that shows the problem and<br>
>>>> which<br>
>>>>>>> we<br>
>>>>>>>>>>> can run.<br>
>>>>>>>>>>> This is a "redacted?" IR fragment in which I don't know<br>
>>>> what<br>
>>>>>>>>>>> transformation<br>
>>>>>>>>>>> is problematic. I also can not run it through GVN,<br>
>> which<br>
>>>>>>> makes it<br>
>>>>>>>>>>> impossible<br>
>>>>>>>>>>> to reproduce.<br>
>>>>>>>>>>><br>
>>>>>>>>>>><br>
>>>>>>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote:<br>
>>>>>>>>>>>> This is right before GVN:<br>
>>>>>>>>>>>><br>
>>>>>>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture<br>
>> %res)<br>
>>>>>>>>>>>> local_unnamed_addr #0 !dbg !6 {<br>
>>>>>>>>>>>> entry:<br>
>>>>>>>>>>>> %temp = alloca i64, align 8<br>
>>>>>>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]*<br>
>>>>>>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8<br>
>>>>>>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull<br>
>> %0)<br>
>>>> #3,<br>
>>>>>>>>> !dbg !8<br>
>>>>>>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9<br>
>>>>>>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10<br>
>>>>>>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>*<br>
>> nonnull<br>
>>>>>>> %1, <4<br>
>>>>>>>>> x<br>
>>>>>>>>>>> i16><br>
>>>>>>>>>>>> %p, i32 0), !dbg !11, !tbaa !12<br>
>>>>>>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16<br>
>>>>>>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa<br>
>> !17<br>
>>>>>>>>>>>> br label %for.body, !dbg !19<br>
>>>>>>>>>>>><br>
>>>>>>>>>>>> for.body: ; preds =<br>
>> %entry<br>
>>>>>>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]*<br>
>>>>>>> %tmpcast,<br>
>>>>>>>>>>> i32<br>
>>>>>>>>>>>> 0, i32 1, !dbg !20<br>
>>>>>>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa<br>
>> !17<br>
>>>>>>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21<br>
>>>>>>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22<br>
>>>>>>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x<br>
>> i16]*<br>
>>>>>>>>> %tmpcast,<br>
>>>>>>>>>>> i32<br>
>>>>>>>>>>>> 0, i32 2, !dbg !20<br>
>>>>>>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20,<br>
>> !tbaa<br>
>>>> !17<br>
>>>>>>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21<br>
>>>>>>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16<br>
>>>> %spec.select,<br>
>>>>>>> !dbg<br>
>>>>>>>>> !22<br>
>>>>>>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x<br>
>> i16]*<br>
>>>>>>>>> %tmpcast,<br>
>>>>>>>>>>> i32<br>
>>>>>>>>>>>> 0, i32 3, !dbg !20<br>
>>>>>>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20,<br>
>> !tbaa<br>
>>>> !17<br>
>>>>>>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21<br>
>>>>>>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16<br>
>>>> %spec.select.1,<br>
>>>>>>>>> !dbg<br>
>>>>>>>>>>> !22<br>
>>>>>>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23,<br>
>>>> !tbaa<br>
>>>>>>> !17<br>
>>>>>>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>*<br>
>> %2,<br>
>>>>>>> <4 x<br>
>>>>>>>>> i16><br>
>>>>>>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12<br>
>>>>>>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa<br>
>> !17<br>
>>>>>>>>>>>> br label %for.body12, !dbg !26<br>
>>>>>>>>>>>><br>
>>>>>>>>>>>> for.body12: ; preds =<br>
>>>>>>> %for.body<br>
>>>>>>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x<br>
>> i16]*<br>
>>>>>>>>> %tmpcast,<br>
>>>>>>>>>>> i32<br>
>>>>>>>>>>>> 0, i32 1, !dbg !27<br>
>>>>>>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa<br>
>>>> !17<br>
>>>>>>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28<br>
>>>>>>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg<br>
>> !29<br>
>>>>>>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x<br>
>> i16]*<br>
>>>>>>>>> %tmpcast,<br>
>>>>>>>>>>>> i32 0, i32 2, !dbg !27<br>
>>>>>>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27,<br>
>> !tbaa<br>
>>>>>>> !17<br>
>>>>>>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28<br>
>>>>>>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16<br>
>>>>>>> %spec.select39,<br>
>>>>>>>>>>> !dbg<br>
>>>>>>>>>>>> !29<br>
>>>>>>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x<br>
>> i16]*<br>
>>>>>>>>> %tmpcast,<br>
>>>>>>>>>>>> i32 0, i32 3, !dbg !27<br>
>>>>>>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27,<br>
>> !tbaa<br>
>>>>>>> !17<br>
>>>>>>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28<br>
>>>>>>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16<br>
>>>>>>>>> %spec.select39.1,<br>
>>>>>>>>>>>> !dbg !29<br>
>>>>>>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30<br>
>>>>>>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0)<br>
>> #3,<br>
>>>>>>> !dbg<br>
>>>>>>>>> !31<br>
>>>>>>>>>>>> ret i32 %conv24, !dbg !32<br>
>>>>>>>>>>>><br>
>>>>>>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev <<br>
>>>>>>> <a href="mailto:lebedev.ri@gmail.com" target="_blank" rel="noreferrer">lebedev.ri@gmail.com</a>><br>
>>>>>>>>>>> wrote:<br>
>>>>>>>>>>>>> It would be good to have an actual IR reproducer here.<br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev<br>
>>>>>>>>>>>>> <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" rel="noreferrer">llvm-dev@lists.llvm.org</a>> wrote:<br>
>>>>>>>>>>>>>> So given an intrinsic that has a pointer as in/out and<br>
>>>>>>> IntrWriteMem<br>
>>>>>>>>>>>>> property.<br>
>>>>>>>>>>>>>> call intrinsic(address a, ....);<br>
>>>>>>>>>>>>>> loop over address a<br>
>>>>>>>>>>>>>> load from address a + offset<br>
>>>>>>>>>>>>>> call intrinsic (address a, ...);<br>
>>>>>>>>>>>>>> loop over address a<br>
>>>>>>>>>>>>>> load from address a + offset<br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>> GVN is removing the second loads, despite the second call<br>
>>>>>>> overwriting<br>
>>>>>>>>>>>>> the memory starting at address a. AA has the intrinsics marked<br>
>> as<br>
>>>>>>>>>>> unknown<br>
>>>>>>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not<br>
>>>>>>> seeing<br>
>>>>>>>>>>> this<br>
>>>>>>>>>>>>> issue with -fno-unroll-loops.<br>
>>>>>>>>>>>>>> Thanks.<br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>> _______________________________________________<br>
>>>>>>>>>>>>>> LLVM Developers mailing list<br>
>>>>>>>>>>>>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank" rel="noreferrer">llvm-dev@lists.llvm.org</a><br>
>>>>>>>>>>>>>> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
>>>>>>>>>>>> _______________________________________________<br>
>>>>>>>>>>>> LLVM Developers mailing list<br>
>>>>>>>>>>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank" rel="noreferrer">llvm-dev@lists.llvm.org</a><br>
>>>>>>>>>>>> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote></div></div></div>