[PATCH] Fix aliasing issue caused by MemCpyOptimizer (PR18304)
David Wiberg
dwiberg at gmail.com
Sun Jul 13 10:56:19 PDT 2014
Ping
2014-07-07 18:01 GMT+02:00 David Wiberg <dwiberg at gmail.com>:
> 2014-07-06 5:07 GMT+02:00 Nick Lewycky <nlewycky at google.com>:
>> On 29 June 2014 11:08, David Wiberg <dwiberg at gmail.com> wrote:
>>>
>>> Ping
>>
>>
>> + // Check that src isn't captured by the called function since the
>> + // transformation can cause aliasing issues in that case.
>> + for (unsigned i = 0; i < CS.arg_size(); ++i)
>>
>> LLVM style is to write "for (unsigned i = 0, e = CS.arg_size(); i != e;
>> ++i)". See
>> http://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop
>>
>> + if (CS.getArgument(i) == cpySrc && !CS.doesNotCapture(i))
>> + return false;
>>
>> This looks great. I double-checked the changes to the tests to make sure
>> that you weren't simply applying nocapture on things that were actually
>> capturing, and it looks good to me.
>>
>> Do you have commit access?
>
> Thanks for the review and the information, I have updated the patch accordingly.
>
> I do not have commit access.
>
> - David
>
>>
>>> 2014-06-23 23:22 GMT+02:00 David Wiberg <dwiberg at gmail.com>:
>>> > 2014-06-23 22:33 GMT+02:00 Nick Lewycky <nicholas at mxc.ca>:
>>> >> David Wiberg wrote:
>>> >>>
>>> >>> 2014-06-21 1:29 GMT+02:00 Philip Reames<listmail at philipreames.com>:
>>> >>>>
>>> >>>> Your change does address a correctness issue and does so in a
>>> >>>> reasonable
>>> >>>> way. I wouldn't worry too much about the possible performance
>>> >>>> regression.
>>> >>>>
>>> >>>> One mandatory change:
>>> >>>> - Add a test case which checks the optimization does not apply when
>>> >>>> accessed
>>> >>>> via capturing parameter.
>>> >>>>
>>> >>>> I'd also prefer that you:
>>> >>>> - either use std::distance (rather than iterator subtraction) or
>>> >>>> (preferred)
>>> >>>> use a argument number indexed loop.
>>> >>>> - If you keep the iterator loop, consider a range loop. Or at least
>>> >>>> move
>>> >>>> the declaration of B, E into the initializer of the loop to restrict
>>> >>>> their
>>> >>>> scope.
>>> >>>>
>>> >>>> Philip
>>> >>>>
>>> >>>
>>> >>> Thanks for the review. I have added a test case and changed to use an
>>> >>> argument number indexed loop (hopefully) as per your suggestion, see
>>> >>> attached patch.
>>> >>
>>> >>
>>> >> +define void @test() {
>>> >> + %ptr1 = alloca i8
>>> >> + %ptr2 = alloca i8
>>> >> + call void @foo(i8* %ptr2)
>>> >> + call void @llvm.memcpy.p0i8.p0i8.i32(i8* %ptr1, i8* %ptr2, i32 1,
>>> >> i32 1,
>>> >> i1 false)
>>> >> + ret void
>>> >> +
>>> >> + ; Check that the transformation isn't applied if the called function
>>> >> can
>>> >> + ; capture the pointer argument (i.e. the nocapture attribute isn't
>>> >> present)
>>> >> + ; CHECK-LABEL: @test(
>>> >> + ; CHECK: call void @foo(i8* %ptr2)
>>> >> + ; CHECK-NEXT: call void @llvm.memcpy
>>> >> +}
>>> >>
>>> >> Isn't the transformation legal on this function? If someone wanted to
>>> >> teach
>>> >> memcpyopt to be really smart, it could look at %ptr1 and see that it's
>>> >> entirely dead except for the use in the memcpy, and both %ptr1 and
>>> >> %ptr2 are
>>> >> locally allocated "alloca"s.
>>> >>
>>> >> This testcase could still be added with a "FIXME: missed optz'n" note,
>>> >> but
>>> >> I'm not sure that memcpyopt should even bother with this, it seems like
>>> >> SROA
>>> >> should get it.
>>> >>
>>> >
>>> > Hi Nick,
>>> >
>>> > Thanks for the explanation. I think that you are correct. I was
>>> > looking too much at what the transformation does today. In the
>>> > attached patch the test case is slightly modified to make sure the
>>> > transformation isn't legal.
>>> >
>>> > Since you seemed uncertain about the optimization note I didn't keep
>>> > the previous test.
>>> >
>>> > If approved I need someone to commit it for me.
>>> >
>>> > Best regards
>>> > David
>>> >
>>> >>
>>> >>>> On 06/20/2014 06:13 AM, David Wiberg wrote:
>>> >>>>
>>> >>>> Hello,
>>> >>>>
>>> >>>> The attached patch fixes an aliasing issue caused by MemCpyOptimizer
>>> >>>> by performing the following changes:
>>> >>>> - Added requirement that the callee does not capture the src pointer
>>> >>>> for
>>> >>>> the transformation to apply.
>>> >>>> - Updated test cases by adding nocapture attributes where applicable
>>> >>>>
>>> >>>> See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-June/073819.html
>>> >>>> for some more information. My main concern is performance regressions
>>> >>>> but I haven't been able to get any benchmark results.
>>> >>>>
>>> >>>> Best regards
>>> >>>> David
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> _______________________________________________
>>> >>>> llvm-commits mailing list
>>> >>>> llvm-commits at cs.uiuc.edu
>>> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> _______________________________________________
>>> >>>> llvm-commits mailing list
>>> >>>> llvm-commits at cs.uiuc.edu
>>> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> >>
>>> >>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list