[llvm-commits] [PATCH] Use GetUnderlyingObjects in misched
Andrew Trick
atrick at apple.com
Fri Dec 7 16:16:50 PST 2012
On Dec 6, 2012, at 10:28 AM, Andrew Trick <atrick at apple.com> wrote:
> Hal,
>
> I'm hoping to do some compile time measurements soon. I'll apply your patch and verify that it isn't a problem for the usual benchmarks. I agree that GetUnderlyingObjects should use its MaxLookup parameter correctly. That's a more general problem that doesn't need to hold up the patch.
I don't see any alarming compile times with your patch. Actually I tested a modified version of the patch where I removed the inner loop that revisits all found objects at each step. Take a look at the modifications. If you agree, then please check in.
-Andy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: misched_objs-andy.patch
Type: application/octet-stream
Size: 12871 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121207/e6dc93ad/attachment.obj>
-------------- next part --------------
> As a side note, I recently improved getUnderlyingObjectFromInt() to handle the case in which the address offset comes from a phi.
>
> While your waiting for my compile time results, can you address this comment:
>
> ---
> What if you mark the results of GetUnderlyingObjects visited like this?
>
> + for (SmallVector<Value *, 4>::iterator I = Objs.begin(), IE = Objs.end();
> + I != IE; ++I) {
> + V = *I;
> + if (!Visited.insert(V))
> + continue;
>
> Then do you still need this?
>
> + if (!Visited.count(O))
> + Working.push_back(O);
>
> Or this?
>
> + bool Found = false;
> + for (SmallVectorImpl<Value *>::iterator J = Objects.begin(),
> + JE = Objects.end(); J != JE; ++J)
> + if (V == *J) {
> + Found = true;
> + break;
> + }
> +
> + if (!Found)
>
> -Andy
>
> On Dec 6, 2012, at 9:40 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>> Sergei, thanks!
>>
>> Evan, Andy, the changes on my end also look small. I'd need to setup the benchmarks on a quieter machine to get real numbers, and I've not had a chance to do that yet. Do either of you object to committing this in the mean time?
>>
>> Also, if there is an unacceptably-large performance impact, I think that the correct way to address the problem would be with a follow-up patch to provide GetUnderlyingObjects with a recursion limit (in the current implementation, it passes its MaxLookup parameter to GetUnderlyingObject, but does not use it itself to limit the number of phis and selects it looks through).
>>
>> Thanks again,
>> Hal
>>
>> ----- Original Message -----
>>> From: "Sergei Larin" <slarin at codeaurora.org>
>>> To: "Hal Finkel" <hfinkel at anl.gov>, "Evan Cheng" <evan.cheng at apple.com>
>>> Cc: llvm-commits at cs.uiuc.edu
>>> Sent: Monday, December 3, 2012 10:45:54 AM
>>> Subject: RE: [llvm-commits] [PATCH] Use GetUnderlyingObjects in misched
>>>
>>> It was not much - I did not keep the exact numbers, but it was
>>> somewhere under 1-2% for a pathological case.
>>> I think the patch itself is very important and I want to see it in
>>> the tree asap... even if it lengthens the analysis time a bit.
>>> Thanks.
>>>
>>> Sergei
>>>
>>> ---
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> hosted by The Linux Foundation
>>>
>>>
>>>> -----Original Message-----
>>>> From: Hal Finkel [mailto:hfinkel at anl.gov]
>>>> Sent: Friday, November 30, 2012 10:31 PM
>>>> To: Evan Cheng
>>>> Cc: Sergei Larin; llvm-commits at cs.uiuc.edu
>>>> Subject: Re: [llvm-commits] [PATCH] Use GetUnderlyingObjects in
>>>> misched
>>>>
>>>> ----- Original Message -----
>>>>> From: "Evan Cheng" <evan.cheng at apple.com>
>>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>>> Cc: "Sergei Larin" <slarin at codeaurora.org>,
>>>>> llvm-commits at cs.uiuc.edu
>>>>> Sent: Sunday, November 25, 2012 1:50:31 PM
>>>>> Subject: Re: [llvm-commits] [PATCH] Use GetUnderlyingObjects in
>>>>> misched
>>>>>
>>>>> I'm also concerned about compile time impact. Do you have some
>>>>> numbers?
>>>>
>>>> Not anything comprehensive, but I'll run some tests. We certainly
>>>> may
>>>> want to limit the recursion depth on targets that don't see any
>>>> significant benefit.
>>>>
>>>> Sergei, can you quantify your observed compile-time effect?
>>>>
>>>> Thanks again,
>>>> Hal
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Evan
>>>>>
>>>>> On Nov 20, 2012, at 11:04 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Sergei Larin" <slarin at codeaurora.org>
>>>>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>>>>> Cc: llvm-commits at cs.uiuc.edu, "Andrew Trick"
>>>>>>> <atrick at apple.com>
>>>>>>> Sent: Tuesday, November 20, 2012 12:25:35 PM
>>>>>>> Subject: RE: [llvm-commits] [PATCH] Use GetUnderlyingObjects
>>>>>>> in
>>>>>>> misched
>>>>>>>
>>>>>>> Hal,
>>>>>>>
>>>>>>> I tried it with Hexagon and it seems to work fine. The only
>>>>>>> (minor)
>>>>>>> concern, it seems to affect my compile time, although very
>>>>>>> slightly.
>>>>>>> Thank you for allowing me time for review.
>>>>>>
>>>>>> Not a problem, thanks for reviewing!
>>>>>>
>>>>>> -Hal
>>>>>>
>>>>>>>
>>>>>>> Sergei
>>>>>>>
>>>>>>> ---
>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora
>>>>>>> Forum,
>>>>>>> hosted by The Linux Foundation
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Hal Finkel [mailto:hfinkel at anl.gov]
>>>>>>>> Sent: Monday, November 19, 2012 2:29 PM
>>>>>>>> To: Sergei Larin
>>>>>>>> Cc: llvm-commits at cs.uiuc.edu; Andrew Trick
>>>>>>>> Subject: Re: [llvm-commits] [PATCH] Use GetUnderlyingObjects
>>>>>>>> in
>>>>>>>> misched
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Sergei Larin" <slarin at codeaurora.org>
>>>>>>>>> To: "Hal Finkel" <hfinkel at anl.gov>, "Andrew Trick"
>>>>>>>>> <atrick at apple.com>
>>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>>> Sent: Monday, November 19, 2012 2:26:01 PM
>>>>>>>>> Subject: RE: [llvm-commits] [PATCH] Use GetUnderlyingObjects
>>>>>>>>> in
>>>>>>>>> misched
>>>>>>>>>
>>>>>>>>> Hal,
>>>>>>>>>
>>>>>>>>> Please allow me some time to take a closer look at this. (A
>>>>>>>>> day
>>>>>>>>> probably).
>>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Great, thanks!
>>>>>>>>
>>>>>>>> -Hal
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sergei
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora
>>>>>>>>> Forum, hosted by The Linux Foundation
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: llvm-commits-bounces at cs.uiuc.edu
>>>>>>>>>> [mailto:llvm-commits-
>>>>>>>>>> bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
>>>>>>>>>> Sent: Sunday, November 18, 2012 11:30 PM
>>>>>>>>>> To: Andrew Trick
>>>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>>>> Subject: Re: [llvm-commits] [PATCH] Use
>>>>>>>>>> GetUnderlyingObjects in
>>>>>>>>>> misched
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: "Andrew Trick" <atrick at apple.com>
>>>>>>>>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>>>>> Sent: Sunday, November 18, 2012 3:08:11 AM
>>>>>>>>>>> Subject: Re: [PATCH] Use GetUnderlyingObjects in misched
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Nov 17, 2012, at 1:59 AM, Hal Finkel <hfinkel at anl.gov>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Andy, et al.,
>>>>>>>>>>>>
>>>>>>>>>>>> misched currently uses GetUnderlyingObject in order to
>>>>>>>>>>>> break
>>>>>>>>>>>> false load/store dependencies, and the
>>>>>>>>>>>> -enable-aa-sched-mi
>>>>>>>>>>>> feature similarly relies on GetUnderlyingObject in order
>>>>>>>>>>>> to
>>>>>>>>>>>> ensure it is safe to use the aliasing analysis.
>>>>>>>>>>>> Unfortunately,
>>>>>>>>>>>> GetUnderlyingObject does not recurse through phi nodes,
>>>>>>>>>>>> and
>>>> so
>>>>>>>>>>>> (especially due to LSR) all of these mechanisms fail for
>>>>>>>>>>>> induction-variable-dependent loads and stores inside
>>>>>>>>>>>> loops.
>>>>>>>>>>>>
>>>>>>>>>>>> The attached patch replaces uses of GetUnderlyingObject
>>>>>>>>>>>> with
>>>>>>>>>>>> GetUnderlyingObjects (which will recurse through phi and
>>>>>>>>>>>> select
>>>>>>>>>>>> instructions) in misched. On in-order cores with a long
>>>>>>>>>>>> pipline depth (such as the PPC A2), this is extremely
>>>>>>>>>>>> important (the loop
>>>>>>>>>> in
>>>>>>>>>>>> the attached test case, for example, shows a ~100%
>>>>>>>>>>>> speedup).
>>>>>>>>>>>> Please review.
>>>>>>>>>>>
>>>>>>>>>>> Thanks Hal. This does look like it fixes a serious
>>>>>>>>>>> handicap.
>>>>>>>>>>> A
>>>>>>>> few
>>>>>>>>>>> comments...
>>>>>>>>>>>
>>>>>>>>>>> +static void getUnderlyingObjects(const Value *V,
>>>>>>>>>>> + SmallVectorImpl<Value *>
>>>>>>>>>>> &Objects)
>>>>>>>>>>> {
>>>>>>>>>>> + SmallVector<const Value *, 4> Working(1, V);
>>>>>>>>>>>
>>>>>>>>>>> You also need a visited set (SmallPtrSet) for two reasons:
>>>>>>>>>>> to
>>>>>>>>>>> terminate phi-ptrint-intptr-phi cycles, and to avoid
>>>>>>>>>>> exponential path exploration in long if-then-else chains.
>>>>>>>>>>
>>>>>>>>>> Good point, thanks!
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> + GetUnderlyingObjects(const_cast<Value *>(V), Objs);
>>>>>>>>>>> +
>>>>>>>>>>> + for (SmallVector<Value *, 4>::iterator I =
>>>>>>>>>>> Objs.begin(),
>>>>>>>>>>> IE
>>>>>>>>>>> =
>>>>>>>>>>> Objs.end();
>>>>>>>>>>> + I != IE; ++I) {
>>>>>>>>>>> + V = *I;
>>>>>>>>>>>
>>>>>>>>>>> You can check if V is visited here before doing anything
>>>>>>>>>>> else.
>>>>>>>>>>
>>>>>>>>>> The result of getUnderlyingObjectFromInt should be checked
>>>>>>>>>> (maybe that's what you meant).
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> + bool Found = false;
>>>>>>>>>>> + for (SmallVectorImpl<Value *>::iterator J =
>>>>>>>>>>> Objects.begin(),
>>>>>>>>>>> + JE = Objects.end(); J != JE; ++J)
>>>>>>>>>>> + if (V == *J) {
>>>>>>>>>>> + Found = true;
>>>>>>>>>>> + break;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + if (!Found)
>>>>>>>>>>>
>>>>>>>>>>> Since only visited values are pushed, you should not need
>>>>>>>>>>> this
>>>>>>>>>>> loop.
>>>>>>>>>>
>>>>>>>>>> Collecting the underlying objects from different IntToPtr
>>>>>>>>>> instructions could still yield duplicate underlying
>>>>>>>>>> objects. On
>>>>>>>>>> the other hand, is it important to insure that the values
>>>>>>>>>> in
>>>> the
>>>>>>>>>> array are unique?
>>>>>>>>>> (does
>>>>>>>>>> the scheduler care about the insertion of duplicate edges?)
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> } else if (MI->mayStore()) {
>>>>>>>>>>> - bool MayAlias = true;
>>>>>>>>>>> - if (const Value *V =
>>>>>>>>>>> getUnderlyingObjectForInstr(MI,
>>>>>>>>>>> MFI,
>>>>>>>>>>> MayAlias)) {
>>>>>>>>>>> + SmallVector<std::pair<const Value *, bool>, 4>
>>>>>>>>>>> Objs;
>>>>>>>>>>> + getUnderlyingObjectsForInstr(MI, MFI, Objs);
>>>>>>>>>>> +
>>>>>>>>>>> + if (Objs.empty()) {
>>>>>>>>>>> + // Treat all other stores conservatively.
>>>>>>>>>>> + goto new_alias_chain;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + for (SmallVector<std::pair<const Value *, bool>,
>>>>>>>>>>> 4>::iterator
>>>>>>>>>>> + K = Objs.begin(), KE = Objs.end(); K != KE;
>>>>>>>>>>> ++K)
>>>>>>>>>>> {
>>>>>>>>>>>
>>>>>>>>>>> For stores, we only need to add edges to pending loads,
>>>>>>>>>>> the
>>>>>>>>>>> AliasChain, and the BarrierChain once, not for every
>>>>>>>>>>> identifiable object.
>>>>>>>>>>
>>>>>>>>>> Makes sense.
>>>>>>>>>>
>>>>>>>>>> I've attached an updated version.
>>>>>>>>>>
>>>>>>>>>> Thanks again,
>>>>>>>>>> Hal
>>>>>>>>>>
>>>>>>>>>>> It looks like you do the right thing for loads already.
>>>>>>>>>>>
>>>>>>>>>>> -Andy
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Hal Finkel
>>>>>>>>>> Postdoctoral Appointee
>>>>>>>>>> Leadership Computing Facility
>>>>>>>>>> Argonne National Laboratory
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Hal Finkel
>>>>>>>> Postdoctoral Appointee
>>>>>>>> Leadership Computing Facility
>>>>>>>> Argonne National Laboratory
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Hal Finkel
>>>>>> Postdoctoral Appointee
>>>>>> Leadership Computing Facility
>>>>>> Argonne National Laboratory
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>
>>>> --
>>>> Hal Finkel
>>>> Postdoctoral Appointee
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>
>>>
>>
>> --
>> Hal Finkel
>> Postdoctoral Appointee
>> Leadership Computing Facility
>> Argonne National Laboratory
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list