[LLVMdev] [PATCH] Call adjustChainDeps() always when handling a store.

Thu Jan 8 07:01:18 PST 2015

Hi,

here is a patch on a problem I described in my previous mail (number 1) for your review. In short, I think adjustChainDeps() must be called regardless of MayAlias while handling a store.

This problem was found on an out-of-tree target, and no test case can be provided for an official target.

Let me know if I can commit this patch, or if you have any comments.

/Jonas

PS. Sanjin, regarding TII->areMemAccessesTriviallyDisjoint(), my target does the same things as you mentioned: register / offset based analysis, both pre/post RA. It is adding very little
improvement over AA, it seems. I don't have any failing test cases right now, so it may be that this patch actually fixed my problem. However, right now it is confusing to have the whole
algorithm written around memory operands and at the same time allow register / offsets analyzis in the midst of things. It would be good to rewrite that part like you explained. What are
your current thoughts about this?

-----Original Message-----
From: Hal Finkel [mailto:hfinkel at anl.gov] 
Sent: den 23 december 2014 18:11
To: Sanjin Sijaric
Cc: Mattias Eriksson V; llvmdev at cs.uiuc.edu; Tom Stellard; Sergei Larin; Jonas Paulsson; Andrew Trick
Subject: Re: ScheduleDAGInstrs.cpp

----- Original Message -----
> From: "Sanjin Sijaric" <ssijaric at codeaurora.org>
> To: "Jonas Paulsson" <jonas.paulsson at ericsson.com>, "Andrew Trick" 
> <atrick at apple.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Mattias Eriksson V" 
> <mattias.v.eriksson at ericsson.com>, llvmdev at cs.uiuc.edu, "Tom Stellard" 
> <thomas.stellard at amd.com>, "Sergei Larin" <slarin at codeaurora.org>
> Sent: Friday, December 19, 2014 1:51:44 PM
> Subject: RE: ScheduleDAGInstrs.cpp
> 
> Hi Jonas,
> 
> How is your target implementing areMemAccessesTriviallyDisjoint?  The 
> callback is there so that we don't get into the situation where the 
> call to isIdentifiedObject (which is called from isUnsafeMemoryObject 
> from MIsNeedChainEdge) results in the edge being added between two 
> memory locations that the target can easily prove are different (e.g 
> based on base register + index + offset, etc).
> 
> Are you seeing the problem during pre-RA scheduling or post-RA 
> scheduling?
> 
> I think we may be able to get rid of areMemAccessesTriviallyDisjoint, 
> and let the AA code in MIsNeedChainEdgee handle it.  This works for me 
> if I don't call isIdentifiedObject from isUnsafeMemoryObject if we are 
> using AA.  Currently, the aliasing code won't get a chance to run in 
> some cases as isUnsafeMemoryObject returns true, resulting in
> the edge being added.   Maybe someone can shed some light as to why
> the call to isIdentifedMemoryObject in isUnsafeMemoryObject is needed 
> if we are using AA.

I think that it might not be needed. Here's my thinking:

There is a call to getUnderlyingObjectsForInstr, and that needs to be there because of how the calling code groups the resulting SUs by underlying object. Specifically, ScheduleDAGInstrs::buildSchedGraph first gets the underlying objects for a value, and then only calls addChainDependency on SUs that share one of these underlying objects. The reason why we insist on having an identified underlying object is so that we know we're not being non-uniformly limited by the depth constraint. I mean the following. Assume I have two globals (@a and @b), and I have a third value formed using a combination of them: %c = select i1 %something, @a, @b -- so I have

 SU(0): underlying objects: @a
 SU(1): underlying objects: @b
 SU(2): underlying objects: @a and @b

now imagine that our getUnderlyingObjectsForInstr had a trivial depth limit, and we did not check that the underlying objects returned were identified objects, then we might have:

 SU(0): underlying objects: @a
 SU(1): underlying objects: @b
 SU(2): underlying objects: %c

and we might never give AA the chance to conclude that %c and @a or @b might alias (because addChainDependency might never be called on SU(0) <-> SU(2) because @a != %c).

To prevent this problem, we clear the underlying objects list in getUnderlyingObjectsForInstr if any of the underlying objects are not identified (so that we can fall back to the conservative behavior).

Now in isUnsafeMemoryObject we have the same check, so that when addChainDependency is called on an SU with a cleared underlying objects list (which we did specifically to get conservative behavior), we always get that conservative behavior. However, this check seems unnecessary (except, perhaps, as noted in the next paragraph). When not using AA, by the time we get to MIsNeedChainEdge, we're already destined to return true from MIsNeedChainEdge (except for load/load edges). When using AA, we might not return true (depending on what AA returns), but the AA check is specific to the two instructions being queried, and AA understands how to handle selects/phis/etc. in a reasonable way.

Nevertheless, I'm somewhat worried about a subtlety here: I believe that we cannot fail to addChainDependency on the AliasChain because AA is smarter about underlying objects because the AliasChain actually can represent many other dependencies. So maybe when we call addChainDependency with SU on the AliasChain, we need to always pass nullptr for AA.

 -Hal

> 
> Thanks,
> Sanjin
> 
> -----Original Message-----
> From: Jonas Paulsson [mailto:jonas.paulsson at ericsson.com]
> Sent: Friday, December 19, 2014 6:51 AM
> To: Andrew Trick
> Cc: Hal Finkel; Mattias Eriksson V; llvmdev at cs.uiuc.edu; Sanjin 
> Sijaric; Tom Stellard; Sergei Larin
> Subject: ScheduleDAGInstrs.cpp
> 
> Hi,
> 
> I write again regarding buildSchedGraph(), as I am still not happy 
> about things there.
> 
> I have found at least two examples which do not work out:
> 
> 1)
> 
> SU(2) Store "Value A"
> 
> SU(1) Store "Value A"
> 
> SU(0) Load  "Value A"
> 
> If MIsNeedChainEdge() returns false for SU(0) and SU(1), SU(0) is 
> inserted into RejectedMemNodes and removed from its MemUses SU list, 
> as this list is cleared. Therefore SU(2) must be handled with 
> adjustChainDeps(), because it needs an edge from SU(0).
> For some reason adjustChainDeps() was only called for may-aliasing 
> stores. I think this is wrong, as a store will clear the MemUses SU 
> list also in the non-aliasing case.
> 
> If MIsNeedChainEdge() returns true for SU(0) and SU(1), what happens 
> if MIsNeedChainEdge() returns false between SU(2) and SU(1)? The 
> dependency against SU(0) will not be checked, because it is not in 
> RejectMemNodes, nor in the MemUses SU list.
> 
> 2)
> 
> SU(2) Load  "Value A"
> 
> SU(1) Store "Value A"
> 
> SU(0) Store "Value A"
> 
> If not using alias analysis, then the MemDefs list is cleared after
> *maybe* having inserted an edge from SU(0) to SU(1) with 
> addChainDependency(). If there was not an edge inserted, then SU(2) 
> must get a chance to check its deps against SU(0) by use of 
> adjustChainDeps(). Again, this is only done if MayAlias is true.
> 
> I suspect therefore that areMemAccessesTriviallyDisjoint() is 
> currently used in a very dangerous way, since it might get lucky in 
> just one of two (equivalent) queries, and a dependency might be missed 
> later on as seen above.
> 
> Or am I missing something here?
> 
> Senjin, I am working on an out-of-tree target. The problem I had was 
> the first scenario in #1, and I fixed it by moving the
> adjustChainDeps() call to be called also in the non-aliasing case.
> 
> Another question I have is regarding adjustChainDeps(). If
> MIsNeedChainEdge() returns true, is it still then always necessary to 
> continue recursively with successors? If this is a latency issue, 
> perhaps it would be ok to do "continue" after adding the pred if the 
> latency is non-zero? Right now this seems to be always zero, or one in 
> the store-load case per the comment on line 856.
> 
> On line 947: "A store to a specific PseudoSourceValue" should probably 
> say "... a specific Value / PseudoSourceValue", right? The same goes 
> for line 1029.
> 
> /Jonas
> 
> 
> 
> 

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Call-adjustChainDeps-always-when-handling-a-store.patch
Type: application/octet-stream
Size: 1840 bytes
Desc: 0001-Call-adjustChainDeps-always-when-handling-a-store.patch
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150108/ee1e9d54/attachment.obj>