[LLVMdev] ScheduleDAGInstrs.cpp
Jonas Paulsson
jonas.paulsson at ericsson.com
Fri Dec 19 06:51:07 PST 2014
Hi,
I write again regarding buildSchedGraph(), as I am still not happy about things there.
I have found at least two examples which do not work out:
1)
SU(2) Store "Value A"
SU(1) Store "Value A"
SU(0) Load "Value A"
If MIsNeedChainEdge() returns false for SU(0) and SU(1), SU(0) is inserted into RejectedMemNodes and removed from its MemUses SU list, as this list is cleared. Therefore SU(2) must be handled with adjustChainDeps(), because it needs an edge from SU(0).
For some reason adjustChainDeps() was only called for may-aliasing stores. I think this is wrong, as a store will clear the MemUses SU list also in the non-aliasing case.
If MIsNeedChainEdge() returns true for SU(0) and SU(1), what happens if MIsNeedChainEdge() returns false between SU(2) and SU(1)? The dependency against SU(0) will not be checked, because it is not in RejectMemNodes, nor in the MemUses SU list.
2)
SU(2) Load "Value A"
SU(1) Store "Value A"
SU(0) Store "Value A"
If not using alias analysis, then the MemDefs list is cleared after *maybe* having inserted an edge from SU(0) to SU(1) with addChainDependency(). If there was not an edge inserted, then SU(2) must get a chance to check its deps against SU(0) by use of adjustChainDeps(). Again, this is only done if MayAlias is true.
I suspect therefore that areMemAccessesTriviallyDisjoint() is currently used in a very dangerous way, since it might get lucky in just one of two (equivalent) queries, and a dependency might be missed later on as seen above.
Or am I missing something here?
Senjin, I am working on an out-of-tree target. The problem I had was the first scenario in #1, and I fixed it by moving the adjustChainDeps() call to be called also in the non-aliasing case.
Another question I have is regarding adjustChainDeps(). If MIsNeedChainEdge() returns true, is it still then always necessary to continue recursively with successors? If this is a latency issue, perhaps it would be ok to do "continue" after adding the pred if the latency is non-zero? Right now this seems to be always zero, or one in the store-load case per the comment on line 856.
On line 947: "A store to a specific PseudoSourceValue" should probably say "... a specific Value / PseudoSourceValue", right? The same goes for line 1029.
/Jonas
More information about the llvm-dev
mailing list