[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Wed Jul 13 00:25:52 PDT 2016

On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <chandlerc at gmail.com>
wrote:

> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com> wrote:
>
>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <davidxl at google.com>
>> wrote:
>>
>>>
>>>
>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <chandlerc at gmail.com>
>>> wrote:
>>>
>>>> Yea, this is a nasty problem.
>>>>
>>>> One important thing to understand is that this is specific to analyses
>>>> which hold references to other analyses. While this isn't unheard of, it
>>>> isn't as common as it could be. Still, definitely something we need to
>>>> address.
>>>>
>>>
>>> We can call this type of dependency (holding references) a
>>> hard dependency. A soft dependency is the case where analysis 'A'
>>> depends on 'B' during computation, but does not need 'B' once it is
>>> computed.
>>>
>>> There are actually quite a few examples of the hard-dependency case, for
>>> instance LoopAccessInfo, LazyValueInfo, etc., which hold references to
>>> other analyses.
>>>
>>> Problems involving hard dependencies are actually easier to detect, as
>>> they are usually compile-time problems. Issues involving soft dependencies
>>> are more subtle and can lead to wrong code gen.
>>>
>>
>> Did you mean to say that soft-dependency problems are easier to detect?
>> At least my intuition is that soft-dependency is easier because there is no
>> risk of dangling pointers to other analyses.
>>
>
> The issue is that the fact that there is *any* dependency isn't clear.
>
> However, I think the only real problem here is these "hard dependencies"
> (I don't really like that term though). For others, only an analysis that
> is *explicitly* preserved survives. So I'm not worried about the fact that
> people have to remember this.
>
> The question is how often there are cross-data-structure references. David
> mentions a few examples, and I'm sure there are more, but it isn't clear to
> me yet whether this is pervasive or occasional.
>

I just did a quick run-through of PassRegistry.def and this is what I found:

Module analyses: 0/5 hold pointers to other analyses
CallGraph: No pointers to other analyses.
LazyCallGraph: No pointers to other analyses.
ProfileSummaryAnalysis: No pointers to other analyses.
TargetLibraryAnalysis: No pointers to other analyses.
VerifierAnalysis: No pointers to other analyses.

Module alias analyses: 1/1 keeps pointer to other analysis.
GlobalsAA: Result keeps pointer to TLI (this is a function analysis).

Function analyses: 9/17 keep pointers to other analyses
AAManager: Its Result holds TLI pointer and pointers to individual AA
result objects.
AssumptionAnalysis: No pointers to other analyses.
BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and BPI.
BranchProbabilityAnalysis: Stores no pointers to other analyses. (uses
LoopInfo to "recalculate" though)
DominatorTreeAnalysis: Stores no pointers to other analyses.
PostDominatorTreeAnalysis: Stores no pointers to other analyses.
DemandedBitsAnalysis: Stores pointers to AssumptionCache and DominatorTree
DominanceFrontierAnalysis: Stores no pointers to other analyses.
(uses DominatorTreeAnalysis for "recalculate" though).
LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores no
pointers.
LazyValueAnalysis: Stores pointers to AssumptionCache, TargetLibraryInfo,
DominatorTree.
DependenceAnalysis: Stores pointers to AliasAnalysis, ScalarEvolution,
LoopInfo
MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,
AssumptionCache, TargetLibraryInfo, DominatorTree
MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree
RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree, DomFrontier
ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,
AssumptionCache, DominatorTree, LoopInfo
TargetLibraryAnalysis: Has no dependencies
TargetIRAnalysis: Has no dependencies.

Function alias analyses: 3/5 keep pointers to other analyses
BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache,
DominatorTree, LoopInfo
CFLAA: Keeps pointer to TargetLibraryInfo
SCEVAA: Keeps pointer to ScalarEvolution
ScopedNoAliasAA: No dependencies
TypeBasedAA: No dependencies

Total: 13/28 analyses (~46%) hold pointers to other analyses.
Of the 15/28 analyses that don't hold pointers, 12/15 simply have no
dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have
dependencies that are used just for a "recalculate" step that retains no
pointers.
So I think it is fair to say that analyses which hold pointers to other
analyses are not an exceptional case. In fact, analyses that use other
analyses just for a "recalculate" step seem to be the exceptional case
(only 3/28, or about 11%).
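
To make the two kinds of dependency concrete, here is a toy sketch. It
assumes nothing about the real pass manager: ToyDomTree, ToyLoopInfo,
ToyDemandedBits and ToyManager are hypothetical stand-ins, and the string
keys "DT", "LI" and "DB" are made up for the illustration. It only shows why
a result that stores a pointer has to be dropped together with the thing it
points at, while a "recalculate"-style result does not.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Toy stand-ins only; none of these are real LLVM classes.
struct ToyDomTree {};

// "Recalculate"-style dependency: the tree is read while building the
// result, but no pointer to it is stored afterwards.
struct ToyLoopInfo {
  explicit ToyLoopInfo(const ToyDomTree &) {}
};

// Pointer-holding dependency: the result keeps a handle to the tree.
struct ToyDemandedBits {
  const ToyDomTree *DT;
  explicit ToyDemandedBits(const ToyDomTree &Tree) : DT(&Tree) {}
};

struct ToyManager {
  std::unique_ptr<ToyDomTree> DT;
  std::unique_ptr<ToyLoopInfo> LI;
  std::unique_ptr<ToyDemandedBits> DB;

  void computeAll() {
    DT.reset(new ToyDomTree());
    LI.reset(new ToyLoopInfo(*DT));
    DB.reset(new ToyDemandedBits(*DT));
  }

  // True if no cached result points at a tree that is gone.
  bool handlesValid() const { return !DB || (DT && DB->DT == DT.get()); }

  // Drop every cached result that is not named in Preserved.
  void invalidate(const std::set<std::string> &Preserved) {
    bool KeepDT = Preserved.count("DT") != 0;
    bool KeepLI = Preserved.count("LI") != 0;
    // DB may only survive if the tree it points at survives too.
    bool KeepDB = Preserved.count("DB") != 0 && KeepDT;
    if (!KeepDB) DB.reset();
    if (!KeepLI) LI.reset();
    if (!KeepDT) DT.reset();
    assert(handlesValid() && "a cached result holds a dangling pointer");
  }
};
```

In this toy, a pass that claims to preserve only "LI" and "DB" still loses
DB, because the tree DB points at was not preserved. LI survives without a
memory-safety problem, although whether it is still *meaningful* without its
dominator tree is a separate question.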

Since I like to visualize things, here is a quick graph of the dependencies
between analyses which hold pointers to each other.
Edge A -> B indicates "A can/does hold a pointer to B".
dot file: http://reviews.llvm.org/P6603
rendering: http://reviews.llvm.org/F2161058
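
For readers without access to the dot file, the survey above can also be
written down as an edge list and queried. The following is a rough sketch,
not a transcription of P6603: the node names are shortened by hand, the
"AliasAnalysis" entries in the list are assumed to mean AAManager's result,
and the AAManager edges to the individual AA results are "can hold" edges
that depend on how the pipeline is set up. The function names are made up.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using Graph = std::map<std::string, std::vector<std::string>>;

// Holder -> results it can hold a pointer to, as read off the list above.
inline const Graph &pointerEdges() {
  static const Graph G = {
      {"GlobalsAA", {"TargetLibraryInfo"}},
      {"AAManager",
       {"TargetLibraryInfo", "BasicAA", "CFLAA", "SCEVAA", "ScopedNoAliasAA",
        "TypeBasedAA", "GlobalsAA"}},
      {"BlockFrequency", {"LoopInfo", "BranchProbability"}},
      {"DemandedBits", {"AssumptionCache", "DominatorTree"}},
      {"LazyValue", {"AssumptionCache", "TargetLibraryInfo", "DominatorTree"}},
      {"Dependence", {"AAManager", "ScalarEvolution", "LoopInfo"}},
      {"MemoryDependence",
       {"AAManager", "AssumptionCache", "TargetLibraryInfo", "DominatorTree"}},
      {"MemorySSA", {"AAManager", "DominatorTree"}},
      {"RegionInfo",
       {"DominatorTree", "PostDominatorTree", "DominanceFrontier"}},
      {"ScalarEvolution",
       {"TargetLibraryInfo", "AssumptionCache", "DominatorTree", "LoopInfo"}},
      {"BasicAA",
       {"TargetLibraryInfo", "AssumptionCache", "DominatorTree", "LoopInfo"}},
      {"CFLAA", {"TargetLibraryInfo"}},
      {"SCEVAA", {"ScalarEvolution"}},
  };
  return G;
}

// Every result that directly or transitively holds a pointer to
// Invalidated, i.e. everything that would be left with a stale handle.
inline std::set<std::string> transitiveHolders(const std::string &Invalidated) {
  // Reverse the edges: target -> list of holders.
  Graph HeldBy;
  for (const auto &Entry : pointerEdges())
    for (const auto &Target : Entry.second)
      HeldBy[Target].push_back(Entry.first);

  std::set<std::string> Result;
  std::vector<std::string> Worklist{Invalidated};
  while (!Worklist.empty()) {
    std::string Cur = Worklist.back();
    Worklist.pop_back();
    auto It = HeldBy.find(Cur);
    if (It == HeldBy.end())
      continue;
    for (const auto &Holder : It->second)
      if (Result.insert(Holder).second) // first time we see this holder
        Worklist.push_back(Holder);
  }
  return Result;
}
```

Under these edges, transitiveHolders("ScalarEvolution") comes out as
SCEVAA, AAManager, Dependence, MemoryDependence and MemorySSA, which gives
a feel for how far a single invalidation can reach once the AAManager edges
are counted.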

(I've been a bit loose with terminology here. Often when I say "analysis"
I mean "the analysis's result object", and I use the name of an analysis
interchangeably with its result object.)

-- Sean Silva

>
> And even then it isn't clear how onerous explicitly managing this in
> invalidate overrides will be.
>
>
>>
>> -- Sean Silva
>>
>>
>>>
>>> David
>>>
>>>
>>>
>>>>
>>>> Some ideas about mitigating and fixing it below.
>>>>
>>>> On Tue, Jul 12, 2016 at 6:15 PM Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>> How should we solve this? I see two potential solutions:
>>>>> 1. Analyses must somehow list the analyses they depend on (either by
>>>>> overriding "invalidate" to make sure that they invalidate them, or
>>>>> something "declarative" that would allow the AnalysisManager to walk the
>>>>> transitive dependencies).
>>>>>
>>>>
>>>> I think this is the right approach. I would personally start by
>>>> overriding the invalidate callback everywhere that it is necessary, and see
>>>> how bad that becomes.
>>>>
>>>> If it becomes common and burdensome, then we can change the way
>>>> invalidation works such that the analysis manager is aware of the preserved
>>>> analysis set in more detail, and have it build up the necessary data
>>>> structures to know in-advance whether it must make an explicit invalidate
>>>> call.
>>>>
>>>> However, I suspect this may not be *too* bad for two reasons:
>>>>
>>>> a) As I mentioned above, I'm hoping there aren't *too* many handles
>>>> between different analyses. But I've not done a careful examination, so we
>>>> can check this.
>>>>
>>>> b) For many analyses that might trigger this, I think we have a simpler
>>>> option. If the analysis is *immutable* for any reason -- that is, it
>>>> overrides its invalidate routine to always return "false" the way
>>>> TargetLibraryInfo should (although I'm not sure it does currently), we
>>>> shouldn't need to do this as it shouldn't be getting cleared out. Does this
>>>> make sense? Do others see anything I'm missing with that approach?
>>>>
>>>> 2. The AnalysisManager must do a somewhat complicated dance to track
>>>>> when analyses call back into it in order to get other analyses.
>>>>>
>>>>
>>>> I would really rather avoid this, as currently the analysis manager's
>>>> logic here is very simple, and in many cases we only need the analyses to
>>>> *compute* our result, not to embed it. I'm thinking of stuff like how
>>>> Dominators is used to build LoopInfo, but there isn't a stale handle there.
>>>>
>>>>
>>>>
>>>> There is another aspect of course in that if something is preserving
>>>> LoopInfo, it really should be preserving Dominators too...
>>>>
>>>
>>>