[cfe-dev] cfe-dev Digest, Vol 47, Issue 60
Peter Lawrence
peterl95124 at sbcglobal.net
Fri May 20 09:57:39 PDT 2011
Ted,
very interesting...
when I worked at Sun Microsystems, I invented a generalization of the
traditional iterative-dataflow-analysis
algorithm that might provide a useful way to select within the memory-
versus-accuracy tradeoff space.
-Peter Lawrence.
On May 19, 2011, at 12:28 PM, cfe-dev-request at cs.uiuc.edu wrote:
>
> Hi Lei,
>
> Running out of memory is a real problem that needs a general
> solution. I have been mulling this over myself, and have a few
> ideas that match with #2 and #3.
>
> Before I get into those, I think it is worth pondering the fact
> that unless we force the path exploration to combine paths, there
> will always be cases where we don't exhaustively explore all
> paths. For example, consider the kind of dataflow analysis done in
> many compiler analyses: paths are always merged at confluence
> points, guaranteeing convergence and bounding memory usage. The
> path-sensitive analysis doesn't force this kind of convergence
> (although paths converge when the states are the same), so if you
> are interested in forcing convergence I think we'll need something
> else in the analyzer that isn't there right now.
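The confluence-point merging described above can be sketched as a classic worklist fixpoint, where memory stays bounded at one state per basic block (a toy model, not analyzer code; the CFG, `gen` sets, and transfer function are all invented for illustration):

```python
def iterate_to_fixpoint(succs, preds, transfer):
    """Classic iterative dataflow: one state per block, joined at confluences."""
    states = {b: frozenset() for b in succs}
    worklist = list(succs)
    while worklist:
        b = worklist.pop()
        # Confluence point: merge every predecessor's out-state into one in-state.
        merged = frozenset().union(*(states[p] for p in preds[b]))
        new = transfer(b, merged)
        if new != states[b]:
            states[b] = new
            worklist.extend(succs[b])  # revisit successors until a fixpoint
    return states

# A diamond CFG: both arms merge at "j", so only four states ever exist,
# no matter how many paths run through the diamond.
succs = {"entry": ["a", "b"], "a": ["j"], "b": ["j"], "j": []}
preds = {"entry": [], "a": ["entry"], "b": ["entry"], "j": ["a", "b"]}
gen = {"entry": {"x"}, "a": {"y"}, "b": {"z"}, "j": set()}
out = iterate_to_fixpoint(succs, preds, lambda b, s: s | gen[b])
```

The join (set union here) is what guarantees convergence and bounded memory, and also what discards the per-path distinctions the path-sensitive engine keeps.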
>
> Let's talk about #2. Essentially, the worklist algorithm (which
> currently is either BFS or DFS) is all about prioritizing what part
> of the state space gets explored first. My goal is that we
> eventually have other intelligent algorithms besides BFS and DFS,
> that allow us to exploit even more information. Thus, switching
> between BFS and DFS to solve this problem isn't the answer.
> Switching between those should help if, say, you want to bound the
> amount of work done by the analyzer (which we do) and want some
> types of paths to be prioritized over others.
> This may help you find different kinds of bugs more effectively.
> For "all paths" checkers, like yours, BFS will tend to cover paths
> that end sooner, but it won't explore paths that require going
> deeper into the state space (like DFS does). This will directly
> impact the results you get from your checker, but those are just
> tradeoffs in the results. It doesn't actually solve the
> scalability problem. If the number of paths to explore is too
> costly to enumerate, we are
> going to run into memory problems regardless. Put another way, DFS
> and BFS are just enumerating paths in a different order. The cost
> to enumerate all of them is the same.
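That last point can be seen in a toy worklist (a hedged sketch; the binary-branching "program" and all names are invented): BFS and DFS pop from opposite ends of the same deque, visit states in a different order, but enumerate exactly the same paths at the same total cost:

```python
from collections import deque

def explore(start, successors, strategy="bfs"):
    """Enumerate all reachable states; only the visit order depends on strategy."""
    worklist = deque([start])
    visited = []
    while worklist:
        # BFS pops the oldest entry, DFS the newest -- same deque either way.
        node = worklist.popleft() if strategy == "bfs" else worklist.pop()
        visited.append(node)
        worklist.extend(successors(node))
    return visited

def successors(path):
    # A path is the tuple of branches taken; fork at each of 3 branch points.
    return [path + (0,), path + (1,)] if len(path) < 3 else []

bfs = explore((), successors, "bfs")
dfs = explore((), successors, "dfs")
```

Both runs visit the same 15 states; neither makes the state space any smaller.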
>
> To tackle memory scalability, the direction in which I have been
> pushing the analyzer is to have the ability to gradually reclaim
> ExplodedNodes (and states) as the graph grows larger. It used to
> be that we would just generate GRStates, many of which would just get
> thrown away and not included in ExplodedNodes, and not reclaim that
> memory until we reclaimed all the memory for an analysis. Now
> GRStates are reference counted, so we can reclaim some of the
> memory more eagerly. To take this further, we can possibly prune
> out parts of the ExplodedGraph as it grows if we can deem those
> parts "uninteresting." I think this is very doable, but it's a lot
> trickier than it sounds, and it requires consulting which
> ExplodedNodes are currently referenced by BugReports. If we can
> identify paths that are neither interesting nor deemed
> critical by a checker, then we can potentially throw away those
> nodes. Note that when we do this, we are throwing away analysis
> state. If we throw
> away a path, only to explore another path that would have
> merged with the discarded path, we are going to end up analyzing
> the same path again. That's an inherent time-space tradeoff.
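The reclamation idea can be sketched very roughly as follows (purely illustrative: `referenced_by_reports` and `is_interesting` stand in for the BugReport and checker hooks, and the real ExplodedGraph is a graph, not a flat list of nodes):

```python
def prune(path, referenced_by_reports, is_interesting):
    """Drop interior nodes that no BugReport references and no checker marked."""
    return [n for i, n in enumerate(path)
            if i == 0 or i == len(path) - 1      # keep the path's endpoints
            or n in referenced_by_reports        # a live BugReport points here
            or is_interesting(n)]                # a checker deemed it critical

# Nodes "b" and "d" carry no report reference and no checker interest,
# so their memory is reclaimable; the path's shape survives.
kept = prune(list("abcdef"), referenced_by_reports={"c"},
             is_interesting=lambda n: n == "e")
```

The time-space tradeoff above falls out directly: a pruned node's state is gone, so re-reaching it means recomputing it.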
>
> So, I think the solution to your problem requires a combination of
> several approaches:
>
> a) Make path analysis more scalable in general by more eagerly
> reclaiming memory when the ExplodedGraph starts to get large. This
> is essentially a "garbage collection" of uninteresting paths.
>
> b) Because (a) doesn't completely "solve" the path explosion
> problem (i.e., it doesn't guarantee that all paths are explored),
> if exploring all paths matters to you we will need to devise a
> strategy where we can (optionally) merge paths. This requires
> being able to define a merge operation for everything in GRState,
> including checker-specific state. It will then be up to some
> policy to decide when the merge takes place, as merging too
> frequently will cause us to discard too much information (i.e., it
> basically reduces to a flow-sensitive analysis).
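A hedged sketch of what such a merge operation might look like (every name here is invented, and GRState's real representation is an immutable map, not a Python dict): each checker supplies a join for its own entries, and a fact established on only one path must widen to "unknown" rather than survive the merge:

```python
def merge_states(s1, s2, checker_joins):
    """Join two per-path states under checker-provided merge operations."""
    merged = {}
    for key in s1.keys() | s2.keys():
        if key in s1 and key in s2:
            merged[key] = checker_joins[key](s1[key], s2[key])
        else:
            # Only one path established this fact; the merge cannot keep it.
            merged[key] = "unknown"
    return merged

# Toy checker tracking a file descriptor's status along each path.
fd_join = lambda a, b: a if a == b else "maybe-closed"
m = merge_states({"fd": "open"}, {"fd": "closed"}, {"fd": fd_join})
```

Merging too eagerly is exactly the "reduces to flow-sensitive analysis" point: every join like `fd_join` above trades path distinctions for bounded state.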
>
> Ted