[cfe-dev] Dataflow analysis with LLVM/Clang
Ted Kremenek
kremenek at apple.com
Wed Oct 1 13:19:20 PDT 2008
On Oct 1, 2008, at 11:47 AM, Mike Stump wrote:
> On Oct 1, 2008, at 6:52 AM, João Paulo Rechi Vita wrote:
>> I'm working on an MSc project and I need to detect conflicting
>> actions between different threads in a program, through static
>> analysis.
>
> My take: if you want to use clang's Analysis engine
> (include/Analysis), you can't avoid using clang. It isn't clear to me
> from your description whether you need it, however. If all you want
> to do is insert code and do some trivial analysis, and llvm bitcode
> contains everything you need to do your work, then you'd probably
> want to just stick with llvm.
>
> If you gave an example of the most complex reasoning you want to
> perform, that might help us tell you what part of llvm/clang can help
> the most.

I think Mike's comments are pretty much spot on. To me this really
amounts to listing out your requirements and what you are trying to
accomplish. At a high level, it sounds like what you want to do is
program transformation. If the goal of the transformation is to
change runtime behavior, then you can perform the transformation
either at the LLVM IR level or by rewriting source code using Clang.
If the goal is to modify the original source code so that users end up
working with an instrumented source file, then obviously this has to
be done using Clang.

I'm going to assume that your goal is simply to modify runtime
behavior. If that is the case, my gut feeling is that it is better to
do it at the LLVM IR level, provided you don't require any specific
knowledge about C. The lowered representation of the LLVM IR
abstracts away details of the high-level language that may be
superfluous for your task; C is a "rich" language with many
constructs, so an analysis at the source level would have to reason
about many edge cases. There are many other tradeoffs that we can go
into if you are interested.

The other thing to keep in mind is how the concurrency primitives
whose uses you want to monitor are represented in both Clang's AST
and the LLVM IR. If you can easily identify uses of such primitives
at the LLVM IR level, then doing your transformations there makes the
most sense to me (given what I know about what you are trying to do).
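
As a rough illustration of what "identifying the primitives at the
LLVM IR level" could look like, here is a minimal Python sketch that
scans textual LLVM IR (as produced by llvm-dis or clang -S -emit-llvm)
for direct calls to a few pthread functions. The set of primitive
names, the helper find_primitive_calls, and the sample IR are all
assumptions made up for this example. A real tool would more likely be
an LLVM pass that walks call instructions, and a text scan like this
one cannot see indirect calls through function pointers.

```python
import re

# Hypothetical set of primitives to monitor; adjust to the threading API in use.
PRIMITIVES = {
    "pthread_create",
    "pthread_join",
    "pthread_mutex_lock",
    "pthread_mutex_unlock",
}

# Start of a function definition, capturing the function name.
DEFINE_RE = re.compile(r"^define\b.*?@([\w.$]+)\s*\(")
# A direct call or invoke, capturing the callee name.
CALL_RE = re.compile(r"\b(?:call|invoke)\b.*?@([\w.$]+)\s*\(")


def find_primitive_calls(ir_text, primitives=PRIMITIVES):
    """Return (enclosing_function, callee) pairs for direct calls to primitives."""
    hits = []
    current = None  # name of the function body we are currently inside
    for line in ir_text.splitlines():
        stripped = line.strip()
        m = DEFINE_RE.match(stripped)
        if m:
            current = m.group(1)
            continue
        if stripped == "}":
            current = None
            continue
        m = CALL_RE.search(stripped)
        if m and m.group(1) in primitives:
            hits.append((current, m.group(1)))
    return hits


# Hand-written sample IR, for illustration only.
SAMPLE_IR = """
@m = global i64 0

define i32 @worker(ptr %arg) {
entry:
  %0 = call i32 @pthread_mutex_lock(ptr @m)
  %1 = call i32 @helper(i32 1)
  %2 = call i32 @pthread_mutex_unlock(ptr @m)
  ret i32 0
}

define i32 @main() {
entry:
  %t = alloca i64
  %0 = call i32 @pthread_create(ptr %t, ptr null, ptr @worker, ptr null)
  ret i32 0
}
"""

if __name__ == "__main__":
    for function, callee in find_primitive_calls(SAMPLE_IR):
        print(f"{function}: calls {callee}")
```

If the primitives show up this plainly as named calls, that is a hint
that the IR level is workable; if they are hidden behind macros,
inline assembly, or indirect calls, more work is needed.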

I'm not 100% certain how you wanted to use line information. Clang
certainly has rich information about the locations of expressions
within a source file, but LLVM IR can also carry some debugging
information that may be useful for reconstructing the line
information you need (others can chime in here, since I'm not an
expert on this topic). It all depends on what you are trying to do.
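
To make the debug-information route a bit more concrete, here is a
small Python sketch that recovers source lines from textual LLVM IR.
It assumes the metadata syntax printed by current LLVM releases, where
an instruction carries a "!dbg !N" attachment and node !N is a
!DILocation with a line field; older releases encode this differently.
The helper source_lines and the abbreviated sample IR are hypothetical
and exist only for illustration.

```python
import re

# A metadata node such as: !12 = !DILocation(line: 7, column: 3, scope: !8)
DILOC_RE = re.compile(r"^!(\d+)\s*=\s*!DILocation\(line:\s*(\d+)")
# A debug attachment on an instruction, such as: , !dbg !12
DBG_RE = re.compile(r"!dbg\s+!(\d+)")


def source_lines(ir_text):
    """Return (instruction_text, source_line) pairs for instructions with !dbg."""
    # First pass: collect the line number stored in each !DILocation node.
    locations = {}
    for line in ir_text.splitlines():
        m = DILOC_RE.match(line.strip())
        if m:
            locations[m.group(1)] = int(m.group(2))

    # Second pass: resolve each !dbg attachment against the collected nodes.
    result = []
    for line in ir_text.splitlines():
        m = DBG_RE.search(line)
        if m and m.group(1) in locations:
            instruction = line.split(", !dbg")[0].strip()
            result.append((instruction, locations[m.group(1)]))
    return result


# Hand-written, abbreviated sample IR with debug metadata (illustration only).
SAMPLE_IR_DEBUG = """
define i32 @worker(ptr %arg) !dbg !8 {
entry:
  %0 = call i32 @pthread_mutex_lock(ptr @m), !dbg !12
  ret i32 0, !dbg !13
}

!8 = distinct !DISubprogram(name: "worker", line: 5)
!12 = !DILocation(line: 7, column: 3, scope: !8)
!13 = !DILocation(line: 9, column: 3, scope: !8)
"""

if __name__ == "__main__":
    for instruction, line_no in source_lines(SAMPLE_IR_DEBUG):
        print(f"line {line_no}: {instruction}")
```

Attachments that point at nodes other than !DILocation (for example
the !DISubprogram on the define line) are skipped, so only
per-instruction locations are reported.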