[llvm-dev] pass invalidation

Wed Jun 22 09:47:21 PDT 2016

On 6/22/16 1:02 AM, Yuxi Chen wrote:
> Hi Prof. John Criswell,
>
> Really appreciate your detailed reply.
> Yes, I am using LLVM to analyse C code for my research. I am quite new 
> to LLVM and Clang.

Just to check, have you read the "How to Write an LLVM Pass" document on 
the LLVM web page?

> I still have several questions.
> 1. To my understanding, if we add a pass (like LoopInfo) in the 
> getAnalysisUsage() method, then every time runOnFunction() is invoked 
> (for a function pass), LLVM automatically loads the result of LoopInfo, 
> right? But when is runOnFunction() invoked? Is it in the constructor?

A pass is a C++ object.  For a FunctionPass, the pass manager will 
create the pass object once and then call its runOnFunction() method for 
every function in the program.  Additionally, if a ModulePass requires a 
FunctionPass, then the pass manager will call the runOnFunction() method 
on any function that the ModulePass requests.
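
As a rough illustration of that lifecycle, here is a plain C++ sketch 
with no LLVM headers.  ToyFunction, ToyModule, VisitRecorderPass and 
runToyPipeline are hypothetical stand-ins invented for this example, not 
LLVM classes; the real pass manager does considerably more.  The point 
is only that the pass object is constructed once and that 
runOnFunction() is called by the manager's loop, never by the 
constructor.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm::Function and llvm::Module.
struct ToyFunction { std::string name; };
struct ToyModule { std::vector<ToyFunction> functions; };

// Plays the role of a FunctionPass: the constructor does no analysis work;
// all per-function work happens in runOnFunction().
struct VisitRecorderPass {
  std::vector<std::string> visited;

  // Returns false because this pass does not modify the function.
  bool runOnFunction(ToyFunction &F) {
    visited.push_back(F.name);
    return false;
  }
};

// Plays the role of the pass manager: one pass object, one call per function.
inline std::vector<std::string> runToyPipeline(ToyModule &M) {
  VisitRecorderPass P;                // created once for the whole module
  for (ToyFunction &F : M.functions)
    P.runOnFunction(F);               // invoked here, by the "manager"
  return P.visited;
}
```

Calling runToyPipeline() on a module with two functions records both 
names in module order using a single pass object.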

If you are using opt, use the -debug-pass=Structure argument to make opt 
print out the structure of passes.  That should make it clearer how 
passes are scheduled and run.

Finally, if you're confused about when your passes are run, it might be 
better to write your passes as ModulePasses first.  You can almost never 
go wrong writing a ModulePass, and they are simpler to understand than 
FunctionPasses.

>
> 2. Right now, my passes include several transform passes and analysis 
> passes. The transform passes also use some built-in analysis 
> passes, like AliasAnalysis and LoopInfo.

Your transform passes can safely use any existing LLVM passes that do 
not modify the IR (such as AliasAnalysis and LoopInfo).  The only real 
restriction is that you want to avoid using getAnalysis<>() to get 
pointers/references to passes that modify the IR.

> My transform passes move some instructions around based on some 
> analysis passes. Other analysis passes then use the modified IR. 
> Your suggestion is to dump the information needed by my analysis 
> passes into a new RK pass. I am not clear about it. Do you mean dump 
> the modified IR, and then pass that modified IR into my analysis 
> pass? If so, and my transform pass analyses the IR one basic block at 
> a time, do I need to dump something after analysing every basic 
> block? It seems I have misunderstood.

What, specifically, do your analysis passes need to know?  Do they need 
to know which IR is the modified IR and which was left unmodified, or 
do they need to know something else?  Can they infer everything they 
need to know just by looking at the IR?

If your analysis passes can determine everything they need to know by 
looking at the Module or Function passed into their 
runOnModule()/runOnFunction() methods, then you have no problem (and, in 
fact, you don't need your transform passes to communicate any additional 
information to your analysis passes).

However, if you need your transform pass to communicate information to 
your analysis passes, then you need to do something more sophisticated.  
Copying LLVM IR would be a bad idea (too much memory consumption); you 
would probably record pointers to the relevant IR objects instead.

Perhaps an example will be helpful.

Let's say that you write a pass (call it Pass A) that creates a clone of 
every function in a program.  You have an analysis pass (call it Pass B) 
that takes each function and finds the clone that Pass A created.  Pass 
A could implement a data structure that maps original functions to the 
clones it created and then provide a method that Pass B calls to query 
this information.  However, that would require Pass B to use 
addRequired<>() and getAnalysis<>() to get a pointer to Pass A.  That 
could create a scheduling conflict that the PassManager cannot handle 
(e.g., Pass A invalidates another pass that Pass B requires).

Instead, you create a Pass C that contains an empty map of functions to 
their clones.  The runOnModule() method of Pass C does nothing. Pass C 
provides a method to record a new function-to-clone mapping in its 
internal map, and it provides another method that takes a function and 
returns a pointer to its clone.  Pass A and Pass B both require Pass C as 
a dependency in their getAnalysisUsage() methods. Pass A tells Pass C 
about every clone it creates; Pass B queries Pass C any time it wants to 
look up the clone of a function. Additionally, Pass A states that it 
preserves Pass C in its getAnalysisUsage() method.

In this example, Pass C is simply a pass through which Pass A and B 
communicate without creating a scheduling conflict for the pass 
manager.  It is needed because Pass B needs information which is readily 
available in Pass A that cannot be easily inferred from the LLVM Module 
that Pass B analyzes.
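
The Pass A / Pass B / Pass C arrangement might be sketched as follows in 
plain C++, without LLVM headers.  All names here (SketchFunction, 
SketchModule, CloneRegistry, ClonerPass, CloneReportPass, recordClone, 
lookupClone) are hypothetical and chosen for illustration.  In a real 
LLVM pass, Pass A and Pass B would obtain the registry through 
getAnalysis<>() after declaring it in getAnalysisUsage(); this sketch 
simply passes it in as an argument.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm::Function and llvm::Module.
struct SketchFunction { std::string name; };
struct SketchModule {
  std::vector<std::unique_ptr<SketchFunction>> functions;
};

// Pass C: holds the map and does no work of its own.
class CloneRegistry {
  std::map<const SketchFunction *, SketchFunction *> clones;

public:
  bool runOnModule(SketchModule &) { return false; }  // intentionally a no-op

  void recordClone(const SketchFunction *orig, SketchFunction *clone) {
    clones[orig] = clone;
  }

  // Returns nullptr when no clone was recorded for orig.
  SketchFunction *lookupClone(const SketchFunction *orig) const {
    auto it = clones.find(orig);
    return it == clones.end() ? nullptr : it->second;
  }
};

// Pass A: a transform that clones every function and reports each clone to C.
struct ClonerPass {
  bool runOnModule(SketchModule &M, CloneRegistry &C) {
    const std::size_t originalCount = M.functions.size();
    for (std::size_t i = 0; i < originalCount; ++i) {
      SketchFunction *orig = M.functions[i].get();
      M.functions.push_back(std::unique_ptr<SketchFunction>(
          new SketchFunction{orig->name + ".clone"}));
      C.recordClone(orig, M.functions.back().get());
    }
    return originalCount != 0;  // true when the module was modified
  }
};

// Pass B: an analysis that only queries C and never touches Pass A.
struct CloneReportPass {
  std::vector<std::string> cloneNames;

  bool runOnModule(const SketchModule &M, const CloneRegistry &C) {
    for (const auto &F : M.functions)
      if (const SketchFunction *clone = C.lookupClone(F.get()))
        cloneNames.push_back(clone->name);
    return false;  // analysis only, nothing modified
  }
};
```

The registry stores pointers rather than copies of the functions, in 
line with the earlier advice to record pointers to IR objects instead of 
copying IR.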

There are, of course, alternatives to this approach.  Pass A could put 
metadata on the clones it creates that indicates that they are clones of 
other functions; Pass B then looks for this metadata.  As long as other 
transforms don't remove the metadata, this works.
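
A minimal model of the metadata alternative, again in plain C++ rather 
than against the LLVM API: TaggedFunction, the "clone.of" key, and the 
helper functions are all assumptions made up for this sketch, and the 
string map only imitates LLVM's metadata attachments.  It also shows the 
caveat above, since the link is lost once some other transform strips 
the tag.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Function with string-keyed metadata.
struct TaggedFunction {
  std::string name;
  std::map<std::string, std::string> metadata;
};

// The key name is an assumption made for this sketch.
static const char *const kCloneOfKey = "clone.of";

// Pass A side: create a clone and tag it with the name of its original.
inline TaggedFunction makeTaggedClone(const TaggedFunction &orig) {
  TaggedFunction clone;
  clone.name = orig.name + ".clone";
  clone.metadata[kCloneOfKey] = orig.name;
  return clone;
}

// Pass B side: recover the original's name from the tag alone.
// Returns an empty string when the tag is absent.
inline std::string originalNameOf(const TaggedFunction &F) {
  auto it = F.metadata.find(kCloneOfKey);
  return it == F.metadata.end() ? std::string() : it->second;
}

// Models an unrelated transform that drops all metadata.
inline void stripMetadata(TaggedFunction &F) { F.metadata.clear(); }
```

Unlike the Pass C approach, nothing here needs to be declared in 
getAnalysisUsage(), because Pass B reads everything it needs from the 
function itself.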

Regards,

John Criswell

-- 
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
