[LLVMdev] Hint about how to contribute to LLVM

Tue Oct 11 19:31:38 PDT 2011

Alex Garzao wrote:
> Dear all,
>
> If possible, I would like to contribute to LLVM. First of all, I would
> like to say that I'm a newbie in LLVM, and my experience in compiler
> implementation is limited to simple, academic "toy projects".
>
> My main interest is in improving the code generated by LLVM, where by
> "improve" I mean "better performance". I'm interested, for example, in
> machine-independent optimizations but, reading more about LLVM, it
> seems to be complete.

Far from it! Take a look inside lib/Target/README.txt. It's full of 
entries like this:

> [LOOP DELETION]
>
> We don't delete this output free loop, because trip count analysis doesn't
> realize that it is finite (if it were infinite, it would be undefined).  Not
> having this blocks Loop Idiom from matching strlen and friends.
>
> void foo(char *C) {
>   int x = 0;
>   while (*C)
>     ++x,++C;
> }

A number of them are hard to approach (if they were easy, someone would 
have done them -- try reading the file from the bottom, as new entries 
get appended), but they're all good places to start poking into the 
optimizer. As a beginner, I'd avoid ones that have caveats like "but to
do this, we need to change the codegen to legalize it back" (i.e., undo
the optimization if the target doesn't have the appropriate instruction).
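
To make the README entry above concrete, here is a minimal hand-written
sketch in C++ of what the two transformations it mentions would amount
to. The function names are hypothetical and nothing here comes from LLVM
itself; it only illustrates the intended results at the source level.

```cpp
#include <cstddef>
#include <cstring>

// The loop from the README entry, but with the count returned so that
// it is live. This is roughly the shape that idiom recognition would
// want to treat as a strlen call.
std::size_t count_by_loop(const char *C) {
  std::size_t x = 0;
  while (*C) {
    ++x;
    ++C;
  }
  return x;
}

// What the idiom-recognized version would compute.
std::size_t count_by_strlen(const char *C) {
  return std::strlen(C);
}

// In the README's original function x is never used, so once the loop
// is known to be finite the whole body is dead. This empty function is
// the ideal result.
void foo_after_loop_deletion(const char *) {}
```

Comparing the optimized IR for the first function against the second is
one way to see whether the idiom was matched.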

Another great source of ideas, I find, is just playing with the STL,
writing:

   #include <vector>

   void foo() {
     std::vector<int> v;
   }

and verifying that this actually gets deleted. Try different types in
place of vector, or call methods on it (e.g., add "v.push_back(5);" to
this example, and now the code doesn't get deleted!).
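
A minimal sketch of that experiment as a single file; the function names
are hypothetical. Compiling it with "clang++ -O2 -S -emit-llvm" writes a
.ll file in which you can check what is left of each function body. What
actually survives depends on the compiler version, so treat the comments
as the ideal outcome rather than a guarantee.

```cpp
#include <cstddef>
#include <vector>

// Nothing observable happens here, so ideally the optimized body is
// just a return.
void unused_vector() {
  std::vector<int> v;
}

// Still nothing observable, but push_back allocates memory, which makes
// the dead code harder for the optimizer to prove removable.
void unused_vector_with_push() {
  std::vector<int> v;
  v.push_back(5);
}

// The result depends on the vector, so this function cannot simply
// become empty.
std::size_t used_vector() {
  std::vector<int> v;
  v.push_back(5);
  return v.size();
}
```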

Don't worry about the performance impact of your changes yet. Once you're
comfortable analyzing LLVM IR and adding optimizations (or learning when
to move on), you'll start seeing missed optimizations in .ll files
everywhere. You don't necessarily want to start with a profiler to decide
which optimizations to tackle; a memory access saved here may avoid
calling a slow function entirely, or call it half as often, etc. I
generally consider optimizations that apply to load/store instructions to
be the most important, then optimizations on loop structure, then
libcalls, float math (hard, but with lots of low-hanging fruit), integer
divide/remainder, and finally everything else.

Nick

> If possible, I would like some suggestions about current possibilities
> and, more than that, "additional reading" and a "current status" if
> someone is currently working on this.
>
> Looking at http://llvm.org/OpenProjects.html, one point seems interesting:
>
> "Miscellaneous Improvements. Move more optimizations out of the
> -instcombine pass and into InstructionSimplify. The optimizations that
> should be moved are those that do not create new instructions, for
> example turning sub i32 %x, 0 into %x. Many passes use
> InstructionSimplify to clean up code as they go, so making it smarter
> can result in improvements all over the place.".
>
> For a newbie, what is the complexity of this task? Would someone
> suggest other tasks?
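
To illustrate the distinction the OpenProjects entry draws, here is a toy
sketch in C++ of a simplification that creates no new instructions. The
ToyValue type and toy_simplify_sub function are hypothetical and are NOT
LLVM's classes or API; they only show the shape of such a rule, which
either returns a value that already exists or reports that it has
nothing to offer.

```cpp
#include <cstdint>

// Hypothetical toy IR node, not LLVM's Value class.
struct ToyValue {
  bool is_constant;
  std::int64_t constant; // meaningful only when is_constant is true
};

// Simplify-style rule for "sub lhs, rhs": return an existing value that
// the result is equal to, or nullptr if no such value is known. It
// never allocates a new node.
const ToyValue *toy_simplify_sub(const ToyValue *lhs, const ToyValue *rhs) {
  if (rhs->is_constant && rhs->constant == 0)
    return lhs; // sub %x, 0  ->  %x
  return nullptr;
}
```

A rule that had to build a replacement instruction would not fit this
signature, which is why it would stay in the instcombine-style pass.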