[llvm-dev] How to contribute on LLVM project as beginner

Chris Ye via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 23 01:09:22 PDT 2019


Hi Paul,
Thanks for your useful guidelines. May I confirm with you whether the steps listed below are correct?


1. Find sample code (.c).
2. Compile the sample code with clang using different options (passes), with and without "-g".
3. objdump output.o and output-g.o.
4. Compare the text sections of the two files and check whether there is any difference.
5. If there is a difference, great: file a bug and fix it.


Please correct me if I missed something.



Following the steps:

* I used sample code (foo.c)

------------------------------------------------------------------------------------------
int foo() { return 42; }
int bar() { return foo(); }
------------------------------------------------------------------------------------------


* created a compare tool (compare.sh)
------------------------------------------------------------------------------------------
#!/bin/bash

options=$1
file=$2

# Build once without and once with -g; all other flags are identical.
clang -c -ffunction-sections -fexceptions -mllvm -opt-bisect-limit=200 $options "$file" -o output.o
clang -c -ffunction-sections -fexceptions -mllvm -opt-bisect-limit=200 $options -g "$file" -o output-g.o
 
objdump -d output.o > output.objdump
objdump -d output-g.o > output-g.objdump
 
diff -uNar output.objdump output-g.objdump
------------------------------------------------------------------------------------------



* Then ran the comparison tests

------------------------------------------------------------------------------------------
$ ./compare.sh -O0 foo.c
$ ./compare.sh -O1 foo.c
$ ./compare.sh -O2 foo.c
$ ./compare.sh -O3 foo.c
------------------------------------------------------------------------------------------


The diff result is the same in every case. How can I find a bug? Is the sample code I used too simple? Or do I need to add other pass options? Please correct my steps if I missed something. Thank you very much.



Best Regards,
Chris Ye



At 2019-07-17 01:40:43, paul.robinson at sony.com wrote:


Hi Chris,

 

"Debug info should have no effect on codegen" would be a fine project for you; nobody is working on it that I know of.  Another way to contribute would be to go to our Bugzilla (bugs.llvm.org) and search for open bugs with the "beginner" keyword.

 

Regarding the "debug info has no effect on codegen" project, unfortunately I am having IT issues that keep me from providing much in the way of specific suggestions, so what follows is fairly generic.

In principle, you compile some piece of code with and without -g, and see if there is any difference in the generated instructions. My experience is that you want to compile to a .o file, and then use a disassembler to dump the text sections. This will give you a cleaner diff than using -S to generate assembler files.

I also recommend compiling with `-ffunction-sections` and probably `-fexceptions`.  The former will put each compiled function into its own object-file section, so that differences in one function won't affect the disassembly of a later function.  The latter option should work around one fairly intractable known difference: -g will cause the compiler to emit directives to produce call-frame information, and these tend to act as instruction-scheduling barriers. Using -fexceptions (I am 95% sure that is the correct option) should cause the non-dash-g compilation to use the same directives, and avoid that known difference.

You can repeat this experiment with different optimization levels, as differences are far more likely to show up with optimization.

 

Once you find a difference, you can begin experimenting with ways to identify specific compiler passes that are contributing to the difference. A very useful tool here is the backend option `-opt-bisect-limit=N` where N is the number of passes to execute. Because it is a backend option, you would use it this way:

    clang -c -O2 -mllvm -opt-bisect-limit=100 foo.c -o foo.o

    clang -c -O2 -mllvm -opt-bisect-limit=100 foo.c -g -o foo-g.o

Then disassemble and diff as usual.  After you have identified a problematic pass, you can try your hand at fixing it yourself, or you can file a bug (with a reduced reproducer if at all possible) and move on to another sample.
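
The search for the first problematic value of N could be automated with a binary search. What follows is only a sketch under stated assumptions, not a tested workflow: the helper names `same_codegen` and `first_bad_limit` are hypothetical, -O2 is hard-coded to match the commands above, and it assumes that once a difference appears at some N it stays for every larger N, which may not always hold.

```shell
#!/bin/bash
# Sketch: find the smallest -opt-bisect-limit at which -g changes the code.

# same_codegen FILE N (hypothetical helper): succeeds when the builds with
# and without -g disassemble identically at bisect limit N.
# Limitation: a failed compile leaves both dumps empty and counts as "same".
same_codegen() (
    src=$1; n=$2
    tmp=$(mktemp -d) || exit 2
    flags="-c -O2 -ffunction-sections -fexceptions -mllvm -opt-bisect-limit=$n"
    clang $flags "$src" -o "$tmp/plain.o" 2>/dev/null
    clang $flags -g "$src" -o "$tmp/debug.o" 2>/dev/null
    # Drop the "file format" header line, which names the object file.
    objdump -d "$tmp/plain.o" | grep -v 'file format' > "$tmp/plain.dis"
    objdump -d "$tmp/debug.o" | grep -v 'file format' > "$tmp/debug.dis"
    cmp -s "$tmp/plain.dis" "$tmp/debug.dis"
    status=$?
    rm -rf "$tmp"
    exit "$status"
)

# first_bad_limit LO HI CHECK [ARGS...] (hypothetical helper): binary search.
# Assumes "CHECK ARGS LO" succeeds and "CHECK ARGS HI" fails.
first_bad_limit() (
    lo=$1; hi=$2; shift 2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$@" "$mid"; then lo=$mid; else hi=$mid; fi
    done
    echo "$hi"
)
```

Running `first_bad_limit 0 200 same_codegen foo.c` would print the smallest N in that range at which the two disassemblies differ, provided they match at 0 and differ at 200. The pass with that number can then be looked up in the BISECT lines that clang normally prints to stderr when the option is used.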

 

Of course you will need some sample source code to run experiments on.  This can be anything convenient. You could try it on any personal projects you have, or you could find a random code generator, or whatever you like.  Some people have recommended LLVM's own 'test-suite' project although I have not looked at it in any detail.
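
With more than one sample file, the comparison can be run in a loop over files and optimization levels. The following is a minimal sketch, assuming clang and objdump are on the PATH; the function names `check_sample` and `scan_samples` and the status words are made up for illustration.

```shell
#!/bin/bash
# Sketch: report which samples disassemble differently with -g.

# check_sample OPT FILE (hypothetical helper): prints "same", "DIFF", or
# "error" (when either compile fails).
check_sample() (
    opt=$1; src=$2
    tmp=$(mktemp -d) || { echo error; exit 0; }
    flags="-c $opt -ffunction-sections -fexceptions"
    if clang $flags "$src" -o "$tmp/plain.o" 2>/dev/null &&
       clang $flags -g "$src" -o "$tmp/debug.o" 2>/dev/null; then
        # Drop the "file format" header line, which names the object file.
        objdump -d "$tmp/plain.o" | grep -v 'file format' > "$tmp/plain.dis"
        objdump -d "$tmp/debug.o" | grep -v 'file format' > "$tmp/debug.dis"
        if cmp -s "$tmp/plain.dis" "$tmp/debug.dis"; then
            echo same
        else
            echo DIFF
        fi
    else
        echo error
    fi
    rm -rf "$tmp"
)

# scan_samples FILE... (hypothetical helper): one "STATUS OPT FILE" line
# per file and optimization level.
scan_samples() {
    for src in "$@"; do
        for opt in -O0 -O1 -O2 -O3; do
            printf '%s %s %s\n' "$(check_sample "$opt" "$src")" "$opt" "$src"
        done
    done
}
```

For example, `scan_samples *.c | grep -v '^same'` would list only the combinations that differ or fail to compile.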

 

Good luck, and feel free to post additional questions on llvm-dev if you run into any problems.

--paulr

 

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Chris Ye via llvm-dev
Sent: Sunday, July 14, 2019 11:59 PM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] How to contribute on LLVM project as beginner

 

Hi LLVM project Leaders,

I am a software engineer working on several other open source projects. Recently I have become very interested in LLVM technology, especially the backend. I have spent two months studying the documents on llvm.org in my spare time. As a beginner, I would like to contribute some code to the LLVM project. In the "Google Summer of Code 2019" list I found one project, "Debug Info should have no effect on codegen", that I may be able to contribute to. Has that project already been completed? If there are still tasks left, how can I join in? Or is there any other project I can work on? I can spend 10~20 hours on LLVM development every week, as I want to gather experience to find a job as an LLVM developer in the future. I am a quick learner, and I would be very appreciative if you could help me and give me some guidance, so that I can move faster on my way into the LLVM field. Many thanks.

 

-----------------------------------------------------------------

LLVM

Debug Info should have no effect on codegen

Description of the project: Adding Debug Info (compiling with `clang -g`) shouldn't change the generated code at all. Unfortunately we have bugs. These are usually not too hard to fix and a good way to discover new parts of the codebase! We suggest building object files both ways and disassembling the text sections, which will give cleaner diffs than comparing .s files.

Expected results: Reduced test cases, bug reports with analysis (e.g., which pass is responsible), possibly patches.

Confirmed Mentor: Paul Robinson

Desirable skills: Intermediate knowledge of C++, some familiarity with x86 or ARM instruction set.

 

Best Regards,

Chris Ye

 

 