[LLVMdev] Updated GSoC Proposal

Reid Spencer rspencer at reidspencer.com
Sun Mar 25 09:56:14 PDT 2007


Hi Tilmann,

This looks good. I have just a few comments below but they are minor.

On Sun, 2007-03-25 at 15:51 +0200, Tilmann Scheller wrote:
> Thank you very much for the feedback. I tried to address the issues that 
> were raised in this updated proposal. If you have any suggestions or 
> comments, feel free to tell me.
> 
> Thanks in advance
> 
> Tilmann
> 
> 
> * Proposal for Google Summer of Code Project
> 
> ** Using LLVM as a backend for QEMU's dynamic binary translation
> 
> *** Terms:
> - host   architecture: the architecture of the CPU QEMU is running on
> - target architecture: the architecture of the program which is being 
> executed within QEMU
> 
> 
> *** Abstract:
> The goal of this project is to modify the QEMU dynamic binary translator 
> to use components of the LLVM compiler infrastructure to turn it into a 
> highly optimizing dynamic binary translator in order to increase the 
> performance of QEMU even further. Instead of directly emitting code for 
> the host architecture QEMU is running on, the target code is first 
> translated to LLVM IR, then a selection of LLVM's optimization functions 
> is applied to the IR and as a last step the LLVM JIT is used to generate 
is -> are
also, I'd drop "as a last step"
> code from the optimized IR for the host architecture. Since the 
> translation to LLVM IR, the optimization and the code generation come 
> at the cost of increased translation time, it is not feasible to apply 
> this process to every piece of code; otherwise overall performance would 
> be even lower. On average a program spends 90% of its time within 10% of 
> the code, so it is critical to get this 10% to execute fast. The other 
> 90% of the code might execute only once or a few times, and the extra 
> time spent generating optimized code would not pay off. Therefore the 
> idea is to identify the "hotspots" by 
> counting how many times a piece of code has been executed, e.g. on basic 
> block level, and performing an optimizing translation once a certain 

It's probably easier to do at the function level, at least initially.

> threshold is hit or falling back to the current binary translation of 
> QEMU if not.

This is better, but you might want to indicate your ideas for
identifying those hotspots if you can do it in a few words. Otherwise,
you can, perhaps, put those details on a web page referenced from this
proposal (which Google encourages).
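
To make the counting idea concrete, here is a minimal sketch in C of
per-unit execution counters with a promotion threshold. Everything in it
is an assumption for illustration: the names (record_execution,
block_profile_t), the fixed-size table and the threshold value are
hypothetical and are not taken from QEMU or LLVM. The "unit" could be a
basic block or a function, whichever level the proposal settles on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning values; a real threshold would come from measurements. */
#define HOT_THRESHOLD 3u
#define NUM_BLOCKS    4u

typedef enum { TIER_BASELINE, TIER_OPTIMIZED } tier_t;

typedef struct {
    uint32_t exec_count; /* executions seen while still on the baseline path */
    tier_t   tier;       /* which translation this unit currently uses */
} block_profile_t;

/* Zero-initialized: every unit starts cold, on the baseline translator. */
static block_profile_t profiles[NUM_BLOCKS];

/*
 * Record one execution of a translation unit and return the tier it
 * should run with. Once the counter reaches HOT_THRESHOLD the unit is
 * promoted and stays promoted; colder units keep using the baseline path.
 */
static tier_t record_execution(unsigned block_id)
{
    assert(block_id < NUM_BLOCKS);
    block_profile_t *p = &profiles[block_id];

    if (p->tier == TIER_BASELINE && ++p->exec_count >= HOT_THRESHOLD) {
        p->tier = TIER_OPTIMIZED; /* here the optimizing translation would run */
    }
    return p->tier;
}
```

With these values, the first two calls for a given id return
TIER_BASELINE and the third and all later calls return TIER_OPTIMIZED.
The counter stops incrementing after promotion, so it cannot overflow for
long-running hot code.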

> Detailed speed measurements will be performed in order to evaluate the 
> efficiency of this approach, especially in comparison to the approach 
> currently used by QEMU.
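
One way such measurements could be used is a simple break-even model for
when the optimizing translation pays off. The sketch below is only an
illustration under stated assumptions: the function names and the idea of
expressing all costs in one abstract integer "time unit" are hypothetical,
and real costs would have to be measured.

```c
#include <stdint.h>

/*
 * Smallest number of executions for which
 *     opt_cost + runs * t_opt  <  runs * t_base
 * holds, i.e. the one-time optimization cost is recovered.
 * Returns 0 if the optimized code is not faster per run, in which case
 * optimizing never pays off.
 */
static uint64_t break_even_runs(uint64_t opt_cost, uint64_t t_base, uint64_t t_opt)
{
    if (t_base <= t_opt) {
        return 0;
    }
    uint64_t saving_per_run = t_base - t_opt;
    /* Strict inequality runs * saving > opt_cost, hence the + 1. */
    return opt_cost / saving_per_run + 1;
}

/* Returns 1 if optimizing is a net win for the given number of runs. */
static int optimization_pays_off(uint64_t opt_cost, uint64_t t_base,
                                 uint64_t t_opt, uint64_t runs)
{
    return opt_cost + runs * t_opt < runs * t_base;
}
```

For example, with a made-up optimization cost of 1000 units, 10 units per
baseline run and 4 units per optimized run, break_even_runs returns 167:
code executed fewer times than that is cheaper to leave on the existing
translation path.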
> 
> 
> *** Benefits:
> QEMU will largely benefit from this project through an expected increase 
> in speed, while remaining portable.
> Through this project LLVM will effectively get front ends for all target 
> architectures supported by QEMU (at the moment these are x86, ARM, 
> SPARC, PowerPC and MIPS). This lays the ground for the application of 
> LLVM on binary code which could be e.g. the optimization of binaries 
> where no source code is available, the instrumentation of binary code 
> (e.g. for performance analysis), program analysis of binary code to 
> assist in reverse engineering or static recompilation (depending on the 
> instruction set this requires additional runtime code).
> This project is a first step towards making LLVM suitable for static 
> or dynamic binary translation, thereby attracting new LLVM users 
> who are interested in this subject.
> It will show the applicability of LLVM in an emulation environment, 
> especially in regard to dynamic binary translation. It can also be used 
> as a basis to try out concepts like profile-guided optimization or 
> static optimization in the context of an emulator.

Much better :)

> Also since the LLVM JIT will be used for the final code generation QEMU 
> can be hosted on any architecture targeted by the LLVM JIT (at the 
> moment these are x86, x86-64, PowerPC and PowerPC 64), at least 
> concerning code generation. Further adjustments to QEMU might be 
> necessary though to get QEMU to run on a certain architecture which is 
> supported by the LLVM JIT but not by QEMU.
> 
> 
> *** Deliverables:
> - a version of QEMU with an optimizing dynamic binary translator 
> utilizing LLVM components
> - a set of test suites which are created during the development (with at 
> least 80% statement coverage)
> - all necessary documentation to understand and be able to maintain the 
> software
> 
> 
> *** Plan:
> The development of the software will be done within the three-month 
> timeframe of GSoC. Weekly status reports will be given.
> 
> Week 1:
>       - get familiar with LLVM and QEMU
>       - write small test programs for certain LLVM components, or even a 
> simple prototype
>       - get to know LLVM example programs
> Week 2, 3, 4:
>       - modify QEMU's dynamic binary translator to emit LLVM IR
>       - create tests to verify the translation
> Week 5, 6:
>       - integrate LLVM JIT into QEMU's dynamic binary translator
>       - perform first speed measurements
> Week 7, 8:
>       - integrate LLVM optimizations into QEMU
>       - perform more speed measurements, select useful optimizations
> Week 9, 10:
>       - test the system extensively
>       - write final documentation
> Week 11, 12:
>       - time buffer to deal with unexpected events

Looks good.

> 
> *** Qualification:
> I'm a graduate student studying Software Engineering at the University 
> of Stuttgart in Germany. I have a strong interest in compiler technology 
> and see this project as a great opportunity to gain experience in this 
> field. I have taken a compiler building class and plan to focus my 
> future studies in this area.
> Emulation is another area I'm interested in. I wrote a Game Boy Advance 
> emulator in C from scratch and a GP32 emulator based on QEMU (also C). 
> While doing this I gained a basic understanding of the QEMU codebase.
> I'm currently involved in a university project which develops a testing 
> tool for glass box tests for Java and COBOL, which makes it possible to 
> gather certain coverage metrics, and which will be open-sourced later 
> this year.
> I have decent experience with C and Java and I'm familiar with C++. Also 
> I have a deep understanding of the ARM architecture and I'm familiar 
> with x86.
> This project is a big chance for me to give something back to the open 
> source community, especially since both LLVM and QEMU can profit from 
> this project.

Nice job. A few minor tweaks and "ship" it :)

Reid.

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



