[llvm-dev] A couple ideas for possible GSoC projects

Tue Mar 22 16:54:19 PDT 2016

And I thought of another idea immediately after I sent this...  Of course.  :)

Modify Bugpoint to Reduce a *Particular* Crash
Bugpoint is our test case reduction tool.  Today, bugpoint will reduce 
*any* crash it observes.  While this is surprisingly useful as a fuzzing 
tool, it can be frustrating when trying to isolate a particular failure 
out of a large IR file.  Bugpoint can sometimes end up introducing 
invalid IR - due to bugs in bugpoint itself - which leads to 
"uninteresting" crash reductions as well.  A very useful enhancement for 
bugpoint would be to identify a "signature" of the original crash and 
only consider a reduction to succeed if the "signature" is preserved.  
(For example, the signature of an assertion failure might be the file 
and line printed to the console.)
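
To make the "signature" idea a bit more concrete, here is a rough sketch 
in standalone C++.  It is not bugpoint code.  The function names are made 
up for illustration, and it assumes the crash output contains a 
glibc-style assertion message of the form 
"file:line: function: Assertion `expr' failed."  Other crash kinds (e.g. a 
segfault) would need a different signature, such as a stack frame.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns "File.cpp:123" for an assertion failure found
// in the tool's output, or an empty string if no assertion message is found.
// Assumes a glibc-style "path/File.cpp:123: func: Assertion `x' failed." line.
std::string extractSignature(const std::string &output) {
  static const std::regex pattern(R"(([^\s:/]+):(\d+):[^\n]*Assertion)");
  std::smatch m;
  if (!std::regex_search(output, m, pattern))
    return "";
  // Use only the file's base name and line so build paths do not matter.
  return m[1].str() + ":" + m[2].str();
}

// Hypothetical acceptance test for one reduction step: the candidate only
// counts as a successful reduction if it crashes with the original signature.
bool preservesSignature(const std::string &originalSignature,
                        const std::string &candidateOutput) {
  assert(!originalSignature.empty() && "original crash needs a signature");
  return extractSignature(candidateOutput) == originalSignature;
}
```

A reduction loop would compute the signature once from the original 
crash and then call something like preservesSignature() on the output of 
each candidate, rejecting candidates that crash somewhere else.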

A bonus feature would be to save each intermediate crash which produced 
an alternate signature so that they could be independently reduced at a 
later time.

Fair warning, the bugpoint codebase is much less well maintained than 
the rest of LLVM.  Anyone who takes on this project should expect to 
need to do some incremental cleanup in order to make progress.  There is 
also a lot of room for further enhancements to bugpoint if the initial 
project is finished early.

On 03/22/2016 04:43 PM, Philip Reames wrote:
> If there are any students looking for ideas, here are a couple of projects 
> you might consider.
>
> p.s. Anyone know where in the repo the OpenProjects page is?  I'd 
> expected it to be the docs/ folder of the LLVM repo, but it wasn't.
>
> Transactional Memory Optimization
> Intel recently introduced transactional memory support in hardware. 
> This project would consist of implementing optimizations which are 
> only legal inside a transactional region (e.g. removing release fences 
> since all stores must appear atomically as seen by other threads.)  
> This project will involve a good amount of research into what the 
> hardware guarantees and reasoning about what optimizations that 
> allows.  This project is probably best for a student with experience 
> reasoning about concurrency at the hardware level.
>
> Thread Local Optimization
> A compare-and-swap instruction executed against a location which is 
> provably thread local can be converted into a load, test, and store.  
> Many other instructions intended for cross thread interaction have 
> cheaper variants which apply to thread local locations.  The key part 
> of this project will be introducing an analysis (probably based on 
> CaptureTracking) to prove memory locations are thread local.  This 
> project is probably best for a student with experience reasoning about 
> concurrency in software.
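
As a source-level illustration of the compare-and-swap case (a sketch 
only; the real transform would operate on the IR cmpxchg instruction, and 
the function names here are invented for the example):

```cpp
#include <atomic>
#include <cassert>

// General form: needed when other threads may observe the location.
bool casShared(std::atomic<int> &loc, int expected, int desired) {
  return loc.compare_exchange_strong(expected, desired);
}

// Thread-local form: a plain load, test, and store.  This is only valid if
// the location is provably not visible to any other thread.
bool casThreadLocal(int &loc, int expected, int desired) {
  if (loc != expected)
    return false;
  loc = desired;
  return true;
}

// Hypothetical check that the two forms agree when only one thread is
// involved: same success result and same final value.
void checkCasFormsAgree(int initial, int expected, int desired) {
  std::atomic<int> shared(initial);
  int local = initial;
  bool a = casShared(shared, expected, desired);
  bool b = casThreadLocal(local, expected, desired);
  assert(a == b);
  assert(shared.load() == local);
}
```

The interesting work in the project is the analysis that justifies using 
the second form, not the rewrite itself.
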
>
> Capture Tracking Improvements
> Our current capture tracking analysis (see CaptureTracking.cpp) is 
> relatively weak and expensive.  It could be improved in a number of ways:
> 1) Review cases where "potentially captured" is currently returned for 
> possible false positives.  There are a couple of known ones which 
> would make good introductory patches.
> 2) The results could be cached to amortize the cost.  This will 
> require careful design to ensure the result is invalidated when 
> appropriate.
> 3) A more precise analysis could be used to identify object sub-graphs 
> which do not escape (despite links between them).
> This would be a good project for a student interested in compiler 
> analysis with strong skills in designing appropriate data structures 
> for the task at hand.
>
> Dereferenceability Analysis
> Currently LLVM has a number of places in code which reason about the 
> dereferenceability of memory.  This project would involve 
> consolidating all of those into a common set of utilities, reviewing 
> each use for possible improvements in compile time and precision, and 
> possibly introducing a caching layer (analysis pass) to allow the use 
> of more expensive reasoning.
>
> ValueTracking Analysis
> ValueTracking is a key analysis framework used by LLVM, but it is 
> currently structured as a set of recursive queries with limited 
> depth.  Queries are not cached, which can cost a substantial amount 
> of compile time.  Introducing a caching layer (probably in the 
> form of an Analysis pass) would be a useful improvement.  However, 
> there are significant complications around invalidation of the cached 
> information (e.g. simplifyDemandedBits) which will require careful 
> thought and design.  This project would probably be best for a student 
> with some existing experience in the LLVM code base.
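
A toy sketch of the caching-plus-invalidation problem, in standalone C++.  
Nothing here is an LLVM API: the Node type, the "known even" query, and 
the EvenCache class are all hypothetical stand-ins for IR values, a 
ValueTracking-style recursive query, and an analysis that caches it.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical miniature IR: a node is a constant leaf or an add of operands.
struct Node {
  bool isConst = false;
  long value = 0;               // meaningful only when isConst is true
  std::vector<Node *> operands; // meaningful only when isConst is false
  std::vector<Node *> users;    // reverse edges, needed for invalidation
};

class EvenCache {
  std::unordered_map<const Node *, bool> cache;

public:
  unsigned computations = 0; // counts uncached evaluations

  // Conservative recursive query: an add is "known even" only if every
  // operand is known even.  Results are memoized per node.
  bool isKnownEven(const Node *n) {
    assert(n && "query on null node");
    auto it = cache.find(n);
    if (it != cache.end())
      return it->second;
    ++computations;
    bool result = true;
    if (n->isConst) {
      result = (n->value % 2 == 0);
    } else {
      for (const Node *op : n->operands)
        if (!isKnownEven(op)) {
          result = false;
          break;
        }
    }
    cache[n] = result;
    return result;
  }

  // When a node changes, its own entry and every transitive user's entry
  // may be stale, so all of them are dropped.  Assumes an acyclic graph.
  void invalidate(const Node *n) {
    cache.erase(n);
    for (const Node *user : n->users)
      invalidate(user);
  }
};
```

The hard part hinted at above is knowing when invalidate() must be called: 
any transform that rewrites a value has to notify the cache, or later 
queries will silently return stale answers.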