[LLVMdev] Race Condition discovery algorithm using LLVM?

Tue Jan 11 14:33:09 PST 2005

Hello,

	I'm new to LLVM and have spent the last few weeks reading up on the
documents & code.  I would like to run a project idea by the senior members
before I start digging and find out I'm over my head. ;)

	After reading the Data Structure Analysis paper, it occurred to me
that a relatively minor modification to the pass should allow for Race
Condition discovery.  By this I mean generate warnings when data structures
are accessed in a potentially thread un-safe manner such as reading/writing
by two threads without synchronization.

	The algorithm would work something like this: Data is input
alongside the Data Structure Analysis pass (perhaps using a separate input
file if necessary) which includes a list of Thread Roots, and a list of
pairs of lock/unlock functions.  During the DSA pass, structures which are
accessed outside lock/unlock pairs are marked as being Unsafe.  At the end
of the pass, the Thread Root functions' DSA graphs are compared.  Those
structures which are accessed by multiple Thread Roots (and marked Unsafe)
are reported as potential race conditions.

	A separate pass may be needed to generate useful warning information
(where in each thread's call stack the potential race condition occurs) such
that a programmer can determine if the 'potential' is a real threat or not.

	My first examination at the code led me to
/llvm/Analysis/DataStructure/Local.cpp, which has a function
GraphBuilder::visitCallSite().  This appears to be the ideal spot to add a
hook for additional DSA processing, such as checking lock/unlock function
calls.  Also, hooks at the beginning and end of the pass, as well as a hook
in the visitor tree for setting flags.

	My question is; for a first-time project in LLVM, is this going to
get me over my head?  I have significant application & system design
experience, but no compiler experience.  Also, do you feel the data
generated from such a pass would be useful, or would there be too many
false-positives in a typical application?

Thanks!
--Bryan
bryan.turner at pobox.com