[LLVMdev] GSoC Proposal: Inter-Procedure Program Slicing in LLVM

Mingliang LIU liuml07 at gmail.com
Sat Apr 27 12:59:50 PDT 2013


Hi all,

This is a GSoC 2013 proposal for LLVM project. Please see the formatted
version at here:
http://pacman.cs.tsinghua.edu.cn/~liuml07/files/gsoc2013-proposal-program-slicing.pdf

Program slicing has been used in many applications, the criteria of which
is a pair of statement and variables. I would like to write an
inter-procedural program slicing pass in LLVM, which is able to calculate C
program slices of source code effectively. There is no previous work
implemented in LLVM, which considers both the dynamic program slicing and
source code of the sliced program. Program slice contains all statements in
a program that directly or indirectly act the value of a
variable occurrence [5]. Program slicing has been used in many
applications, e.g., program verification, testing, maintenance, automatic
parallelization of program execution, automatic integration of program
versions.

While it's straightforward to implement the slicer in the back-end of
compiler using SSA form, the source code of the original program instead of
intermediate
representation is preferred in most cases. Moreover, we can further narrow
the notion of slice, which contains statements that influence the value of a
variable occurrence for speci c program inputs. This is referred as dynamic
program slicing [1]. However, there is no previous work implemented in LLVM
which solved the two problems.

There are two public projects which implement the backwards static slicing
in LLVM.

   - Giri Written by John Criswell from UIUC, a subproject of LLVM. The
   Giri code contains the static backwards intra-procedure slicing passes, and
   runs with an older version of LLVM. It also only backtracks until it hits a
   load. Additional code must be written to backtrack further to
   find potentially reaching stores.
   - LLVMSlicer This implementation is a complete static backwards slicer
   from Masaryk University. It works on the well de ned data and control flow
   equations in a white paper by F. Tip [4]. However, this code was written
   for special purpose, thus it's not general enough to be use by others. They
   implemented the Andersen's alias algorithm [2], callgraph, and modi es
   analysis to support the slicer, instead of using the LLVM APIs.


They eliminate the useless IR statements and keep the ones a ect the values
of the criteria. However, neither of them generates the compilable source
code
slice, which is heavily needed in reality. There are several ways to do
this. One is to generate the source code from sliced IR using llc tool. The
issue is that
the IR is not concerned with high-level semantic. The generated source code
is different from the original program and not suitable for human reading.
An-
other approach is deleting the source code according to sliced IR, with
line number information (in meta-data of each instruction). The naive
script deleting
sliced source code one by one fails to handle tricky cases. I think a
better source code slicer is to take use of the AST info.

There are three main parts of this work.

   1. First, borrow an implementation of static back-wards slicing from
   Giri or LLVMSlicer, and use the LLVM callgraph, mod/ref and alias
   interfaces as much as possible.
   2. Second, implement the dynamic program slicing using the approach 3 in
   the paper [1].
   3. Third, generate the source code of the sliced program. To make the
   sliced source code compile directly, we need to employ clang front-end


The final result of this summer of code is to make this pass work e
effectively and documented well. Further, I'll write test cases and behave
as the active
maintainer for this project. My long-term plan is to add more features,
e.g. Objective-C/C++ support, thing slicing [3], to this project.

Any comment is highly appreciated.


[1] H AGRAWAL. Dynamic Program Slicing. In ACM SIGPLAN Conference on
Programming Language Design and Implementation (PLDI),1990.
2] L.O. Andersen. Program analysis and specialization for the C programming
language. PhD thesis, University of Cophenhagen, Germany, 1994.
[3] M. Sridharan, S.J. Fink, and R. Bodik. Thin slicing. In PLDI'07, 2007.
[4] F. Tip. A survey of program slicing techniques.Journal of Programming
Languages, 1995.
[5] M. Weiser. Program slicing. In Proceedings of the 5th International
Conference on Software Engineering, ICSE, pages 439{449. IEEE, 1981.

--
Mingliang LIU (刘明亮 in Chinese)

PACMAN Group,  Dept. of Computer Science & Technology
Tsinghua University, Beijing 100084, China
Email: liuml07 at mails.tsinghua.edu.cn
Homepage: http://pacman.cs.tsinghua.edu.cn/~liuml07/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130428/b04cbf23/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gsoc2013-proposal-program-slicing.pdf
Type: application/pdf
Size: 114427 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130428/b04cbf23/attachment.pdf>


More information about the llvm-dev mailing list