<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 5/14/12 9:57 PM, Sai Charan wrote:
<blockquote
cite="mid:CAJjy=iLehCati8zg4NJGybNSg+W_jGJ1yWTqSjMKE1pKJiVTsQ@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<font face="tahoma,sans-serif">In the interest of time &amp;
effort, I am leaning toward working at the LLVM IR level. </font>
<div><font face="tahoma, sans-serif"><br>
</font></div>
<div><font face="tahoma, sans-serif">The code listing in section
3.1 of the SoftBound paper is precisely what I am looking to
do. However, the listing is at the C source level, while
section 6 says that the implementation has been done on the
LLVM IR; I don't see how I can figure out pointer
de-references in LLVM IR. Every alloca/load/store is
via &lt;ty&gt;*.</font></div>
<div><font face="tahoma, sans-serif"><br>
</font></div>
<div><font face="tahoma, sans-serif">In summary, how do I figure
out pointer de-references in LLVM IR?</font></div>
</blockquote>
<br>
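Ignoring intrinsic functions, the LLVM IR instructions that
dereference pointers are chiefly load and store; in LLVM 3.0 and
later, the atomic instructions cmpxchg and atomicrmw also access
memory through a pointer operand.<br>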
<br>
The intrinsics that access memory via pointers should be pretty easy
to spot when you read through the LLVM Language Reference Manual:
for example, the atomic intrinsics and the memory intrinsics
(llvm.memcpy, llvm.memmove, llvm.memset).<br>
<br>
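As a rough illustration of the idea (a minimal sketch, not the SAFECode or
SoftBound code), the Python snippet below scans textual LLVM IR, such as
the output of llvm-dis, and reports the lines that access memory through a
pointer. The names find_memory_accesses, MEMORY_OPCODES, and
MEMORY_INTRINSICS are made up for this example, and the intrinsic list is
an assumption that is deliberately incomplete. A real pass would walk the
instructions with the LLVM C++ API rather than match text.<br>
<br>
<pre>
```python
import re

# Opcodes that read or write memory through a pointer operand.
MEMORY_OPCODES = ("load", "store", "cmpxchg", "atomicrmw")

# Intrinsic name prefixes that access memory via pointer arguments.
# Illustrative only; consult the Language Reference for the full set.
MEMORY_INTRINSICS = ("llvm.memcpy", "llvm.memmove", "llvm.memset")

# Optional "%result = " prefix, optional tail-call marker, then the opcode.
OPCODE_RE = re.compile(
    r"^\s*(?:%[-\w$.]+\s*=\s*)?(?:(?:tail|musttail|notail)\s+)?(\w+)\b"
)
# Name of a directly called function, e.g. "@llvm.memcpy.p0i8.p0i8.i64(".
CALLEE_RE = re.compile(r"@([\w.]+)\s*\(")


def find_memory_accesses(ir_text):
    """Return (line_number, kind) pairs for lines that dereference pointers."""
    hits = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        code = line.split(";", 1)[0]  # drop a trailing IR comment
        match = OPCODE_RE.match(code)
        if match is None:
            continue
        opcode = match.group(1)
        if opcode in MEMORY_OPCODES:
            hits.append((lineno, opcode))
        elif opcode in ("call", "invoke"):
            callee = CALLEE_RE.search(code)
            if callee and callee.group(1).startswith(MEMORY_INTRINSICS):
                hits.append((lineno, callee.group(1)))
    return hits
```
</pre>
Each returned pair, such as (3, "load"), marks a site where a run-time
check on the pointer operand could be inserted before the access.<br>
<br>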
You can see what SAFECode does by looking at the LoadStoreChecks.cpp
source code. You can probably find the equivalent code in the
SoftBound code, but I do not know myself where it is.<br>
<br>
-- John T.<br>
<br>
<br>
<blockquote
cite="mid:CAJjy=iLehCati8zg4NJGybNSg+W_jGJ1yWTqSjMKE1pKJiVTsQ@mail.gmail.com"
type="cite">
<div>
<div><font face="tahoma,sans-serif"><br clear="all">
</font><font face="tahoma, sans-serif">Sai Charan,</font>
<div><font face="tahoma, sans-serif">CSE, UC Riverside.</font></div>
<br>
<br>
<br>
<div class="gmail_quote">On Mon, May 14, 2012 at 7:23 PM, John
Criswell <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:criswell@illinois.edu" target="_blank">criswell@illinois.edu</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div class="h5"> On 5/14/12 8:11 PM, John McCall
wrote:
<blockquote type="cite">
<div>
<div>On May 14, 2012, at 5:59 PM, Sai Charan
wrote:</div>
<blockquote type="cite">
<div><font face="tahoma,sans-serif">I am
looking at using LLVM/Clang to
automatically convert pointer declarations
to fat pointers &amp; the corresponding
dereferences to something appropriate. I
am looking for guidance on doing this.
Would an LLVM pass be better suited to this,
or would this be better handled using
Clang? Any guidance on getting started
would be helpful.</font></div>
</blockquote>
<br>
</div>
<div>It would be best handled by modifying Clang,
both in semantic analysis (to change the size of
a pointer) and IR generation (to generate,
propagate, and consume your fat pointer values).
I'm afraid that clang's IR generation widely
assumes that pointers are represented as a
single llvm::Value, though, and you might be in
for a lot of work.</div>
</blockquote>
<br>
</div>
</div>
Converting to fat pointers can also be done at the LLVM
IR level and, in fact, there's a modern implementation
of fat pointers at the LLVM IR level in the SAFECode
project (<a moz-do-not-send="true"
href="http://sva.cs.illinois.edu" target="_blank">http://sva.cs.illinois.edu</a>).
The implementation is SoftBound from the University of
Pennsylvania, and it implements what is essentially a
fat pointer approach that does not modify data structure
layout. You can read about SoftBound at <a
moz-do-not-send="true"
href="http://www.cis.upenn.edu/acg/papers/pldi09_softbound.pdf"
target="_blank">http://www.cis.upenn.edu/acg/papers/pldi09_softbound.pdf</a>.<br>
<br>
One of the problems with implementing fat pointers
within clang is that clang does not have the entire
program, and so you cannot use whole program analysis to
determine if parts of the program are aware of the data
structure layout. An LLVM IR analysis that is part of
the link-time optimization framework can, and so a
transform at the LLVM IR level could determine when it
is safe to modify a data structure layout and when it is
not.<br>
<br>
All that said, if you're using a fat pointer method that
doesn't modify data structure layout (SoftBound has this
feature; Xu et al.'s work at <a moz-do-not-send="true"
href="http://seclab.cs.sunysb.edu/seclab/pubs/fse04.pdf" target="_blank">http://seclab.cs.sunysb.edu/seclab/pubs/fse04.pdf</a>
doesn't modify it either, IIRC), implementing it in Clang would
also work.<br>
<br>
As an FYI, I'm advocating for a common infrastructure in
LLVM for adding and optimizing memory safety run-time
checks; the idea is to have common infrastructure that
will work for fat pointer approaches, object
metadata approaches, and other approaches. You can find
my proposal at <a moz-do-not-send="true"
href="http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142532.html"
target="_blank">http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142532.html</a>.
I'd welcome any feedback or comments you may have on it.<br>
<br>
-- John T.<br>
<br>
<blockquote type="cite">
<div><br>
</div>
<div>John.</div>
<br>
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</body>
</html>