<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 5/14/12 9:57 PM, Sai Charan wrote:

    <blockquote

cite="mid:CAJjy=iLehCati8zg4NJGybNSg+W_jGJ1yWTqSjMKE1pKJiVTsQ@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html;

        charset=ISO-8859-1">

      <font face="tahoma,sans-serif">In the interest of time &amp;

        effort, I am leaning toward working at the LLVM IR level. </font>

      <div><font face="tahoma, sans-serif"><br>

        </font></div>

      <div><font face="tahoma, sans-serif">The code listing in section

          3.1 of the SoftBound paper is precisely what I am looking to

          do. However, the listing is at the C source level, while

          section 6 says that the implementation has been done on the

          LLVM IR; I don't see how I can figure out pointer

          de-references in LLVM IR. Every alloca/load/store is

          via &lt;ty&gt;*.</font></div>

      <div><font face="tahoma, sans-serif"><br>

        </font></div>

      <div><font face="tahoma, sans-serif">In summary, how do I figure

          out pointer de-references in LLVM IR?</font></div>

    </blockquote>

    <br>

    Ignoring intrinsic functions, the only LLVM IR instructions that

    dereference pointers are load and store.<br>

    <br>

    The intrinsics that access memory via pointers should be pretty easy

    to spot when you read through the LLVM Language Reference Manual:

    things like the atomic intrinsics, the memory-manipulating

    intrinsics (llvm.memcpy, llvm.memmove, llvm.memset), etc.<br>
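    <br>

    As a minimal sketch of the rule above (not SAFECode's or SoftBound's actual code), the following Python scans textual LLVM IR and reports the lines whose opcode accesses memory. The names MEMORY_OPCODES, find_dereferences, and SAMPLE_IR are hypothetical, the sample uses typed-pointer IR syntax, and treating cmpxchg and atomicrmw as memory-accessing opcodes is an assumption that goes beyond what is stated above. A real pass would walk the instructions through the LLVM C++ API rather than match text.<br>

    <pre>
```python
import re

# Opcodes assumed to read or write memory through a pointer operand.
# load and store come from the text above; cmpxchg and atomicrmw are an
# added assumption for this sketch.
MEMORY_OPCODES = ("load", "store", "cmpxchg", "atomicrmw")


def find_dereferences(ir_text):
    """Return (line_number, opcode) pairs for memory-accessing lines."""
    hits = []
    for number, line in enumerate(ir_text.splitlines(), start=1):
        code = line.split(";", 1)[0]  # drop trailing IR comments
        # An instruction looks like "%x = opcode ..." or "opcode ...".
        match = re.match(r"\s*(?:%[\w.]+\s*=\s*)?(\w+)", code)
        if match and match.group(1) in MEMORY_OPCODES:
            hits.append((number, match.group(1)))
    return hits


# Hypothetical IR for "int get(int *p) { return *p; }" without optimization.
SAMPLE_IR = """
define i32 @get(i32* %p) {
entry:
  %p.addr = alloca i32*
  store i32* %p, i32** %p.addr
  %0 = load i32** %p.addr
  %1 = load i32* %0          ; the actual dereference of p
  ret i32 %1
}
"""

for number, opcode in find_dereferences(SAMPLE_IR):
    print(number, opcode)
```
</pre>

    On the sample, only the store and the two loads are reported; the alloca merely creates a stack slot and does not dereference anything.<br>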

    <br>

    You can see what SAFECode does by looking at the LoadStoreChecks.cpp

    source code.  You can probably find the equivalent code in the

    SoftBound code, but I do not know myself where it is.<br>

    <br>

    -- John T.<br>

    <br>

    <br>

    <blockquote

cite="mid:CAJjy=iLehCati8zg4NJGybNSg+W_jGJ1yWTqSjMKE1pKJiVTsQ@mail.gmail.com"

      type="cite">

      <div>

        <div><font face="tahoma,sans-serif"><br clear="all">

          </font><font face="tahoma, sans-serif">Sai Charan,</font>

          <div><font face="tahoma, sans-serif">CSE, UC Riverside.</font></div>

          <br>

          <br>

          <br>

          <div class="gmail_quote">On Mon, May 14, 2012 at 7:23 PM, John

            Criswell <span dir="ltr">&lt;<a moz-do-not-send="true"

                href="mailto:criswell@illinois.edu" target="_blank">criswell@illinois.edu</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">

                <div>

                  <div class="h5"> On 5/14/12 8:11 PM, John McCall

                    wrote:

                    <blockquote type="cite">

                      <div>

                        <div>On May 14, 2012, at 5:59 PM, Sai Charan

                          wrote:</div>

                        <blockquote type="cite">

                          <div><font face="tahoma,sans-serif">I am

                              looking at using LLVM/Clang to

                              automatically convert pointer declarations

                              to fat pointers &amp; the corresponding

                              dereferences to something appropriate. I

                              am looking for guidance on doing this.

                              Will an LLVM pass be better suited to this

                              or would this be better handled using

                              Clang? Any guidance on getting started

                              would be helpful.</font></div>

                        </blockquote>

                        <br>

                      </div>

                      <div>It would be best handled by modifying Clang,

                        both in semantic analysis (to change the size of

                        a pointer) and IR generation (to generate,

                        propagate, and consume your fat pointer values).

                         I'm afraid that clang's IR generation widely

                        assumes that pointers are represented as a

                        single llvm::Value, though, and you might be in

                        for a lot of work.</div>

                    </blockquote>

                    <br>

                  </div>

                </div>

                Converting to fat pointers can also be done at the LLVM

                IR level and, in fact, there's a modern implementation

                of fat pointers at the LLVM IR level in the SAFECode

                project (<a moz-do-not-send="true"

                  href="http://sva.cs.illinois.edu" target="_blank">http://sva.cs.illinois.edu</a>). 

                The implementation is SoftBound from the University of

                Pennsylvania, and it implements what is essentially a

                fat pointer approach that does not modify data structure

                layout.  You can read about SoftBound at <a

                  moz-do-not-send="true"

                  href="http://www.cis.upenn.edu/acg/papers/pldi09_softbound.pdf"

                  target="_blank">http://www.cis.upenn.edu/acg/papers/pldi09_softbound.pdf</a>.<br>
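                <br>

                The idea of keeping bounds metadata apart from the pointer, so that data structure layout is unchanged, can be sketched in a toy Python model. This is not SoftBound's implementation: the CheckedMemory class, its methods, and keying the metadata table by pointer name are all hypothetical simplifications chosen for illustration.<br>

                <pre>
```python
class CheckedMemory:
    """Toy flat memory whose bounds metadata lives in a separate table."""

    def __init__(self, size):
        self.cells = [0] * size
        self.next_free = 0
        # Disjoint metadata: pointer name -> (base, bound). Pointers stay
        # plain integer addresses, so stored data keeps its layout.
        # Keying by name is an assumption made for this sketch.
        self.bounds = {}

    def malloc(self, name, count):
        """Allocate count cells and record their bounds for pointer name."""
        if self.next_free + count > len(self.cells):
            raise MemoryError("toy heap exhausted")
        base = self.next_free
        self.next_free += count
        self.bounds[name] = (base, base + count)
        return base

    def derive(self, new_name, old_name):
        """A copied or offset pointer inherits the metadata of its source."""
        self.bounds[new_name] = self.bounds[old_name]

    def _check(self, name, addr):
        # The check inserted before every load and store.
        base, bound = self.bounds[name]
        if base > addr or addr >= bound:
            raise IndexError("out-of-bounds access through " + name)

    def load(self, name, addr):
        self._check(name, addr)
        return self.cells[addr]

    def store(self, name, addr, value):
        self._check(name, addr)
        self.cells[addr] = value
```
</pre>

                For example, after p = mem.malloc("p", 4), a store to p + 3 succeeds, while a load from p + 4 raises IndexError, even through a pointer derived from p.<br>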

                <br>

                One of the problems with implementing fat pointers

                within clang is that clang does not have the entire

                program, and so you cannot use whole program analysis to

                determine if parts of the program are aware of the data

                structure layout.  An LLVM IR analysis that is part of

                the link-time optimization framework can, and so a

                transform at the LLVM IR level could determine when it

                is safe to modify a data structure layout and when it is

                not.<br>

                <br>

                All that said, if you're using a fat pointer method that

                doesn't modify data structure layout (SoftBound has this

                feature; Xu et al.'s work at <a moz-do-not-send="true"

href="http://seclab.cs.sunysb.edu/seclab/pubs/fse04.pdf" target="_blank">http://seclab.cs.sunysb.edu/seclab/pubs/fse04.pdf</a>

                doesn't either, IIRC), implementing it in Clang would

                also work.<br>

                <br>

                As an FYI, I'm advocating for a common infrastructure in

                LLVM for adding and optimizing memory safety run-time

                checks; the idea is to have common infrastructure that

                will work for fat pointer approaches, object

                metadata approaches, and other approaches.  You can find

                my proposal at <a moz-do-not-send="true"

href="http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142532.html"

                  target="_blank">http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142532.html</a>. 

                I'd welcome any feedback or comments you may have on it.<br>

                <br>

                -- John T.<br>

                <br>

                <blockquote type="cite">

                  <div><br>

                  </div>

                  <div>John.</div>

                  <br>

                  <br>

                  <fieldset></fieldset>

                  <br>

                  <pre>_______________________________________________

cfe-dev mailing list

<a moz-do-not-send="true" href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a>

<a moz-do-not-send="true" href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a>

</pre>

                </blockquote>

                <br>

              </div>

            </blockquote>

          </div>

          <br>

        </div>

      </div>

    </blockquote>

    <br>

  </body>

</html>