<div dir="ltr">On Mon, Sep 30, 2013 at 8:33 PM, John Criswell <span dir="ltr">&lt;<a href="mailto:criswell@illinois.edu" target="_blank">criswell@illinois.edu</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div class="im">

    <div>On 9/30/13 11:22 AM, Alexey Samsonov

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra">

          <div class="gmail_quote">On Mon, Sep 30, 2013 at 7:48 PM, John

            Criswell <span dir="ltr">&lt;<a href="mailto:criswell@illinois.edu" target="_blank">criswell@illinois.edu</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">

                <div>

                  <div>

                    <div>On 9/30/13 9:40 AM, Alexey Samsonov wrote:<br>

                    </div>

                    <blockquote type="cite">

                      <div dir="ltr">Hi llvmdev!

                        <div><br>

                        </div>

                        <div>There are cases when we want our
                          instrumentation passes for Sanitizer tools to
                          insert llvm.memset.* calls (basically, we want
                          to mark a certain region of user memory as
                          (un)addressable by writing magic values to the
                          "shadow" of that memory region). llvm.memset
                          calls are convenient:</div>

                        <div>(1) we don't have to manually emit all
                          these n-byte stores in a loop.</div>

                        <div>(2) llvm.memset can be inlined as
                          platform-specific fast instructions (e.g.
                          SSE).</div>

                        <div>But there will be a problem if llvm.memset
                          is lowered into a regular memset() call:
                          sanitizer runtime libraries intercept all
                          memset() calls and treat them as function
                          calls made by the user, in particular checking
                          that their arguments point to addressable
                          "user" memory, not to sanitizer-specific
                          memory regions.</div>

                        <div><br>

                        </div>

                        <div>Can you suggest a way to ensure that
                          llvm.memset is not transformed into a call to
                          the memset() function? This intrinsic has an
                          &lt;isvolatile&gt; argument, which limits
                          possible optimization of this call. Does it
                          make sense to add yet another argument that
                          would forbid transforming it into a function
                          call?</div>

                      </div>

                    </blockquote>

                    <br>

                  </div>

                </div>

                Dumb question: why not run the ASan instrumentation

                passes first and then run the pass that inserts the

                calls to llvm.memset()?<br>

                <br>

                Alternatively, why not put the llvm.memset and

                load/store instrumentation into a single pass?  That

                way, the pass can determine which memsets it added

                itself and which are ones from the original program that

                need instrumentation.<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>Sorry, I didn't understand your suggestions. Maybe I
              described the problem poorly. We need a way to teach
              CodeGen that some llvm.memset intrinsics can't be lowered
              into a memset() function call (those that were added by
              the ASan instrumentation pass), and some can (all the
              others). Otherwise the program would crash on an
              ASan-added memset() at runtime.</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br></div>

    Ah.  I think I see: you're not instrumenting memset(); you have a

    replacement memset() implementation in your run-time library.  As

    such, you don't want your calls to llvm.memset() to be changed into

    memset() because then they'll call your new implementation of

    memset().  Is that correct?<br></div></blockquote><div><br></div><div>Yes.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    I figured my question was dumb; I just didn't know why.<br>

    :)<br>

    <br>

    Assuming my understanding of the situation is correct, I don't

    really have a good answer for you.  You could try using vector

    stores instead of llvm.memset() and see if the optimizers/code

    generators don't change that into memset().</div></blockquote><div><br></div><div>This seems fragile, as you point out later.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div bgcolor="#FFFFFF" text="#000000">  If you can be more

    intrusive, you could add an attribute to llvm.memset() that tells

    the code generator not to change it to memset().  However, I don't

    have an idea of how to do it without changing LLVM and without doing

    something that might break in the future.<br></div></blockquote><div><br></div><div>I'm OK with changing LLVM; I just wonder what the best strategy is here: a magic llvm.memset-specific function attribute, or something more visible and intrusive like an additional argument. I would be happy to find other alternatives, but I don't see any at the moment...</div>
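<div><br></div><div>For reference, the manual-store fallback from point (1) of the original message can be sketched in plain C roughly as below. This is only an illustration under stated assumptions: poison_shadow is a hypothetical helper name, not part of any sanitizer runtime, and the sketch relies on the observation that compilers such as Clang and GCC do not, in practice, fold volatile stores into a memset() library call.</div>

```c
/* Hypothetical helper for illustration only; not a real sanitizer API.
 * Writes `magic` into `size` bytes of shadow memory, one byte at a time. */
void poison_shadow(volatile unsigned char *shadow,
                   unsigned long size,
                   unsigned char magic) {
    /* Assumption: because every store goes through a volatile pointer,
     * the optimizer is expected to emit each store as written instead of
     * recognizing the loop as a memset() idiom and calling the
     * (intercepted) library function. */
    for (unsigned long i = 0; i != size; ++i)
        shadow[i] = magic;
}
```

<div>The trade-off is that byte-wise volatile stores are unlikely to be vectorized, so this gives up benefit (2) of llvm.memset, which is part of why a dedicated attribute or argument looks more attractive.</div>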

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    -- John T.<div class="im"><br>

    <br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> <br>

                -- John T.<span><font color="#888888"><br>

                    <br>

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div>

                          <div><br>

                          </div>

                          -- <br>

                          <div>Alexey Samsonov, MSK</div>

                        </div>

                      </div>

                      <br>


                    </blockquote>

                    <br>

                  </font></span></div>

            </blockquote>

          </div>

          <br>

          <br clear="all">

          <div><br>

          </div>

          -- <br>

          <div>Alexey Samsonov, MSK</div>

        </div>

      </div>

    </blockquote>

    <br>

  </div></div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div>Alexey Samsonov, MSK</div>

</div></div>