[LLVMdev] How to make sure llvm.memset intrinsic is not lowered into memset() call?

John Criswell criswell at illinois.edu
Mon Sep 30 09:33:03 PDT 2013


On 9/30/13 11:22 AM, Alexey Samsonov wrote:
>
> On Mon, Sep 30, 2013 at 7:48 PM, John Criswell <criswell at illinois.edu> wrote:
>
>     On 9/30/13 9:40 AM, Alexey Samsonov wrote:
>>     Hi llvmdev!
>>
>>     There are cases when we want our instrumentation passes for
>>     Sanitizer tools to insert llvm.memset.* calls (basically, we want
>>     to mark a certain region of user memory as (un)addressable by
>>     writing magic values to the "shadow" of that memory region).
>>     llvm.memset calls are convenient:
>>     (1) we don't have to manually emit all these n-byte stores in a
>>     loop.
>>     (2) llvm.memset can be inlined as fast platform-specific
>>     instructions (e.g. SSE).
>>     But there will be a problem if llvm.memset is lowered into a
>>     regular memset() call: sanitizer runtime libraries intercept all
>>     memset() calls and treat them as function calls made by the user,
>>     in particular checking that their arguments point to addressable
>>     "user" memory, not some sanitizer-specific memory regions.
>>
>>     Can you suggest a way to ensure llvm.memset is not transformed
>>     into a memset() function call? This intrinsic has an <isvolatile>
>>     argument, which limits possible optimizations of this call. Does
>>     it make sense to add yet another argument that would forbid
>>     transforming it into a function call?
>
>     Dumb question: why not run the ASan instrumentation passes first
>     and then run the pass that inserts the calls to llvm.memset()?
>
>     Alternatively, why not put the llvm.memset and load/store
>     instrumentation into a single pass?  That way, the pass can
>     determine which memsets it added itself and which are ones from
>     the original program that need instrumentation.
>
>
> Sorry, I didn't understand your suggestions. Maybe I described the
> problem poorly. We need a way to teach CodeGen that some llvm.memset
> intrinsics (those added by the ASan instrumentation pass) can't be
> lowered into a memset() function call, while all the others can.
> Otherwise the program would crash at runtime on an ASan-added memset()
> call.

Ah.  I think I see: you're not instrumenting memset(); you have a 
replacement memset() implementation in your run-time library.  As such, 
you don't want your calls to llvm.memset() to be changed into memset() 
because then they'll call your new implementation of memset().  Is that 
correct?

I figured my question was dumb; I just didn't know why.
:)
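
To make the failure mode concrete, here is a toy C sketch of an
intercepting memset() that refuses to touch a reserved region.  Everything
in it is an assumption made for illustration: the names fake_shadow,
range_touches_shadow and checked_memset are hypothetical, the 64-byte
buffer merely stands in for real shadow memory, and returning NULL stands
in for whatever error report a real sanitizer runtime would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the sanitizer's private shadow region. */
uint8_t fake_shadow[64];

/* Returns 1 if the byte range [p, p + n) overlaps the stand-in region. */
int range_touches_shadow(const void *p, size_t n)
{
    uintptr_t lo = (uintptr_t)p;
    uintptr_t hi = lo + n;
    uintptr_t s_lo = (uintptr_t)fake_shadow;
    uintptr_t s_hi = s_lo + sizeof fake_shadow;
    return n != 0 && lo < s_hi && s_lo < hi;
}

/* Toy interceptor: treats every call as coming from user code, so a
   write aimed at the shadow region is rejected instead of performed. */
void *checked_memset(void *dst, int value, size_t n)
{
    assert(dst != NULL || n == 0);
    if (range_touches_shadow(dst, n))
        return NULL;               /* a real runtime would report an error */
    return memset(dst, value, n);  /* ordinary user memory: just do it */
}
```

If an instrumentation-inserted llvm.memset on shadow memory were lowered
to a call that lands in an interceptor like this one, it would take the
rejecting branch, which is the situation described above.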

Assuming my understanding of the situation is correct, I don't really
have a good answer for you.  You could try using vector stores instead
of llvm.memset() and see whether the optimizers/code generators leave
them alone rather than turning them into memset().  If you can be more
intrusive, you could add an attribute to llvm.memset() that tells the
code generator not to change it to memset().  However, I don't know how
to do it without changing LLVM and without doing something that might
break in the future.
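
A minimal sketch of the explicit-store idea, written in C rather than at
the IR level.  Note that it swaps in volatile 64-bit scalar stores rather
than vector stores: the volatile qualifier should keep the compiler from
recognizing the loop as a memset idiom, at the cost of also giving up
vectorization.  The helper name poison_shadow_words is hypothetical and
not part of any sanitizer runtime, and the sketch assumes the shadow
region is 8-byte aligned and a whole number of words long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill `count` 64-bit shadow words with a repeated
   magic byte using explicit stores instead of a memset() call. */
void poison_shadow_words(volatile uint64_t *shadow, size_t count,
                         uint8_t magic)
{
    assert(shadow != NULL || count == 0);

    /* Replicate the magic byte into all eight bytes of one word. */
    uint64_t pattern = (uint64_t)magic * UINT64_C(0x0101010101010101);

    /* Each volatile store must be emitted as written, so this loop is
       not a candidate for being rewritten into a library call. */
    for (size_t i = 0; i < count; ++i)
        shadow[i] = pattern;
}
```

For example, calling poison_shadow_words(shadow, 3, 0xF1) on a zeroed
four-word array sets the first three words to 0xF1F1F1F1F1F1F1F1 and
leaves the fourth untouched.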

-- John T.


