[LLVMdev] How to make sure llvm.memset intrinsic is not lowered into memset() call?

Tue Oct 1 04:39:09 PDT 2013

On Mon, Sep 30, 2013 at 8:33 PM, John Criswell <criswell at illinois.edu> wrote:

>  On 9/30/13 11:22 AM, Alexey Samsonov wrote:
>
>
>  On Mon, Sep 30, 2013 at 7:48 PM, John Criswell <criswell at illinois.edu> wrote:
>
>>   On 9/30/13 9:40 AM, Alexey Samsonov wrote:
>>
>> Hi llvmdev!
>>
>>  There are cases when we want our instrumentation passes for Sanitizer
>> tools to insert llvm.memset.* calls (basically, we want to mark a certain
>> region of user memory as (un)addressable by writing magic values to the
>> "shadow" of that memory region). llvm.memset calls are convenient:
>> (1) we don't have to manually emit all these n-byte stores in a loop.
>> (2) llvm.memset can be inlined as fast platform-specific instructions
>> (e.g. SSE).
>> But there will be a problem if llvm.memset is lowered into a regular
>> memset() call: sanitizer runtime libraries intercept all memset() calls and
>> treat them as function calls made by the user, in particular checking that
>> their arguments point to addressable "user" memory, not some
>> sanitizer-specific memory regions.
>>
>>  Can you suggest a way to ensure llvm.memset is not transformed into a
>> memset() function call? This intrinsic has an <isvolatile> argument, which
>> limits possible optimization of this call. Does it make sense to add yet
>> another argument that would forbid transforming it into a function call?
>>
>>
>>  Dumb question: why not run the ASan instrumentation passes first and
>> then run the pass that inserts the calls to llvm.memset()?
>>
>> Alternatively, why not put the llvm.memset and load/store instrumentation
>> into a single pass?  That way, the pass can determine which memsets it
>> added itself and which are ones from the original program that need
>> instrumentation.
>>
>
>  Sorry, I didn't understand your suggestions. Maybe I described the
> problem poorly. We need a way to teach CodeGen that some llvm.memset
> intrinsics (those added by the ASan instrumentation pass) can't be lowered
> into a memset() function call, and some (all the others) can.
> Otherwise the program would crash on an ASan-added memset() at runtime.
>
>
> Ah.  I think I see: you're not instrumenting memset(); you have a
> replacement memset() implementation in your run-time library.  As such, you
> don't want your calls to llvm.memset() to be changed into memset() because
> then they'll call your new implementation of memset().  Is that correct?
>

Yes.
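
To make the failure mode concrete, here is a minimal sketch of what an
intercepting memset() does with a shadow address. It is only a simulation:
the arena layout and the names g_arena, kShadowOffset, IsUserRange and
InterceptedMemset are hypothetical and are not the real sanitizer runtime.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout, for illustration only: one flat arena whose first
// half stands in for "user" memory and whose second half for shadow memory.
unsigned char g_arena[256];
const std::size_t kShadowOffset = 128;

// Hypothetical check: true if [p, p + n) lies entirely in the user half.
bool IsUserRange(const void *p, std::size_t n) {
  const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(g_arena);
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  return addr >= begin && n <= kShadowOffset &&
         addr - begin <= kShadowOffset - n;
}

// Stand-in for an intercepted memset(): it treats every call as a user call
// and refuses destinations outside user memory. A real runtime would report
// an error at this point; the sketch just returns false.
bool InterceptedMemset(void *dst, int value, std::size_t n) {
  if (!IsUserRange(dst, n)) return false;
  std::memset(dst, value, n);
  return true;
}
```

A call such as InterceptedMemset(g_arena, 0xAB, 16) goes through, while
InterceptedMemset(g_arena + kShadowOffset, 0xF1, 16) is rejected. The second
case is what happens when an ASan-inserted llvm.memset on shadow memory ends
up as a plain memset() call.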

>
> I figured my question was dumb; I just didn't know why.
> :)
>
> Assuming my understanding of the situation is correct, I don't really have
> a good answer for you.  You could try using vector stores instead of
> llvm.memset() and see if the optimizers/code generators don't change that
> into memset().
>

This seems fragile, as you point out later.
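
For illustration, a source-level analogue of the "emit the stores by hand"
approach might look like the sketch below. The helper name PoisonShadow is
hypothetical. The stores are volatile because, as far as I know, a plain
store loop can be recognized by the optimizer and turned back into a memset,
while volatile accesses are left alone. The cost is that such a loop gives up
the fast platform-specific lowering that makes llvm.memset attractive.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: write `magic` into n shadow bytes one store at a
// time. The volatile qualifier is an assumption-driven workaround meant to
// keep the loop from being rewritten into a memset() call.
void PoisonShadow(std::uint8_t *shadow, std::size_t n, std::uint8_t magic) {
  volatile std::uint8_t *p = shadow;
  for (std::size_t i = 0; i < n; ++i) {
    p[i] = magic;
  }
}
```

Calling PoisonShadow(buf, 5, 0xF1) on a zeroed buffer sets the first five
bytes to 0xF1 and leaves the rest untouched.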

>   If you can be more intrusive, you could add an attribute to
> llvm.memset() that tells the code generator not to change it to memset().
> However, I don't have an idea of how to do it without changing LLVM and
> without doing something that might break in the future.
>

I'm OK with changing LLVM, I just wonder what the best strategy is here: is
it a magic llvm.memset-specific function attribute, or something more
visible and intrusive like an additional argument? I would be happy to find
other alternatives, but I don't see any at the moment...

>
> -- John T.
>
>
>
>
>
>>
>> -- John T.
>>
>>
>>  --
>> Alexey Samsonov, MSK
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>
>
>  --
> Alexey Samsonov, MSK
>
>
>

-- 
Alexey Samsonov, MSK