[LLVMdev] How to make sure llvm.memset intrinsic is not lowered into memset() call?
Alexey Samsonov
samsonov at google.com
Tue Oct 1 04:39:09 PDT 2013
On Mon, Sep 30, 2013 at 8:33 PM, John Criswell <criswell at illinois.edu> wrote:
> On 9/30/13 11:22 AM, Alexey Samsonov wrote:
>
>
> On Mon, Sep 30, 2013 at 7:48 PM, John Criswell <criswell at illinois.edu> wrote:
>
>> On 9/30/13 9:40 AM, Alexey Samsonov wrote:
>>
>> Hi llvmdev!
>>
>> There are cases when we want our instrumentation passes for Sanitizer
>> tools to insert llvm.memset.* calls (basically, we want to mark a certain
>> region of user memory as (un)addressable by writing magic values to the
>> "shadow" of that memory region). llvm.memset calls are convenient:
>> (1) we don't have to manually emit all these n-byte stores in a loop.
>> (2) llvm.memset can be inlined as fast platform-specific instructions
>> (e.g. SSE).
>> But there will be a problem if llvm.memset is lowered into a regular
>> memset() call: sanitizer runtime libraries intercept all memset() calls and
>> treat them as function calls made by the user, in particular checking that
>> their arguments point to addressable "user" memory, not some
>> sanitizer-specific memory regions.
>>
>> Can you suggest a way to ensure llvm.memset is not transformed into
>> a memset() function call? This intrinsic has an <isvolatile> argument, which
>> limits possible optimization of this call. Does it make sense to add yet
>> another argument that would forbid transforming it into a function call?
>>
>>
>> Dumb question: why not run the ASan instrumentation passes first and
>> then run the pass that inserts the calls to llvm.memset()?
>>
>> Alternatively, why not put the llvm.memset and load/store instrumentation
>> into a single pass? That way, the pass can determine which memsets it
>> added itself and which are ones from the original program that need
>> instrumentation.
>>
>
> Sorry, I didn't understand your suggestions. Maybe I described the
> problem poorly. We need a way to teach CodeGen that some llvm.memset
> intrinsics can't be lowered into a memset() function call (those that were
> added by the ASan instrumentation pass), and some can (all the others).
> Otherwise the program would crash on an ASan-added memset() at runtime.
>
>
> Ah. I think I see: you're not instrumenting memset(); you have a
> replacement memset() implementation in your run-time library. As such, you
> don't want your calls to llvm.memset() to be changed into memset() because
> then they'll call your new implementation of memset(). Is that correct?
>
Yes.
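To make the failure mode concrete, here is a minimal sketch in plain C. It is
not the actual sanitizer runtime: user_mem, shadow_mem, is_user_range and
intercepted_memset are hypothetical names invented for illustration, and the
"report" is just a return code. It only shows why a memset() call aimed at
shadow memory would be rejected by an interceptor that expects user addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: one "user" region and one sanitizer-owned shadow region. */
enum { USER_SIZE = 64, SHADOW_SIZE = 8 };
static uint8_t user_mem[USER_SIZE];
static uint8_t shadow_mem[SHADOW_SIZE];

/* Returns 1 if [p, p + n) lies entirely inside user_mem, 0 otherwise. */
static int is_user_range(const void *p, size_t n) {
    uintptr_t lo = (uintptr_t)user_mem;
    uintptr_t a = (uintptr_t)p;
    return n <= USER_SIZE && a >= lo && a - lo <= USER_SIZE - n;
}

/* Stand-in for an intercepted memset(): it validates the destination first.
 * Returns 0 on success and -1 where a real runtime would report an error. */
static int intercepted_memset(void *dst, int value, size_t n) {
    assert(dst != NULL);
    if (!is_user_range(dst, n))
        return -1; /* destination is not addressable user memory */
    memset(dst, value, n);
    return 0;
}
```

With this sketch, a call on user_mem succeeds, while the same call on
shadow_mem is rejected. That is what would happen to an instrumentation-added
llvm.memset if it were lowered into a call to the intercepted memset().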
>
> I figured my question was dumb; I just didn't know why.
> :)
>
> Assuming my understanding of the situation is correct, I don't really have
> a good answer for you. You could try using vector stores instead of
> llvm.memset() and see if the optimizers/code generators don't change that
> into memset().
>
This seems fragile, as you point out later.
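For comparison, here is a source-level sketch of the "explicit stores"
alternative, written in C rather than as IR. fill_shadow and the magic value
are hypothetical. The stores go through a volatile pointer, which, as far as I
know, keeps a compiler from folding the loop back into a memset() call, but it
also gives up the fast wide stores that made llvm.memset attractive in the
first place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill n bytes of shadow with a magic value, one byte store at a time.
 * The volatile qualifier requires every store to be performed as written,
 * so the loop is not expected to be rewritten into a memset() call. */
static void fill_shadow(volatile uint8_t *shadow, size_t n, uint8_t magic) {
    assert(shadow != NULL);
    for (size_t i = 0; i < n; ++i)
        shadow[i] = magic;
}
```

A call such as fill_shadow(shadow_ptr, 8, 0xf1) writes exactly eight bytes and
leaves the rest untouched. The value 0xf1 is only an example magic byte here.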
> If you can be more intrusive, you could add an attribute to
> llvm.memset() that tells the code generator not to change it to memset().
> However, I don't have an idea of how to do it without changing LLVM and
> without doing something that might break in the future.
>
I'm OK with changing LLVM, I just wonder what the best strategy is here: is
it a magic llvm.memset-specific function attribute, or something more
visible and intrusive like an additional argument? I would be happy to find
other alternatives, but I don't see any at the moment...
>
> -- John T.
--
Alexey Samsonov, MSK