[LLVMdev] static taint analysis in LLVM

Tue Jul 21 07:13:34 PDT 2015

Well, if you're working with compiler infrastructure and you're not
familiar with how to build a custom pass, just know that compilers are
extremely sophisticated and very difficult. I'm not trying to discourage
you, though. Do you think it would be more suitable to limit your thesis
to a subtopic, such as building just the shadow memory runtime/library?
That would certainly be a worthwhile endeavor. Lots of different tools do
taint analysis, so a single, high-quality reusable component would be very
attractive. You could reuse it with PANDA, Intel Pin, a compiler-based
instrumentation pass, or DynamoRIO. It could be configurable and offer
different modes of operation, such as asynchronous or blocking; different
shadow memory representations, such as offset-range encodings or a simple
bitmap; and different identity schemes. That would be valuable and
realizable within your constraints. There are existing taint analysis
mechanisms that aren't static, but you might be able to learn from those.
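
As a rough illustration of what "different shadow memory representations"
behind one reusable component could look like, here is a minimal C++
sketch. It is not taken from PANDA, Pin, DynamoRIO, or LLVM; the names
ShadowMemory, BitmapShadow, and RangeShadow are hypothetical, and it
tracks only a single taint bit per byte (no identity scheme, no
asynchronous mode).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical common interface: callers mark and query byte ranges without
// knowing which representation backs the shadow state.
class ShadowMemory {
public:
  virtual ~ShadowMemory() = default;
  virtual void setTaint(std::uintptr_t addr, std::size_t len, bool tainted) = 0;
  // True if any byte in [addr, addr + len) is tainted.
  virtual bool isTainted(std::uintptr_t addr, std::size_t len) const = 0;
};

// Representation 1: one flag per byte of a fixed window [base, base + size).
// Cost is proportional to the window size; bytes outside it are ignored.
class BitmapShadow : public ShadowMemory {
  std::uintptr_t base_;
  std::vector<bool> bits_;
  bool inWindow(std::uintptr_t a) const {
    return a >= base_ && a - base_ < bits_.size();
  }
public:
  BitmapShadow(std::uintptr_t base, std::size_t size)
      : base_(base), bits_(size, false) {}
  void setTaint(std::uintptr_t addr, std::size_t len, bool tainted) override {
    for (std::size_t i = 0; i < len; ++i)
      if (inWindow(addr + i)) bits_[addr + i - base_] = tainted;
  }
  bool isTainted(std::uintptr_t addr, std::size_t len) const override {
    for (std::size_t i = 0; i < len; ++i)
      if (inWindow(addr + i) && bits_[addr + i - base_]) return true;
    return false;
  }
};

// Representation 2: disjoint half-open tainted ranges, keyed by start address.
// Cost is proportional to the number of ranges, not the address space.
// Adjacent ranges are not coalesced in this sketch.
class RangeShadow : public ShadowMemory {
  std::map<std::uintptr_t, std::uintptr_t> ranges_;  // start -> end
public:
  void setTaint(std::uintptr_t addr, std::size_t len, bool tainted) override {
    if (len == 0) return;
    std::uintptr_t lo = addr, hi = addr + len;
    assert(hi > lo && "address range must not wrap around");
    // The only range starting before lo that can overlap is the previous one.
    auto it = ranges_.lower_bound(lo);
    if (it != ranges_.begin() && std::prev(it)->second > lo) --it;
    std::uintptr_t leftStart = lo, rightEnd = hi;
    while (it != ranges_.end() && it->first < hi) {
      if (it->first < lo) leftStart = it->first;   // part sticking out left
      if (it->second > hi) rightEnd = it->second;  // part sticking out right
      it = ranges_.erase(it);
    }
    if (tainted) {
      ranges_[leftStart] = rightEnd;  // merge with everything it overlapped
    } else {
      if (leftStart < lo) ranges_[leftStart] = lo;  // keep left remainder
      if (rightEnd > hi) ranges_[hi] = rightEnd;    // keep right remainder
    }
  }
  bool isTainted(std::uintptr_t addr, std::size_t len) const override {
    if (len == 0) return false;
    std::uintptr_t lo = addr, hi = addr + len;
    auto it = ranges_.upper_bound(lo);  // first range starting after lo
    if (it != ranges_.begin() && std::prev(it)->second > lo) return true;
    return it != ranges_.end() && it->first < hi;
  }
};
```

An instrumentation front end would hold a ShadowMemory pointer and call
setTaint/isTainted around loads and stores, so the representation can be
swapped without touching the instrumentation itself.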

The actual mechanism for deciding how taint propagates is really hard to
nail down, though. It's usually some tradeoff of performance vs. precision,
but in general, determining on the fly whether a clobber has occurred
would be hard, since it can be determined by a function of the value local
to the operation you're trying to reason about. There was some research on
dynamic flow analysis that's pertinent to this, but if you're struggling
to meet a deadline and also working on picking up the compiler
infrastructure, this would definitely overload you.
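
A minimal sketch of that performance-vs-precision tradeoff, assuming a toy
three-address instruction set rather than real LLVM IR. Every name here
(Op, Operand, Inst, propagateFast, propagatePrecise, run) is hypothetical.
The fast policy taints a result whenever any operand is tainted; the more
precise policy also recognizes two cases where the result is a fixed
constant regardless of the tainted input, so the taint is clobbered.

```cpp
#include <map>
#include <string>
#include <vector>

enum class Op { Copy, Add, And, Xor };

// An operand is either a named variable or an integer constant.
struct Operand {
  bool isConst;
  std::string var;  // used only when isConst is false
  long value;       // used only when isConst is true
};
Operand var(const std::string &name) { return Operand{false, name, 0}; }
Operand cst(long v) { return Operand{true, "", v}; }

// dst = a op b (b is ignored for Copy).
struct Inst {
  std::string dst;
  Op op;
  Operand a, b;
};

using TaintState = std::map<std::string, bool>;

static bool tainted(const TaintState &s, const Operand &o) {
  if (o.isConst) return false;  // constants are never tainted
  auto it = s.find(o.var);
  return it != s.end() && it->second;
}

// Fast policy: the result is tainted if any operand is tainted.
bool propagateFast(const TaintState &s, const Inst &i) {
  if (i.op == Op::Copy) return tainted(s, i.a);
  return tainted(s, i.a) || tainted(s, i.b);
}

// More precise policy: look at the operation and operand values first.
bool propagatePrecise(const TaintState &s, const Inst &i) {
  // x & 0 is always 0, so the tainted input cannot influence the result.
  if (i.op == Op::And && ((i.a.isConst && i.a.value == 0) ||
                          (i.b.isConst && i.b.value == 0)))
    return false;
  // x ^ x is always 0 for the same reason.
  if (i.op == Op::Xor && !i.a.isConst && !i.b.isConst && i.a.var == i.b.var)
    return false;
  return propagateFast(s, i);
}

// Apply a policy to straight-line code (no branches or loops in this sketch).
template <typename Policy>
TaintState run(const std::vector<Inst> &prog, TaintState s, Policy policy) {
  for (const Inst &i : prog) s[i.dst] = policy(s, i);
  return s;
}
```

With x tainted, the program y = x & 0; z = x ^ x; w = x + 1 leaves y, z,
and w all tainted under propagateFast, but only w tainted under
propagatePrecise. The precise policy costs extra checks per instruction,
and a real analysis would need many more such rules plus control flow.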

On Tue, Jul 21, 2015 at 4:38 AM, Q Z <zhaoqian301 at gmail.com> wrote:

> Thank you very much for your nice reply.
> I have read some parts of the LLVM documentation, but not all of it.
> However, I don't think I have time to read more, because I must complete
> my work in about 40 days.
>
> I want to write a simple checker, using LLVM, that checks an OS (written
> in C) to determine whether it has buffer overflow (or other)
> vulnerabilities. I want to write it as an LLVM pass, and I think a static
> taint analysis technique can solve the problem.
>
> Because my time is limited, I need a static taint analysis example to
> imitate and improve. I hope the example uses LLVM, and it must be well
> annotated, because coding is difficult for me.
>
> I have found an example: https://github.com/thinkmoore/llvm-deps.
> However, it has few annotations (I have sent an email to the author, but
> I haven't received a reply). I have implemented the "sourcesinkanalysis"
> part as an LLVM pass by myself, but the other parts are difficult for me
> without annotations.
> So if you have a better example, or some better suggestions for my work,
> please tell me. Thank you very much!
>
> best wishes ,
> zhaoqian
>
> 2015-07-17 22:07 GMT+08:00 Kenneth Adam Miller <
> kennethadammiller at gmail.com>:
>
>> It appears that you've not done the requisite reading that's highlighted
>> multiple times in the very beginning of the document. Compilers are
>> extremely sophisticated and hard; the assumed proclivity for self learning
>> here is high, so if you don't demonstrate that you've done your homework it
>> will probably be hard to solicit support.
>>
>> In any case, not that I know of, and I've been on a stride for the past
>> while, reading about and learning LLVM. You can see all the different
>> passes that are publicly available in the documentation as well. In
>> addition, since I've already told you that it's not there, I might as well
>> help you out by saying that LLVM works as a library: when you want your
>> pass to be executed, you register the code with the overall framework.
>> In your case, you'll need two things, from a higher-level perspective.
>> The first is a way to insert some form of runtime/library that maintains
>> the shadow memory. My view is that you can construct your own shadow
>> memory functionality as a shared object that is loaded and initialized
>> via a companion preamble to main, and produce compiled executables that
>> implicitly use this (or edit the compilation behavior of your targets,
>> which is more tedious). The other thing you need is to weave in the
>> calls, or inline the work, necessary to maintain the shadow memory. I
>> recommend writing a pass that works at basic block granularity, because
>> at that point you can arrange the memory operations to facilitate
>> liveness of the shadow memory callback or offset information, thereby
>> further streamlining the final code by combining shadow memory
>> maintenance work.
>>
>> This book is good for getting started: Getting Started with LLVM Core
>> Libraries. It has lots of examples, but to be honest, you don't need to pay
>> for anything until you've read what's publicly available, and llvm even
>> comes with examples.
>>
>> Let me know your thoughts and we can pick up when you've seen the passes
>> and learned about how to extend the correct C++ class.
>>
>> On Thu, Jul 16, 2015 at 8:44 PM, Q Z <zhaoqian301 at gmail.com> wrote:
>>
>>> Hello,
>>> I want to know whether LLVM supports static taint analysis now, and how
>>> to implement static taint analysis as an LLVM pass or in some other way.
>>>
>>> Can anyone help me? Thank you very much!
>>>
>>> zhaoqian