[llvm-dev] How to ask MustAlias queries from DSA results
John Criswell via llvm-dev
llvm-dev at lists.llvm.org
Sun Dec 18 09:38:04 PST 2016
On 12/17/16 9:55 PM, 杨至轩(Zhixuan Yang) wrote:
> Dear Josh,
>
>
> > If I understand correctly, if you find memory leak, you want to
> find the corresponding call(s) to malloc() that allocated the
> memory object, correct? Can you more completely explain what you
> are trying to accomplish?
>
>
> Thanks for your reply. In my task, I use data flow analysis to locate
> a program point where a malloc()ed object must be leaked (by "must be
> leaked" I mean that it (a) must have been allocated, (b) must not have
> been free()d, and (c) is never used in the future). I want to fix the
> leak by finding a pointer that must point to that object, so I want to
> perform a must-alias query.
When you say "must be allocated," you mean it must have been allocated
via a call to a heap allocator (e.g., malloc(), calloc(), etc), correct?
Technically, global variables and stack variables are also allocated;
they just don't occupy heap memory.
Also, are you performing intra-procedural or inter-procedural data-flow
analysis?
>
> >However, DSA is a unification-based analysis, so I would think
> that the accuracy of a must-alias feature would be pretty weak.
> Also, DSA loses precision as it performs more inter-procedural
> analysis (the local analysis will be the most precise but will
> have many Incomplete DSNodes; the Bottom-Up and Top-Down propagate
> information up and down the call graph but will cause further
> DSNode merging).
>
>
> Thanks for your clarification. I agree with you. Even if we
> implemented a MustAlias interface in DSA, it would be too weak.
>
>
> >It may be that you will need a more accurate points-to analysis
> algorithm for your work.
>
>
> In fact, my task can be solved in a simpler (though less elegant) way.
> If I want to find pointers that must alias the result of a malloc()
> call, I can create a new variable that stores the value returned by
> malloc() when it is called.
This is essentially a fat pointer; you're extending the pointer that
you're checking to contain the base address of the memory object to
which it points as well as the memory address to which it points. Since
you're not adding the base address to the pointer but passing it around
with the pointer, you must transform the code so that the base address
"follows" the pointer value wherever it goes (into memory, passed to
functions as arguments, etc).
Fat pointers are relatively easy for local variables but are much more
of a pain for pointers that are stored to/read from memory or passed to
functions as arguments. I'm also of the opinion that every fat pointer
approach suffers from some degree of compatibility problems with
third-party library code (the infamous "external code" problem).
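To make the fat pointer idea concrete, here is a minimal sketch in C. The names (struct fat_ptr, fat_malloc, fat_add, fat_release) are hypothetical and only for illustration; a real transformation would be done on LLVM IR and would also have to handle loads, stores, and external calls.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fat pointer: the address the program uses, plus the base
 * address of the memory object it points into. */
struct fat_ptr {
    char *cur;   /* current pointer value */
    char *base;  /* value originally returned by malloc() */
};

/* Allocation: cur and base start out equal. */
static struct fat_ptr fat_malloc(size_t n) {
    char *p = malloc(n);
    struct fat_ptr fp = { p, p };
    return fp;
}

/* Pointer arithmetic changes cur only; base "follows" unchanged. */
static struct fat_ptr fat_add(struct fat_ptr fp, ptrdiff_t off) {
    fp.cur += off;
    return fp;
}

/* A callee must receive the whole pair so that it can free the object
 * even when cur points into the middle of it. */
static void fat_release(struct fat_ptr fp) {
    free(fp.base);
}
```

The local case is easy because the pair lives in one variable. The pain starts when fp.cur is stored to memory or handed to a function that only accepts a plain char *: the base has to be stored or passed alongside it, and external code that was not transformed knows nothing about the pair.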
If you're going to transform the program, I would recommend that you use
SAFECode's new BBAC feature to track the base address. BBAC has a
run-time library which can take a pointer to a memory object and
calculate, in constant time, the first address of the memory object into
which the pointer is pointing. You could use this to find the base
address of the memory object so that you can pass it to the free()
function. As BBAC is a referent object approach, it doesn't suffer from
the compatibility problems that fat pointer approaches suffer from.
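As a rough illustration of how a constant-time base lookup can work, here is a toy sketch in C. It assumes a baggy-bounds style scheme in which every object is padded to a power-of-two size and aligned to that size, with a side table recording the size of each slot. This is not SAFECode's run-time API; toy_alloc, toy_base, and the constants are made up for the example.

```c
#include <stddef.h>
#include <stdlib.h>

enum { SLOT_BITS = 4, ARENA_BITS = 12 };          /* 16-byte slots, 4 KiB arena */
enum { NSLOTS = 1 << (ARENA_BITS - SLOT_BITS) };

static unsigned char *arena;              /* aligned to its own size */
static unsigned char size_log2[NSLOTS];   /* log2(object size) for each slot */
static size_t next_off;                   /* simple bump allocator offset */

/* Allocate n bytes, padded to a power of two and aligned to that size. */
static void *toy_alloc(size_t n) {
    unsigned lg = SLOT_BITS;
    while (lg <= ARENA_BITS && ((size_t)1 << lg) < n) lg++;
    if (lg > ARENA_BITS) return NULL;
    size_t sz = (size_t)1 << lg;
    size_t arena_sz = (size_t)1 << ARENA_BITS;
    if (!arena) arena = aligned_alloc(arena_sz, arena_sz);
    if (!arena) return NULL;
    size_t off = (next_off + sz - 1) & ~(sz - 1);  /* align to object size */
    if (off + sz > arena_sz) return NULL;
    for (size_t s = off >> SLOT_BITS; s < (off + sz) >> SLOT_BITS; s++)
        size_log2[s] = (unsigned char)lg;          /* record size in the table */
    next_off = off + sz;
    return arena + off;
}

/* Constant time: one table load and one mask, no search.
 * p must point into an object returned by toy_alloc(). */
static void *toy_base(const void *p) {
    size_t off = (size_t)((const unsigned char *)p - arena);
    unsigned lg = size_log2[off >> SLOT_BITS];
    return arena + (off & ~(((size_t)1 << lg) - 1));
}
```

Because the lookup needs only the pointer value, nothing extra has to travel with the pointer, which is why external code is not affected. In this toy the objects come from a bump arena, so the result of toy_base() cannot be passed to free(); a real allocator built on the same layout would hand the base address to its own deallocation routine.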
My Google Summer of Code student, Zhengyang Liu, worked on BBAC this
summer and created an updated and robust implementation of it that you
could modify for your project. If you're interested, please email me so
that I can put you in touch with him.
Regards,
John Criswell
> Thanks for your help.
>
> Best regards, Zhixuan Yang
--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell