[cfe-dev] [analyzer] Expressing dependencies between arguments and return values
Peter Wu via cfe-dev
cfe-dev at lists.llvm.org
Wed May 9 01:05:10 PDT 2018
On Wed, May 09, 2018 at 03:37:12AM +0300, Artem Dergachev wrote:
>
> On 5/7/18 11:59 AM, Aleksei Sidorin via cfe-dev wrote:
> > Hello Peter,
> >
> > 07.05.2018 01:07, Peter Wu via cfe-dev пишет:
> > > Hi!
> > >
> > > I just got started with writing a Clang Static Analyzer checker,
> > > primarily to detect these classes of issues in C:
> > >
> > > - allocator/deallocator mismatch: e.g. "g_malloc" must be accompanied by
> > > "g_free" and not "free".
> > > - memory leak due to allocator mismatch: e.g. "g_list_new" must be
> > > accompanied by "g_list_free" and not "g_free".
> > > - memory leaks and use-after-free issues in general.
> > >
>
> The easiest way to accomplish these would be to extend the existing
> functionality of MallocChecker. It already warns about mismatching
> malloc()/operator delete etc. in c++ by tracking allocation families, so it
> most likely has all the functionality you need. It even already has support
> for g_malloc and g_free, but not for many other glib functions.
I did use MallocChecker as source of inspiration, but it has only one
level of differentiation. It can detect new vs delete[], but I need to
handle mismatches in project-specific allocators as well:
void *wmem_alloc(wmem_allocator_t *allocator, const size_t size);
void wmem_free(wmem_allocator_t *allocator, void *ptr);
where "allocator" is NULL, "wmem_epan_scope()", "wmem_file_scope()",
"wmem_packet_scope()" or something else. So this would be wrong:
void *p = wmem_alloc(wmem_file_scope(), 1);
wmem_free(NULL, p);
If other projects have similar allocation functions, I could look into
generalizing this sufficiently into function attributes, but at the
moment I am still exploring the best way to model this.
> > > I have an initial version working based on Anna's and Jordan's talk
>
> You might want to look into my workbook as well:
> https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf
I found that reference in an earlier message, thanks for writing this :)
Here is my current list of references:
https://github.com/Lekensteyn/clang-alloc-free-checker#developer-resources
> > > Consider this example:
> > >
> > > void checkIdentityFunctionLeak() {
> > > char *p = g_strdup("");
> > > char *q = g_strdelimit(p, "_", '-');
> > > } // expected-warning {{Memory leak}}
> > >
> > > To avoid false positives with memleak reporting, I added a pointerEscape
> > > callback that removes symbols. This however has as side-effect that "p"
> > > in the above example is not reported as leaked.
>
> MallocChecker has a relatively sophisticated mechanism to figure out if a
> certain function deserves a pointer escape or not. You might want to tweak
> it. The idea to provide a correct return value is a good call but it isn't
> necessary.
Looks it uses a lot of heuristics, and one of the assumptions is that
all functions in non-system-headers free memory. Modelling a function
more precisely (i.e. argument == return value) would make detecting
double-free/use-after-free possible.
> > The best way to tell the analyzer that some expression returns _exactly
> > same_ value as the argument is to bind the value of the argument as the
> > return result of this expression.
> >
> > ProgramStateRef State = C.getState();
> > SVal Arg0Val = Call.getArgSVal(0);
> > State = State->BindExpr(Call.getOriginExpr(),
> > C.getLocationContext(), Arg0Val);
> > C.addTransition(State);
Thanks, this did pass my test cases.
> > Our primary solver (RangeConstraintManager) has very limited
> > possibilities of handling complex symbolic expressions so it is better
> > to simplify them as much as possible.
>
> I'd like to add to this that you shouldn't do BindExpr in checkPostCall or
> checkPostStmt, only in evalCall. Because otherwise it'd conflict with other
> checkers that may try to model that function.
>
> For modeling string operations you should have a look at CStringChecker and
> extend it.
I'll look into it. If binding in PostCall gives bad results, would it be
an idea to add debug assertions that catches this?
> > > While the debug.ViewExplodedGraph option shows that the symbols are
> > > assumed to be the same in the CallExpr node:
> > >
> > > conj_$1{gpointer} : { [1, 18446744073709551615] }
> > > (conj_$4{gchar *}) - (conj_$1{gpointer}) : { [0, 0] }
> > >
> > > this range information is somehow lost in the next nodes.
> > >
> > > Another example which is currently not caught either (strstr returns an
> > > address from the region of "p"):
> > >
> > > int main() {
> > > char *p = strdup("xyz");
> > > if (!p) return 1;
> > > char *p2 = strstr(p, "z");
> > > free(p);
> > > puts(p2); // expected-warning {{Use-after-free}}
> > > }
> > >
> > > Have I misunderstood something in modelling this?Is it possible to
> > > express dependencies between the return value and the arguments of a
> > > library function?
> >
> > strstr() lacks modeling in analyzer so it just returns a conjured value.
> > A correct return value should be an ElementRegion whose base region is
> > the region pointed by 'p'. You can try construct this return value using
> > MemRegionManager methods.
> >
> > Hope this helps.
> >
> The high-level API for this stuff is SValBuilder::evalBinOp(), just add an
> integer to a pointer. See how CStringChecker handles functions that return
> the end of the string (look out for the returnEnd flag).
Thanks, I was already puzzled on how to use MemRegionManager, knowing
that evalBinOp/evalEQ can be used directly is a relief. The BindExpr
hint is spot-on, these are also used in cases such as
CStringChecker::evalCopyCommon and CStringChecker::evalStrsep.
Perhaps it is time for a new CSA presentation that incorporates this?
(hint, hint :))
--
Kind regards,
Peter Wu
https://lekensteyn.nl
More information about the cfe-dev
mailing list