[cfe-dev] [analyzer][taint] More precise taint modelling on arrays

Balázs Benics via cfe-dev cfe-dev at lists.llvm.org
Wed Feb 12 05:08:46 PST 2020


Thank you for the clarification, Artem.

I'm going to have a look into that.

Artem Dergachev <noqnoqneo at gmail.com> ezt írta (időpont: 2020. febr. 11.,
K, 17:27):

> It's not much of a limitation, just a bug. When `scanf` produces a
> default conjured symbol binding in the buffer, this is *the* symbol that
> should carry the taint; all derived symbols will automatically be
> treated as tainted. If we're being even more precise and consider the
> length of the scanned string, then the derived symbol will need to carry
> partial taint (i.e., specify TaintedSubRegions such that derived<conj,
> R> is tainted iff R is a TaintedSubRegion of buf for conj). I.e., what
> i'm saying is that RegionStore is more-or-less flexible enough to
> implement this correctly, it's only a matter of getting this done. Those
> string functions are just too many and each of them requires individual
> attention.
>
> On 2/10/20 2:43 PM, Balázs Benics via cfe-dev wrote:
> > I made the following test case for checking the modeling of taint
> > propagation on the `strcpy` function.
> > As I observed, only the first byte of the array became tainted, even
> > though all bytes should be treated tainted.
> > In the test, you can see my expectations and the actual result.
> >
> > ```
> > void strcpy_unbounded_tainted_buffer(char *buf) {
> >   scanf("%s", buf);
> >
> >   char dst[32];
> >   strcpy(dst, buf);                       //  expected---vvv   vvv---
> > actual
> >   clang_analyzer_isTainted_char(dst[0]);  // expected-warning{{YES}} YES
> >   clang_analyzer_isTainted_char(dst[1]);  // expected-warning{{YES}} NO
> >   clang_analyzer_isTainted_char(dst[31]); // expected-warning{{YES}} NO
> > }
> >
> > void strcpy_bounded_tainted_buffer(char *buf) {
> >   scanf("%s", buf);
> >   buf[10] = '\0';
> >   clang_analyzer_isTainted_char(buf[0]);  // expected-warning{{YES}} YES
> >   clang_analyzer_isTainted_char(buf[1]);  // expected-warning{{YES}} NO
> >   clang_analyzer_isTainted_char(buf[10]); // expected-warning{{NO}}  NO
> >   clang_analyzer_isTainted_char(buf[20]); // expected-warning{{YES}} NO
> >
> >   char dst[32];
> >   strcpy(dst, buf);
> >   clang_analyzer_isTainted_char(dst[0]);  // expected-warning{{YES}} YES
> >   clang_analyzer_isTainted_char(dst[1]);  // expected-warning{{YES}} NO
> >   clang_analyzer_isTainted_char(dst[10]); // expected-warning{{NO}}  NO
> >   clang_analyzer_isTainted_char(dst[20]); // expected-warning{{NO}}  NO
> > }
> > ```
> >
> > Some clarification about `TaintedSubRegions` and tainting
> > `nonloc::LazyCompoundVal`s would be also helpful since it might be
> > related to this topic.
> >
> > What are the reasons for this limitation on modeling taintedness
> > regarding arrays?
> >
> >
> > Background and expectation:
> > This change would be the first step in migrating the diagnostic
> > emitting parts of the `GenericTaintChecker`.
> > Eg.: `checkUncontrolledFormatString`, `checkSystemCall`,
> > `checkTaintedBufferSize`.
> > As a result, multiple checkers will consume taintedness information
> > for reporting warnings in the future and letting the
> > `GenericTaintChecker` do only modeling and propagation.
> >
> > Regards, Balazs.
> >
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200212/71d2d3e9/attachment-0001.html>


More information about the cfe-dev mailing list