[cfe-dev] C sequence-point analysis

David Blaikie dblaikie at gmail.com
Fri Nov 15 10:01:47 PST 2013


On Fri, Nov 15, 2013 at 9:52 AM, Lukas Hellebrandt <kamikazecz at gmail.com> wrote:

> Hi, David, thanks for your reply!
>
> If I understood UBSan correctly, it just adds run-time checks, so
> there is no error informing about possible undefined behavior before
> running the program (and even then, it might never occur).
>
> I don't really think there is a reasonable way to check for this type of
> error at run time, and even if there were, my goal is to issue a
> warning during compilation.
>
> What I can do in Clang is: check for this kind of error if I
> completely forget about pointers - I don't do any alias analysis and
> just assume no two different variables alias. So if I could just check
> in Clang whether they do alias or not, my problem would be solved.
>
> As I don't know of any way to achieve this, I'm thinking about
> using LLVM for this (or a combination of both, as I wrote: Clang to make
> a set of must-not-alias rules and LLVM to check these rules), which
> would still warn during compilation.
>
> As far as I understood, sanitizers are of no use here, am I correct?
>

If you specifically want static checking, then no, the sanitizers aren't
the right tool.
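
To make the run-time nature concrete, here is a rough sketch of the kind of
check a defensive programmer might write by hand for signed overflow. It is
only an illustration: `checked_add` is a made-up helper, not what UBSan
actually emits. The point is just that nothing is reported unless the
overflowing call is really executed.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not part of UBSan or any library): adds a and b,
 * but refuses to perform the addition if it would overflow int. */
static bool checked_add(int a, int b, int *out) {
    /* Test against the limits before adding, so the check itself
     * never overflows. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* would overflow: report instead of computing */
    *out = a + b;
    return true;
}
```

A call such as `checked_add(INT_MAX, 1, &r)` returns false, but only when
that call is reached at run time; a code path that is never exercised is
never diagnosed.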

For static checking you have either Clang warnings and the CFG, including the
existing -Wsequence-point warning, which catches some cases of unsequenced
operations, or the full power of the Clang Static Analyzer, which might be
where you could get some alias information to add more accuracy (that is, to
reduce the false negatives that the Clang warning necessarily has in order
to be cheap enough to run at compile time).
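
As a small illustration of where aliasing comes in (my own sketch;
`store_then_bump` is just a made-up name): the commented-out line names a
single variable twice, which is the sort of thing a syntactic warning can
catch. The function below is only undefined in C when the two pointers
happen to refer to the same object, and that cannot be seen from the
expression alone.

```c
/* Unsequenced modification and assignment of one named variable.
 * Kept in a comment so this file itself stays well defined:
 *
 *     i = i++;
 */

/* Hypothetical example function. Well defined when p and q point to
 * different objects; undefined in C when p == q, because the store to
 * *p and the increment of *q are then unsequenced side effects on the
 * same object. */
static int store_then_bump(int *p, int *q) {
    *p = (*q)++;   /* store the old value of *q into *p; *q is incremented */
    return *p;
}
```

Calling it with two distinct objects, say `store_then_bump(&a, &b)` with
`a = 1` and `b = 5`, is fine and leaves `a == 5` and `b == 6`;
`store_then_bump(&a, &a)` is the problematic call.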

- David


>
>
> *****************************
> Lukas Hellebrandt
> kamikazecz at gmail.com
> *****************************
>
> On 11/15/2013 06:29 PM, David Blaikie wrote:
> > You might want to look into the implementation of Clang's UBSan
> > feature (-fsanitize=undefined). Like the other sanitizers (address,
> > memory, and thread) UBSan works by adding extra checks into the
> > LLVM IR from the Clang frontend. LLVM compiles those checks as it
> > would any other IR and they are used to verify that the behavior is
> > correct.
> >
> > The simplest example of this might be bounds checking of a static
> > array (and this is one of the things UBSan can check for) or
> > overflow of a signed integer. The frontend simply adds the checks a
> > programmer might write if they were coding defensively against such
> > a circumstance. Then LLVM just compiles the code as normal and when
> > you execute the program, if you trigger the check to fail, an error
> > message is printed.
> >
> > For the particular check you're interested in implementing... I'm
> > not sure exactly how you'll go about implementing that check or how
> > you'd avoid false positives, but UBSan is probably the first place
> > to look and the ideal place for this to live, if possible.
> >
> >
> > On Fri, Nov 15, 2013 at 9:21 AM, Lukas Hellebrandt
> > <kamikazecz at gmail.com> wrote:
> >
> > Hi all,
> >
> > I'm trying to write a tool for detecting undefined behavior in C
> > regarding sequence points and side effects.
> >
> > I'm not sure whether it should be a Clang plugin or LLVM run or
> > something completely different (I'm really new to both Clang and
> > LLVM) and that's what I need advice with.
> >
> > For my work, I need to use both the AST and alias analysis.
> >
> > Clang plugin:
> > + relatively easy to use
> > + access to the AST with all the needed info EXCEPT alias analysis
> >   (right?)
> > - no alias analysis, I'd need to write one myself
> >
> > LLVM run:
> > + built-in alias analysis (I'd like to use it; writing my own alias
> >   analysis is not really what my work is about)
> > - I do NOT have access to the AST
> > - I don't know it at all (but I'm ready to learn it if it turns out
> >   to be the best option)
> >
> > The big PROBLEM is: a behavior that is undefined in C (and which
> > Clang has access to) might be (and in my case WILL be) well defined
> > in LLVM (for example, i=i++; is undefined in C, but in the LLVM code
> > it will already be well defined and the result will depend on how
> > Clang happened to translate it).
> >
> > So I thought I could use both: in Clang, create a list of rules,
> > for example "On line L, there is undefined behavior if X aliases
> > with Y", and then SOMEHOW dig this info out of the LLVM run.
> >
> > Is this a good idea? Is there a way (other than output to file
> > from Clang and then read it in LLVM) to give this set of rules to
> > LLVM? I'd also be glad for any other idea, not necessarily
> > including LLVM and Clang, to solve this problem.
> >
> > Thanks in advance!
> >
> > --
> > *****************************
> > Lukas Hellebrandt
> > kamikazecz at gmail.com
> > *****************************
> >
> >
>