[cfe-dev] Unions in the Clang Static Analyzer

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Aug 20 16:37:36 PDT 2018


Yup, we could use tracking a-la taint analysis in checkers for union 
based undefined values, or in all checkers for undefined values.

I don't think this is blocking union support, even in the form of 
technical debt. We may have union support without emitting 
union-specific warnings at all. Modeling defined-but-weird values 
consistently is the difficult part here.

On 8/20/18 3:18 PM, Gábor Horváth wrote:
> Sorry for the slightly off topic, but I it always bothered me that we 
> use UndefinedVal to represent uninitialized values (and any kind of 
> undefined value regardless of the reason it is being undefined).
> If somebody volunteered to clean this up, we could have the 
> union-specific undefined values as an opt-in thing.
> While this would probably be a lot of work, it could be a 0th step 
> towards proper union support (though arguably this might not be the 
> best return on invest).
>
> On Mon, 20 Aug 2018 at 15:08, Artem Dergachev via cfe-dev 
> <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>
>     Ideally we should implement things according to the standard,
>     producing an UndefinedVal() every time the value is undefined.
>     This will also force us to teach our uninitialized value checkers
>     to understand why the value is uninitialized.
>
>     However this approach would prevent us from making union-specific
>     undefined values an opt-in thing. And i believe that we might
>     eventually want that because not all developers are worried about
>     undefined behavior; there might be codebases in which it is normal
>     to believe that on a specific platform type-punning is always
>     going to happen, even in C++.
>
>     So i guess we'll have to stick to believing in type punning as
>     much as possible and never produce UndefinedVal()s in ExprEngine
>     but let union-specific checkers decide if we want to emit warnings
>     on any of those.
>
>     ---
>
>     When it comes to representing symbolic values loaded from union
>     fields (the problem explained as
>     http://lists.llvm.org/pipermail/cfe-dev/2017-March/052881.html) i
>     don't know how exactly do i want to represent them, but i would
>     imagine producing a new sort of symbol (let's call it
>     "SymbolUnionValue") that has the following as its identity:
>
>       - MemRegion of the union.
>       - Unsigned integer type of the same size as the read.
>
>     Then the read can be represented as the result of a cast from the
>     value to the actual read type. This way we'd discriminate between
>     reads of different size but we'll know that reads of the same size
>     would produce the same value up to casts. This reduces the problem
>     to the problem of representing casts (which will likely be your
>     second problem). Derived symbols would need to be handled
>     similarly (dunno how to call the corresponding derived union
>     symbols). It is safe to assume that every read can be represented
>     as an integer read of a certain type because otherwise we can
>     always return a LazyCompoundValue that doesn't have any identity
>     problems whatsoever (because it doesn't promise anything) or an
>     UnknownVal for types that aren't supported anyway (like floats).
>     If you enjoy the idea of using our experimental Z3 refutation, it
>     shouldn't be hard to teach a clever solver how exactly reads of
>     different sizes are related to each other, but our usual
>     RangeConstraintManager wouldn't be able to reason about this stuff
>     anyway, you shouldn't bother.
>
>     Supporting loads/stores of concrete values should be
>     straightforward but may still need some work.
>
>     On 8/19/18 5:52 PM, Keno Fischer wrote:
>>     Sorry for bumping an old thread, but I ended up trying to pick up
>>     the project that brought this up again after
>>     giving up last year because of the analyzer's lacking union
>>     support. Naturally I ran into the same problems again :/.
>>     It seems like the first step here is to decide what model of
>>     unions the analyzer intends to support. I'd be
>>     happy to try fix some things here, but of course those decisions
>>     need to be made first.
>>
>>     On Fri, Mar 3, 2017 at 2:12 AM, Aaron Ballman
>>     <aaron at aaronballman.com <mailto:aaron at aaronballman.com>> wrote:
>>
>>         On Thu, Mar 2, 2017 at 8:54 PM, Keno Fischer
>>         <keno at juliacomputing.com <mailto:keno at juliacomputing.com>> wrote:
>>         > This behavior also differs between C and C++. For example,
>>         C has explicitly:
>>         >
>>         > [N1570, Footnote 95]
>>         > If the member used to read the contents of a union object
>>         is not the
>>         > same as the member last used to store a value in the
>>         object, the
>>         > appropriate part of the object representation of the value is
>>         > reinterpreted as an object representation in the new type
>>         as described
>>         > in 6.2.6 (a process sometimes called ‘‘type punning’’).
>>         This might be
>>         > a trap representation.
>>         >
>>         > where object representation is essentially what you'd get
>>         if you
>>         > memcpyed the value.
>>
>>         Yes, if everything lines up just perfectly, then in C this may be
>>         okay. I say may because it can still be a trap representation, it
>>         could be a misaligned access, there could be unspecified
>>         values within
>>         the union member, etc (so it depends on the types and the
>>         specific
>>         values), so it is *not* strictly a safe operation to do in C
>>         either.
>>
>>         ~Aaron
>>
>>
>
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180820/596e63f5/attachment.html>


More information about the cfe-dev mailing list