[cfe-dev] Unions in the Clang Static Analyzer

Aaron Ballman via cfe-dev cfe-dev at lists.llvm.org
Thu Mar 2 21:49:07 PST 2017


On Thu, Mar 2, 2017 at 1:11 PM, Alexey Sidorin <alexey.v.sidorin at ya.ru> wrote:
> Hi all,
>
> There is one thing that can make the implementation a bit easier :)
> The reading of a union field that was not stored last may cause undefined
> behavior. A discussion on this issue can be found on
> https://www.securecoding.cert.org/confluence/display/c/EXP39-C.+Do+not+access+a+variable+through+a+pointer+of+an+incompatible+type
>
> I guess we need some language lawyers here. Aaron, can I disturb you?

Certainly -- is there advice in particular you're looking for? I'm not
overly familiar with the static analyzer design, but I'm to help as I
can.

A note below...

>
>
> 02.03.2017 20:12, Artem Dergachev via cfe-dev пишет:
>
> Hello,
>
> I'm not aware of any unfinished work on union support. I'd gladly give a few
> hints.
>
> RegionStore itself already operates in a union-friendly manner: even though
> it presents itself as a MemRegion->SVal map, internally it only stores base
> regions and byte offsets as keys, regardless of how the bytes stored are
> interpreted.
>
> However, i suspect bigger problems with actually representing symbolic
> values for fields that are interpreted in multiple ways during the analysis.
> First of all, reinterpret-casts of basic types are in general modeled
> poorly. For example, cast from pointer to integer is represented as a
> special LocAsInteger SVal sub-class, support for which in SValBuilder is
> worse than for symbols; at the same time cast from integer to pointer is not
> represented at all, as i mentioned recently in
> http://lists.llvm.org/pipermail/cfe-dev/2017-February/052769.html .
>
> Then, unions are even more fun because they challenge the concept of
> SymbolRegionValue, which is a very basic thing. Consider an example:
>
>   union U { intptr_t x; void *p; }
>
>   void foo(union U u) {
>     intptr_t X = u.x; // reg_$0<u.x>
>     void *P = u.p; // &SymRegion{reg_$1<u.p>}
>     if (X == P) { ... }
>   }
>
> The symbolic values for X and P look completely different, the worst thing
> here probably being that X is NonLoc while P is Loc. Yet, they represent the
> same thing.

They do not represent the same thing -- they designate the same region
of storage, but the underlying objects are not "the same" because they
have independent lifetimes. A union member within its lifetime is
"active", and the static analyzer would need to track which member of
the union is active to be able to reason about the program
appropriately.

~Aaron

> How would we unify these values? If you come up with a single
> symbol to represent them, what would be the type of that symbol? Or would
> you prefer to constrain these symbols to be equal, and rely on the
> constraint solver to figure things out?
>
> So making a proper union support is going to be a fun research project.
> Smaller incremental improvements that fix particular cases you encounter are
> also very much welcome. There's not necessarily much thought behind the
> safety-first bail-outs, so if you are seeing significant improvements by
> removing them, we could consider lifting them; it may also turn out that
> they're necessary, and an experienced eye may sometimes understand why, but
> as long as there is no comprehensive explanation in comments and no testcase
> coverage, i'd rather encourage you to remove them and see what happens;
> especially considering that you've got access to a good codebase to test
> these changes upon.
>
>
> 02/03/2017 1:53 AM, Keno Fischer via cfe-dev wrote:
>
> I was having some trouble with using the static analyzer on code that
> involves unions. Looking through the code, it seems like this is a known
> problem and unions aren't particularly well supported. Personally, I was
> able to make some progress by just ripping out all the checks for union
> types, particularly in RegionStore. However, that's obviously a hacky
> solution, so to ensure that my check will keep working, I'd like to improve
> upstream support for unions if possible. Has anybody thought about this
> problem before/is there already a design for this? Alternatively, have
> people collected test cases that would benefit from improving union support?
> I'm sure I'm not the first person to hit this issue, so I'd appreciate any
> pointers to prior work.
>
>
> _______________________________________________ cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>



More information about the cfe-dev mailing list