[cfe-dev] Unions in the Clang Static Analyzer
Alexey Sidorin via cfe-dev
cfe-dev at lists.llvm.org
Thu Mar 2 22:16:26 PST 2017
Hi Aaron!
Thank you for response. I'm curious if such cases like described:
union U { intptr_t x; void *p; }
void foo(union U u) {
intptr_t X = u.x; // reg_$0<u.x>
void *P = u.p; // &SymRegion{reg_$1<u.p>}
if (X == P) { ... }
}
have defined behavior or not. I have seen many discussions on this but I
haven't seen any strict rule on this.
03.03.2017 08:49, Aaron Ballman пишет:
> On Thu, Mar 2, 2017 at 1:11 PM, Alexey Sidorin <alexey.v.sidorin at ya.ru> wrote:
>> Hi all,
>>
>> There is one thing that can make the implementation a bit easier :)
>> The reading of a union field that was not stored last may cause undefined
>> behavior. A discussion on this issue can be found on
>> https://www.securecoding.cert.org/confluence/display/c/EXP39-C.+Do+not+access+a+variable+through+a+pointer+of+an+incompatible+type
>>
>> I guess we need some language lawyers here. Aaron, can I disturb you?
> Certainly -- is there advice in particular you're looking for? I'm not
> overly familiar with the static analyzer design, but I'm to help as I
> can.
>
> A note below...
>
>>
>> 02.03.2017 20:12, Artem Dergachev via cfe-dev пишет:
>>
>> Hello,
>>
>> I'm not aware of any unfinished work on union support. I'd gladly give a few
>> hints.
>>
>> RegionStore itself already operates in a union-friendly manner: even though
>> it presents itself as a MemRegion->SVal map, internally it only stores base
>> regions and byte offsets as keys, regardless of how the bytes stored are
>> interpreted.
>>
>> However, i suspect bigger problems with actually representing symbolic
>> values for fields that are interpreted in multiple ways during the analysis.
>> First of all, reinterpret-casts of basic types are in general modeled
>> poorly. For example, cast from pointer to integer is represented as a
>> special LocAsInteger SVal sub-class, support for which in SValBuilder is
>> worse than for symbols; at the same time cast from integer to pointer is not
>> represented at all, as i mentioned recently in
>> http://lists.llvm.org/pipermail/cfe-dev/2017-February/052769.html .
>>
>> Then, unions are even more fun because they challenge the concept of
>> SymbolRegionValue, which is a very basic thing. Consider an example:
>>
>> union U { intptr_t x; void *p; }
>>
>> void foo(union U u) {
>> intptr_t X = u.x; // reg_$0<u.x>
>> void *P = u.p; // &SymRegion{reg_$1<u.p>}
>> if (X == P) { ... }
>> }
>>
>> The symbolic values for X and P look completely different, the worst thing
>> here probably being that X is NonLoc while P is Loc. Yet, they represent the
>> same thing.
> They do not represent the same thing -- they designate the same region
> of storage, but the underlying objects are not "the same" because they
> have independent lifetimes. A union member within its lifetime is
> "active", and the static analyzer would need to track which member of
> the union is active to be able to reason about the program
> appropriately.
>
> ~Aaron
>
>> How would we unify these values? If you come up with a single
>> symbol to represent them, what would be the type of that symbol? Or would
>> you prefer to constrain these symbols to be equal, and rely on the
>> constraint solver to figure things out?
>>
>> So making a proper union support is going to be a fun research project.
>> Smaller incremental improvements that fix particular cases you encounter are
>> also very much welcome. There's not necessarily much thought behind the
>> safety-first bail-outs, so if you are seeing significant improvements by
>> removing them, we could consider lifting them; it may also turn out that
>> they're necessary, and an experienced eye may sometimes understand why, but
>> as long as there is no comprehensive explanation in comments and no testcase
>> coverage, i'd rather encourage you to remove them and see what happens;
>> especially considering that you've got access to a good codebase to test
>> these changes upon.
>>
>>
>> 02/03/2017 1:53 AM, Keno Fischer via cfe-dev wrote:
>>
>> I was having some trouble with using the static analyzer on code that
>> involves unions. Looking through the code, it seems like this is a known
>> problem and unions aren't particularly well supported. Personally, I was
>> able to make some progress by just ripping out all the checks for union
>> types, particularly in RegionStore. However, that's obviously a hacky
>> solution, so to ensure that my check will keep working, I'd like to improve
>> upstream support for unions if possible. Has anybody thought about this
>> problem before/is there already a design for this? Alternatively, have
>> people collected test cases that would benefit from improving union support?
>> I'm sure I'm not the first person to hit this issue, so I'd appreciate any
>> pointers to prior work.
>>
>>
>> _______________________________________________ cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>>
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>>
More information about the cfe-dev
mailing list