[cfe-dev] [LLVMdev] ubsan - active member check for unions

Jonathan 'Rynn' Sauer jonathan.sauer at gmx.de
Thu Dec 18 10:30:49 PST 2014


Hello,

> On 12/15/2014 10:24 PM, Ismail Pazarbasi wrote:
>>     s.d = 42.0;
>>     if (s.l > 100) // fire here
> 
> Note that code like this is frequently used to convert integers to floats so you'll get tons of false positives.

I think the active member check should be used for C++ only, not C, as in C it's not undefined behavior,
merely implementation-defined:

N1570 §6.5.2.3 footnote 95 (C11):

| If the member used to read the contents of a union object is not the same as the member last used to store a
| value in the object, the appropriate part of the object representation of the value is reinterpreted as an
| object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’).

Whereas in C++ it's undefined behavior (can't seem to find the clause in the C++ standard right now) -- although
there is an exception in C++ when it comes to fields of the same type, which OP mentioned:

| "special guarantee" that allows inspecting common initial sequence mentioned.

N4141 §9.5p1 (C++14):

| One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains
| several standard-layout structs that share a common initial sequence (9.2), and if an object of this
| standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common
| initial sequence of any of standard-layout struct members;

And §9.2p18:

| If a standard-layout union contains two or more standard-layout structs that share a common initial sequence,
| and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted
| to inspect the common initial part of any of them. Two standard-layout structs share a common initial sequence
| if corresponding members have layout-compatible types and either neither member is a bit-field or both are
| bit-fields with the same width for a sequence of one or more initial members.
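
To make the "special guarantee" concrete, here is a minimal sketch. The names IntMsg, FloatMsg, Msg, tag_of and demo_tag are made up for illustration and come from neither the standard nor the patch under discussion. Both structs start with an int member, so that member forms a common initial sequence and may be read through either struct, whichever one is active:

```cpp
// Two standard-layout structs whose first member has the same type,
// so `tag` is their common initial sequence. (Hypothetical names.)
struct IntMsg   { int tag; int   value; };
struct FloatMsg { int tag; float value; };

union Msg {
    IntMsg   i;
    FloatMsg f;
};

// Reads the shared `tag` through `i`, whichever struct is currently active.
int tag_of(const Msg& m) { return m.i.tag; }

int demo_tag() {
    Msg m;
    m.f = FloatMsg{2, 1.5f};  // `f` becomes the active member
    return tag_of(m);         // permitted: `tag` is in the common initial sequence
}
```

Reading m.i.value here would not be covered by the guarantee, since the second members have different types. An active member check would have to whitelist reads like the one in tag_of.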

In C++, converting between float and int should be done using std::memcpy, which a current compiler will turn into
a single move, cf. <http://blog.regehr.org/archives/959>.
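
A minimal sketch of the memcpy approach follows. The helper names float_bits and bits_to_float are made up for illustration, and the code assumes float is 32 bits wide (checked by the static_assert):

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: copies the object representation of a float
// into a 32-bit unsigned integer without reading an inactive union member.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Hypothetical helper: the inverse conversion.
float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Which bit pattern comes back depends on the platform's float format, so portable code should not hard-code one. Converting to bits and back yields the original value.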

>> 2. Where can I store type and field info about the union; some form of
>> a shadow memory or a simple array/map?
> 
> Without shadow it may be unacceptably slow in union-intensive applications. But with shadow, it'll greatly complicate UBSan.

Any shadow memory that UBSan adds also shouldn't clash with ASan's shadow memory.
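
For comparison, the "simple map" option from the question could look roughly like the sketch below. This is only an illustration of the idea, not how UBSan works or would have to work. ActiveMemberTable and its member functions are hypothetical names, and numbering a union's fields with small integers is an assumption:

```cpp
#include <unordered_map>

// Hypothetical side table: address of a union object -> index of the
// member that was last stored to. Not part of UBSan.
class ActiveMemberTable {
public:
    // Record that member `field` of the union at `u` was written.
    void on_store(const void* u, unsigned field) { active_[u] = field; }

    // Returns true if reading `field` matches the last recorded store.
    // Unions with no recorded store are treated as OK to avoid false positives.
    bool check_load(const void* u, unsigned field) const {
        auto it = active_.find(u);
        return it == active_.end() || it->second == field;
    }

    // Forget a union object once its storage goes away.
    void on_destroy(const void* u) { active_.erase(u); }

private:
    std::unordered_map<const void*, unsigned> active_;
};
```

Every instrumented load and store pays for a hash lookup here, which is the slowness concern raised above. A real implementation would also need thread safety and the common initial sequence exception.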


Jonathan




