[llvm-dev] broken C code only when optimized "-O2"

Adrian Moreno via llvm-dev llvm-dev at lists.llvm.org
Tue Dec 21 23:33:52 PST 2021



On 12/21/21 19:25, Dimitry Andric wrote:
> On 21 Dec 2021, at 17:30, Adrian Moreno via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> I need some help understanding what might be wrong with a piece of code from the openvswitch project. By ${subject} I'm not suggesting there's a problem in clang, gcc also shows the same behavior so it's likely our code is broken. I am kindly asking for help to understand/troubleshoot the problem.
>>
>> Summary: It seems that certain interaction between two main openvswitch data structures, when optimized ("-O2 -flto=auto") is broken.
>> The two data structures are:
>>
>> hmap: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/hmap.h
>> list: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/list.h
>>
>> I've reproduced the problem outside of openvswitch daemon using a short C program (attached)
>>
>> Code snippet:
>>
>> struct bond {
>>     struct hmap members;
>> };
>>
>> struct member {
>>     struct hmap_node hmap_node;
>>     int order;
>>     struct ovs_list elem;
>> };
>>
>> int main() {
>>     int ret = 0;
>>     struct member *member, *member1, *member2;
>>     struct bond *bond;
>>     struct ovs_list start = {0};
>>
>>     bond = malloc(sizeof *bond);
>>     memset(bond, 0, sizeof (struct bond));
>>     hmap_init(&bond->members);
>>
>>     member1 = malloc(sizeof *member1);
>>     member2 = malloc(sizeof *member2);
>>     memset(member1, 0, sizeof (struct member));
>>     memset(member2, 0, sizeof (struct member));
>>
>>     member1->order = 3;
>>     member2->order = 2;
>>
>>     hmap_insert(&bond->members, &member1->hmap_node, (uint32_t)(uintptr_t)member1);
>>     hmap_insert(&bond->members, &member2->hmap_node, (uint32_t)(uintptr_t)member2);
>>
>>     ovs_list_init(&start);
>>     HMAP_FOR_EACH (member, hmap_node, &bond->members) {
>>        /*
>>         * Insert member in start (sorted)
>>         * */
>>        struct member *pos;
>>        LIST_FOR_EACH (pos, elem, &start) {
>>            if (member->order > pos->order) {
>>                break;
>>            }
>>        }
>>        // TESTED: If I add this printf, the problem disappears
>>        //printf("Inserting member: %p\n", member);
>>        ovs_list_insert(&pos->elem, &member->elem);
>>     }
>>
>>     /* I've inserted two members into the 'start' list.
>>      *   first and last have to be either member1 or member2
>>      * */
>>     if ((first != member1 && first != member2) || (last != member1 && last != member2)) {
>>         printf("list is broken!\n");
>>     }
>>
>> }
>>
>>
>> What I know for now:
>> * -fno-strict-aliasing does not fix it
>> * Only happens with "-O2 -flto=auto"
>> * If I define 'ovs_list *start' and change the code to use the pointer directly and not '&start' the problem disappears. It seems that the LIST_FOR_EACH macros prefer an lvalue rather than "&" but I don't get why.
>> * I'm not able to reproduce without using hmap _and_ ovs_list.
>> * If I add a compiler barrier (or a call to an external function) after the loop, the problem disappears (e.g printf), the problem disappears.
>> * If I add -fsanitize=undefined the problem disappears!
> 
> Not for me:
> 
> % clang -g -fsanitize=undefined -I/Users/dim/tmp/vswitch/foo/include example.c -o example -L/Users/dim/tmp/vswitch/foo/lib -lopenvswitch
> 

Thanks for running it!

> % ./example
> start: 0x16ee6f618
> example.c:78:5: runtime error: applying zero offset to null pointer
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior example.c:78:5 in
> example.c:78:5: runtime error: applying zero offset to null pointer
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior example.c:78:5 in
> first: 0x600003200270
> last: 0x6000032002a0
> 
> It looks like the HMAP_FOR_EACH() macro uses null pointer arithmetic. The problem appears to be in https://github.com/openvswitch/ovs/blob/master/include/openvswitch/util.h#L94 :
> 
> /* Given OBJECT of type pointer-to-structure, expands to the offset of MEMBER
>   * within an instance of the structure.
>   *
>   * The GCC-specific version avoids the technicality of undefined behavior if
>   * OBJECT is null, invalid, or not yet initialized.  This makes some static
>   * checkers (like Coverity) happier.  But the non-GCC version does not actually
>   * dereference any pointer, so it would be surprising for it to cause any
>   * problems in practice.
>   */
> #ifdef __GNUC__
> #define OBJECT_OFFSETOF(OBJECT, MEMBER) offsetof(typeof(*(OBJECT)), MEMBER)
> #else
> #define OBJECT_OFFSETOF(OBJECT, MEMBER) \
>      ((char *) &(OBJECT)->MEMBER - (char *) (OBJECT))
> #endif
> 
> The comment is incorrect here, because dereferencing a null pointer, as done in *(OBJECT), is definitely undefined behavior.
> 

Thanks Dimitry for spotting it.

So you were running in a non GNUC system? I was testing on Fedora so we were 
using different implementations of this macro. I will raise it with the 
openvswitch community that the non GNUC is being flagged by UBSan as an error.

For the _GNUC_ version, is using *(OBJECT) inside the "typeof" operand also 
undefined behavior?


> -Dimitry
> 

-- 
Adrián Moreno



More information about the llvm-dev mailing list