[llvm-dev] broken C code only when optimized "-O2"

Adrian Moreno via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 22 07:07:28 PST 2021



On 12/22/21 14:01, Jessica Clarke wrote:
> On 22 Dec 2021, at 12:10, Dimitry Andric via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> On 22 Dec 2021, at 08:33, Adrian Moreno <amorenoz at redhat.com> wrote:
>>>
>>> On 12/21/21 19:25, Dimitry Andric wrote:
>>>> On 21 Dec 2021, at 17:30, Adrian Moreno via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>> I need some help understanding what might be wrong with a piece of code from the openvswitch project. By ${subject} I'm not suggesting there's a problem in clang, gcc also shows the same behavior so it's likely our code is broken. I am kindly asking for help to understand/troubleshoot the problem.
>> ...
>>>> % ./example
>>>> start: 0x16ee6f618
>>>> example.c:78:5: runtime error: applying zero offset to null pointer
>>>> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior example.c:78:5 in
>>>> example.c:78:5: runtime error: applying zero offset to null pointer
>>>> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior example.c:78:5 in
>>>> first: 0x600003200270
>>>> last: 0x6000032002a0
>>>> It looks like the HMAP_FOR_EACH() macro uses null pointer arithmetic. The problem appears to be in https://github.com/openvswitch/ovs/blob/master/include/openvswitch/util.h#L94 :
>>>> /* Given OBJECT of type pointer-to-structure, expands to the offset of MEMBER
>>>> * within an instance of the structure.
>>>> *
>>>> * The GCC-specific version avoids the technicality of undefined behavior if
>>>> * OBJECT is null, invalid, or not yet initialized.  This makes some static
>>>> * checkers (like Coverity) happier.  But the non-GCC version does not actually
>>>> * dereference any pointer, so it would be surprising for it to cause any
>>>> * problems in practice.
>>>> */
>>>> #ifdef __GNUC__
>>>> #define OBJECT_OFFSETOF(OBJECT, MEMBER) offsetof(typeof(*(OBJECT)), MEMBER)
>>>> #else
>>>> #define OBJECT_OFFSETOF(OBJECT, MEMBER) \
>>>>     ((char *) &(OBJECT)->MEMBER - (char *) (OBJECT))
>>>> #endif
>>>> The comment is incorrect here, because dereferencing a null pointer, as done in *(OBJECT), is definitely undefined behavior.
>>>
>>> Thanks Dimitry for spotting it.
>>>
>>> So you were running in a non GNUC system? I was testing on Fedora so we were using different implementations of this macro. I will raise it with the openvswitch community that the non GNUC is being flagged by UBSan as an error.
>>
>> Yes, I have run this on macOS 12.1 with Apple clang-1300.0.29.30, and on FreeBSD 14-CURRENT, with clang 13.0.0. In both cases, turning optimization on or off does not make a difference for the UBSan reports. I only added -g to be able to debug.
>>
>> Note also that neither on macOS nor on FreeBSD do I ever get the program to print "list is broken!". It does not matter which optimization level I choose. So here you see that the system's memory layout will likely have an influence on the result.
>>
>> Since you told us that adding LTO flags makes the problem appear, maybe this is related? I know that on FreeBSD, none of the libraries you link in are stored in LLVM bitcode (yet, at least:), and I suspect that this is also the case on macOS.
>>
>> I attached another version of the example, with the external functions from openvswitch inlined, so it should only depend on system headers, and won't need any external libraries. Maybe this makes it easier to reason about the problem. Again, on macOS and FreeBSD this shows UBSan messages, but never "list is broken!".
>>

Thanks very much Dimitry. With your inlined example clang does work but gcc 
still fails. So clearly the linking does influence.	

>>
>>> For the _GNUC_ version, is using *(OBJECT) inside the "typeof" operand also undefined behavior?
>>
>> (Note that clang also defines __GNUC__ so this block will be used for clang too.)
>>
>> If OBJECT ever becomes a null pointer, as in the example, then it is undefined.
>>
>> And I don't think an argument "I'm only doing a typeof(*nullptr), not really accessing it" works... :)
> 
> typeof(*NULL) should be fine, just like sizeof(*NULL), it’s not a “real” dereference.
> 
> At least, -Wnull-dereference isn’t emitted for either.
> 
> Jess
> 

After manually expanding the macros and replacing all the typeof(*nullptr) with 
the actual type, the behavior of both inlined and statically linked versions 
remains the same.

What UBSan is flagging is the use of:
((void*)0)) - __builtin_offsetof(struct member, hmap_node))))

So we clearly need to change that to avoid the potential undefined behavior

Also tried a patch based on the theory I've posted in the other subthread (that 
one possible issue is that I'm using a "struct ovs_list" allocated in the stack 
and treating them as if it was embedded inside a "struct member", which 
generates potential accesses to arbitrary stack addresses). The patch below 
fixes both clang and gcc in all optimization levels.

But I wouldn't like to stop the investigation until I understand all the 
potential problems (these macros and data structures are wildly used in 
openvswitch).

--- example-inlined.c	2021-12-22 16:05:12.584459052 +0100
+++ example-inlined-patched.c	2021-12-22 15:55:21.168212637 +0100
@@ -318,7 +320,9 @@
      //struct ovs_list sstart;
      //struct ovs_list start = &start;

-    struct ovs_list start = {0};
+    // Using a member to hold the ovs_list start
+    struct member start_member = {0};
+    struct ovs_list *start = &start_member.elem;

      bond = malloc(sizeof (struct bond));
      memset(bond, 0, sizeof (struct bond));
@@ -334,14 +338,14 @@
      hmap_insert(&bond->members, &member1->hmap_node, 
(uint32_t)(uintptr_t)member1);
      hmap_insert(&bond->members, &member2->hmap_node, 
(uint32_t)(uintptr_t)member2);

-    printf("start: %p\n", &start);
-    ovs_list_init(&start);
+    printf("start: %p\n", start);
+    ovs_list_init(start);
      HMAP_FOR_EACH (member, hmap_node, &bond->members) {
         /*
          * Insert member in start (sorted)
          * */
         struct member *pos;
-       LIST_FOR_EACH (pos, elem, &start) {
+       LIST_FOR_EACH (pos, elem, start) {
             if (member->order > pos->order) {
                 break;
             }
@@ -351,13 +355,14 @@
         ovs_list_insert(&pos->elem, &member->elem);
      }

+
      // TESTED: If, instead of using an hmap to iterate through bond->members, 
I add them
      // one by one, the problem disappears:
      //insert_ordered(&start, member1);
      //insert_ordered(&start, member2);

-    struct member *first = CONTAINER_OF(ovs_list_front(&start), struct member, 
elem);
-    struct member *last = CONTAINER_OF(ovs_list_back(&start), struct member, elem);
+    struct member *first = CONTAINER_OF(ovs_list_front(start), struct member, 
elem);
+    struct member *last = CONTAINER_OF(ovs_list_back(start), struct member, elem);

      printf("first: %p \nlast: %p\n", first, last);
      /* I've inserted two members into the 'start' list.
@@ -366,7 +371,7 @@
      if ((first != member1 && first != member2) || (last != member1 && last != 
member2)) {
          printf("list is broken!\n");
          printf("Start: %p. start->next: %p, start->next->next: %p, 
start->prev: %p\n",
-               &start, start.next, start.next->next, start.prev);
+               start, start->next, start->next->next, start->prev);
          ret = 1;
          goto out;


-- 
Adrián Moreno



More information about the llvm-dev mailing list