<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - __int128 doesn't work properly on ARM64 with __atomic_compare_exchange_n() and __sync_bool_compare_and_swap()"
   href="https://bugs.llvm.org/show_bug.cgi?id=38621">38621</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>__int128 doesn't work properly on ARM64 with __atomic_compare_exchange_n() and __sync_bool_compare_and_swap()
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>6.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: AArch64
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>jsquyres@cisco.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Greetings llvm folks; long time listener, first time caller.

I'm posting on behalf of the Open MPI project (www.open-mpi.org).  We just ran
across what might be a bug in LLVM (clang 6) on ARM64: __int128 operands for
__atomic_compare_exchange_n() and __sync_bool_compare_and_swap() don't behave
like they do on clang 6/x86_64, gcc 7.3/x86_64, or gcc 7.3/ARM64.

Specifically, these functions are described in
<a href="https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html">https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html</a> and
<a href="https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html">https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html</a>.

Here are tests that pass with GCC 7.3 and clang 6 on x86_64, but fail (in
different ways) with GCC 7.3 and clang 6 on ARM64:

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

typedef union {
    uint64_t fake[2];
    __int128 real;
} ompi128;

static void test1(void)
{
    ompi128 ptr      = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
    ompi128 expected = { .fake = { 0x11EEDDCCBBAA0099, 0x88776655443322FF }};
    ompi128 desired  = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};

    bool r = __atomic_compare_exchange_n (&ptr.real,
                                          &expected.real,
                                          desired.real,
                                          true,
                                          __ATOMIC_RELAXED,
                                          __ATOMIC_RELAXED);

    if (r == false && ptr.real == expected.real) {
        printf("Test 1 passed!\n");
    } else {
        printf("Test 1 failed\n");
    }
}

static void test2(void)
{
    ompi128 ptr      = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
    ompi128 expected = ptr;
    ompi128 desired  = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};

    bool r = __atomic_compare_exchange_n (&ptr.real,
                                          &expected.real,
                                          desired.real,
                                          true,
                                          __ATOMIC_RELAXED,
                                          __ATOMIC_RELAXED);

    if (r == true && ptr.real == desired.real) {
        printf("Test 2 passed!\n");
    } else {
        printf("Test 2 failed\n");
    }
}

static void test3(void)
{
    ompi128 ptr    = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
    ompi128 oldval = { .fake = { 0x11EEDDCCBBAA0099, 0x88776655443322FF }};
    ompi128 newval = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};

    bool r = __sync_bool_compare_and_swap(&ptr.real,
                                          oldval.real,
                                          newval.real);

    if (r == false && ptr.real != newval.real) {
        printf("Test 3 passed!\n");
    } else {
        printf("Test 3 failed\n");
    }
}

static void test4(void)
{
    ompi128 ptr    = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
    ompi128 oldval = ptr;
    ompi128 newval = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};

    bool r = __sync_bool_compare_and_swap(&ptr.real,
                                          oldval.real,
                                          newval.real);

    if (r == true && ptr.real == newval.real) {
        printf("Test 4 passed!\n");
    } else {
        printf("Test 4 failed\n");
    }
}

int main(int argc, char *argv[])
{
    test1();
    test2();
    test3();
    test4();

    return 0;
}
```

On Linux x86_64 with gcc 7.3, it passes:

```
$ gcc --version
gcc (GCC) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc int128.c -o int128 -latomic -mcx16
$ ./int128
Test 1 passed!
Test 2 passed!
Test 3 passed!
Test 4 passed!
```

On Linux x86_64 with clang 6, it passes:

```
$ clang --version
clang version 6.0.0 (tags/RELEASE_600/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /cm/shared/apps/clang/6.0.0/bin
$ clang int128.c -o int128 -mcx16
$ ./int128
Test 1 passed!
Test 2 passed!
Test 3 passed!
Test 4 passed!
```

FWIW: on Linux ARM64 with gcc 7.3, it fails to link (which seems to be
confirmed by the gcc developers: they only support up to 64-bit operands with
these functions on ARM64):

```
%> gcc --version
gcc (GCC) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

%> /usr/local/gcc73/bin/gcc -D_GNU_SOURCE -O3 -std=c99 -mcpu=thunderx2t99
-march=armv8.1-a+lse -ffast-math -ffp-contract=fast -funroll-loops
-finline-functions -L/usr/local/gcc73/lib64 -latomic -Wl,-rpath
-Wl,/usr/local/gcc73/lib64 ompi2.c -o ompi2-gcc
/tmp/ccAu51lk.o: In function `main':
ompi2.c:(.text.startup+0x194): undefined reference to
`__sync_bool_compare_and_swap_16'
ompi2.c:(.text.startup+0x200): undefined reference to
`__sync_bool_compare_and_swap_16'
collect2: error: ld returned 1 exit status
%>
```

On Linux ARM64 with clang 6, it links successfully, but gives different answers
than on x86_64:

```
%> clang --version
clang version 6.0.1 (<a href="https://github.com/flang-compiler/flang-driver.git">https://github.com/flang-compiler/flang-driver.git</a>
ccd507ca383f2122b2c5510aac0f7393f440398a) (<a href="http://llvm.org/git/llvm.git">http://llvm.org/git/llvm.git</a>
057310b8fc41e5089dbee0d2c90e4233756a83c9)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/cavium/bin
%> /opt/cavium/bin/clang -D_GNU_SOURCE -O3 -std=c99 -mcpu=thunderx2t99
-march=armv8.1-a+lse -ffast-math -ffp-contract=fast -funroll-loops
-finline-functions -fslp-vectorize -fvectorize -Wl,-rpath -Wl,/opt/cavium/lib
ompi2.c -o ompi2-clang
%> ./ompi2-clang
Test 1 failed
Test 2 passed!
Test 3 failed
Test 4 passed!
%>

```

I realize that __int128 and these 2 functions are GCC extensions, and are not
standard in the language.

But it seems like clang is trying to behave the same way as these gcc-defined
extensions.  Specifically: Open MPI's configure script saw both that the
__int128 type existed and that these 2 functions were linkable, so it concluded
that it could use them the same way that we use them on x86_64.  But then we
get different results on x86_64 vs. ARM64.

If possible, it would be great to have clang support these functions on ARM64. 
I recognize that that may be quite difficult, so if that's not possible, it
would be nice if these functions would fail to link on ARM64 so that configure
scripts can easily determine that they are not supported on ARM64.

Thank you!</pre>
        </div>
      </p>


    </body>
</html>