<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/61592>61592</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Question: How does clang assume alignment for extern C arrays?
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          knzivid
      </td>
    </tr>
</table>

<pre>
    I am investigating a segmentation fault in a C++ program. I am fairly certain it is caused by a bug in my own code, but I would like to understand why clang behaves differently from GCC here. I am new to reading assembly, so please correct me if I have made any false assumptions :)

Consider the snippet (https://www.godbolt.org/z/PKroaEKoo)

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

using T = unsigned char;

// Copies the array into a std::string and returns its first character.
template<std::size_t n>
int asString(const T (&t)[n])
{
    return std::string(t, t + n).at(0);
}

extern T data[14717];

int main()
{
    return asString(data);
}
```

Here, `data` comes from an object file generated by a binutils derivative. The alignment of `data` is not explicitly specified, and the object's `.data` section has an alignment (`Al`) of 1:

```
$ readelf --sections --symbols data.o --wide
There are 5 section headers, starting at offset 0x3a70:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .data             PROGBITS        0000000000000000 000040 003981 00  WA  0   0  1
  [ 2] .symtab           SYMTAB          0000000000000000 0039c8 000048 18      3   1  8
  [ 3] .strtab           STRTAB          0000000000000000 003a10 00003d 00      0   0  1
  [ 4] .shstrtab         STRTAB          0000000000000000 003a4d 000021 00      0   0  1
```

In the assembly generated on godbolt, the copy is unrolled into movups instructions. In the internal binary I am investigating, however, I see a mix of movaps and movups where the string constructor is called: the extern C array is loaded into XMM registers with movaps, and the result is stored to the destination with movups.

```
0x00215090      movaps   xmm0, xmmword [rcx + rsi - 0x50]
0x00215095      movaps   xmm1, xmmword [rcx + rsi - 0x40]
0x0021509a      movups   xmmword [rax + rsi - 0x50], xmm0
0x0021509f      movups   xmmword [rax + rsi - 0x40], xmm1
0x002150a4      movaps   xmm0, xmmword [rcx + rsi - 0x30]
0x002150a9      movaps   xmm1, xmmword [rcx + rsi - 0x20]
0x002150ae      movups   xmmword [rax + rsi - 0x30], xmm0
0x002150b3      movups   xmmword [rax + rsi - 0x20], xmm1
0x002150b8      movaps   xmm0, xmmword [rcx + rsi - 0x10]
0x002150bd      movaps   xmm1, xmmword [rcx + rsi]
0x002150c1      movups   xmmword [rax + rsi - 0x10], xmm0
0x002150c6      movups   xmmword [rax + rsi], xmm1
0x002150ca      add      rsi, 0x60
0x002150ce      cmp      rsi, 0x39b0                        ; case.0x553b30.2
0x002150d5      jne      0x215090                           ; likely
```

**Question 1**: How does clang decide between movaps and movups? According to [this SO answer](https://stackoverflow.com/a/61197816), "the x86-64 System V ABI requires that static arrays of 16 bytes or larger be aligned by 16". [movaps](https://www.felixcloutier.com/x86/movaps) with an XMM register likewise requires its memory operand to be 16-byte aligned. Are these the only constraints?
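
For context, here is a minimal sketch of what the C++ language itself guarantees, as far as I understand it: the array type only has the alignment of its element, so any 16-byte assumption would have to come from the ABI or the compiler rather than from the declaration. `alignedData` is a hypothetical definition, not something in my code, shown only to illustrate requesting the stronger alignment explicitly.

```cpp
#include <cstddef>

using T = unsigned char;

// At the language level, an array has the alignment requirement of its element type.
static_assert(alignof(T) == 1, "unsigned char has alignment 1");
static_assert(alignof(T[14717]) == 1, "the array type inherits the element alignment");

// Hypothetical definition (an assumption for illustration only): alignas on the
// variable asks the toolchain to place this particular object on a 16-byte boundary.
alignas(16) T alignedData[14717] = {};
```

Note that `alignas` here applies to the object, not to the type `T[14717]`, so an `extern` declaration in another translation unit would not see it unless it repeats the specifier.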

**Question 2**: Even though the minimal example uses only movups, my internal binary uses a mix of movups and movaps. Do you have any suggestions on how to debug this? The internal binary and the godbolt example are built from essentially the same code.
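
One check that could narrow this down is to look at the actual address of the array at run time. The following is a minimal sketch under that assumption; `misalignment` is a hypothetical helper, not an existing API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns how many bytes p lies past the previous
// multiple of `alignment`. A result of 0 means p is suitably aligned.
// `alignment` must be non-zero.
std::size_t misalignment(const void* p, std::size_t alignment)
{
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) % alignment);
}
```

Calling `misalignment(data, 16)` early in `main` and logging the result would show whether `data` ends up on a 16-byte boundary in the linked binary. Since movaps faults when its memory operand is not 16-byte aligned, a non-zero result would at least be consistent with the crash.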

**Question 3**: In the godbolt example, changing the alias to `using T = char` makes the copy defer to a memcpy call. Does this mean that declaring the extern C array as `unsigned char` produces better code, or is this influenced by something else? I understand that "better" is subjective here, and I want to understand the reasons behind the difference in generated code.
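
To compare the two code paths side by side, one option would be a variant that views the same bytes as `char` before constructing the string. This is only a sketch: `asStringViaChar` is a hypothetical name, and I am assuming, without having verified it, that the difference is related to which `std::string` constructor gets selected.

```cpp
#include <cstddef>
#include <string>

using T = unsigned char;

// Original form: the iterator range is over unsigned char, so each element
// is converted to char while the string is built.
template <std::size_t n>
int asString(const T (&t)[n])
{
    return std::string(t, t + n).at(0);
}

// Hypothetical variant: view the same bytes as char and use the
// (pointer, length) constructor instead of the iterator-range one.
template <std::size_t n>
int asStringViaChar(const T (&t)[n])
{
    return std::string(reinterpret_cast<const char*>(t), n).at(0);
}
```

Both functions return the first byte of the array as an `int`, so their generated code can be compared directly for the same input.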
</pre>