<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/143908>143908</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            excessive stack usage with -fsanitize=kernel-address
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          arndb
      </td>
    </tr>
</table>

<pre>
    While going through the remaining cases of excessive stack usage with ASAN and MSAN
that I hit with the Linux kernel, I came across one that was particularly surprising in how
it uses multiple kilobytes of stack with clang but not with gcc.

I have reduced it to a test case that uses 760 bytes of stack with clang, and the usage
grows if similar code is added to the inner loops:

```
enum omap_plane_id {
        OMAP_DSS_GFX, OMAP_DSS_VIDEO1, OMAP_DSS_VIDEO2, OMAP_DSS_VIDEO3, OMAP_DSS_WB,
};
static inline unsigned short DISPC_OVL_BASE(enum omap_plane_id plane)
{
        switch (plane) {
        case OMAP_DSS_GFX:    return 0x0080;
        case OMAP_DSS_VIDEO1: return 0x00BC;
        case OMAP_DSS_VIDEO2: return 0x014C;
        case OMAP_DSS_VIDEO3: return 0x0300;
        case OMAP_DSS_WB:     return 0x0500;
        default:              return 0;
        }
}
struct dispc_device {
        unsigned ctx[0x4000];
};
void dispc_write_reg(unsigned idx, unsigned val);
void dispc_runtime_resume(struct dispc_device *dispc, int has_feature)
{
        for (int i = 0; i < 5; i++) {
                for (int j = 0; j < 8; j++) {
                        unsigned short tmp = DISPC_OVL_BASE(i);
                        dispc_write_reg(0, dispc->ctx[tmp + j]);
                }
                for (int j = 0; j < 8; j++) {
                        unsigned short tmp = DISPC_OVL_BASE(i) + 0x800;
                        dispc_write_reg(0, dispc->ctx[tmp + j]);
                }
        }
}
```

For a reproducer, please see https://godbolt.org/z/1E55chKa3. From what I can
tell, all the loops are unrolled here, and the calculation of each unrolled 'tmp + j' is
inexplicably moved to the beginning of the loop and then spilled to the stack.
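
To put a number on the unrolling: the reduced function performs 5 * (8 + 8) = 80 loads
from dispc->ctx, so a fully unrolled body has 80 separate 'tmp + j' index computations.
The following is only a minimal userspace sketch to confirm that count. It is meant to be
built without the sanitizer, and the counting stub for dispc_write_reg(), the write_count
and last_val globals, and count_resume_writes() are hypothetical helpers for illustration,
not part of the kernel code or the godbolt reproducer.

```c
/* Userspace sketch; build WITHOUT -fsanitize=kernel-address.
 * write_count, last_val, the dispc_write_reg() stub and
 * count_resume_writes() are hypothetical helpers, not kernel code. */
enum omap_plane_id {
        OMAP_DSS_GFX, OMAP_DSS_VIDEO1, OMAP_DSS_VIDEO2, OMAP_DSS_VIDEO3, OMAP_DSS_WB,
};
static inline unsigned short DISPC_OVL_BASE(enum omap_plane_id plane)
{
        switch (plane) {
        case OMAP_DSS_GFX:    return 0x0080;
        case OMAP_DSS_VIDEO1: return 0x00BC;
        case OMAP_DSS_VIDEO2: return 0x014C;
        case OMAP_DSS_VIDEO3: return 0x0300;
        case OMAP_DSS_WB:     return 0x0500;
        default:              return 0;
        }
}
struct dispc_device {
        unsigned ctx[0x4000];
};

static unsigned write_count;    /* number of dispc_write_reg() calls seen */
static unsigned last_val;       /* value passed in the most recent call */

/* Stub: record the call instead of touching hardware. */
void dispc_write_reg(unsigned idx, unsigned val)
{
        (void)idx;
        last_val = val;
        write_count++;
}

/* Same body as the reduced test case. */
void dispc_runtime_resume(struct dispc_device *dispc, int has_feature)
{
        for (int i = 0; i < 5; i++) {
                for (int j = 0; j < 8; j++) {
                        unsigned short tmp = DISPC_OVL_BASE(i);
                        dispc_write_reg(0, dispc->ctx[tmp + j]);
                }
                for (int j = 0; j < 8; j++) {
                        unsigned short tmp = DISPC_OVL_BASE(i) + 0x800;
                        dispc_write_reg(0, dispc->ctx[tmp + j]);
                }
        }
}

/* Run the function once and return how many register writes it issued. */
unsigned count_resume_writes(void)
{
        static struct dispc_device dev; /* 64 KiB, kept off the stack */

        /* Store each index as its own value so last_val reveals the slot read. */
        for (unsigned k = 0; k < 0x4000; k++)
                dev.ctx[k] = k;
        write_count = 0;
        dispc_runtime_resume(&dev, 0);
        return write_count;
}
```

Calling count_resume_writes() should return 80, and the last slot read is
0x0500 + 0x800 + 7 = 0xD07, which is well inside the 0x4000-entry ctx array, so every
access in the test case is in bounds.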
</pre>