<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/112941>112941</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AMDGPU] Numerical error related to load/store vectorization
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:AMDGPU,
miscompilation
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
qedawkins
</td>
</tr>
</table>
<pre>
The input example at https://gist.github.com/qedawkins/c620832f96a5c504295f9694cc8956e2
is approximately compiled from
```
// int8_t a[144]; // [2 x 8 x 3 x 3]
// int8_t b[144]; // [2 x 8 x 3 x 3]
// int c[2];
__global__ void helloworld(int8_t* a, int8_t* b, int* c)
{
  int id = hipThreadIdx_x;
  if (id < 2) {
    for (int ch = 0; ch < 8; ++ch) {
      for (int h = 0; h < 3; ++h) {
        for (int w = 0; w < 3; ++w) {
          int x = a[id * 72 + ch * 9 + h * 3 + w];
          int y = b[id * 72 + ch * 9 + h * 3 + w];
          c[id] += x * y;
        }
      }
    }
  }
}
```
When compiled with
```
llc -O3 module_test_dispatch_0_rocm_hsaco_fb.optimized.ll -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100
```
it produces incorrect numerics. For the inputs
```
a = np.ones((2, 8, 3, 3), dtype=np.int8)
b = np.broadcast_to(np.array([1, 2, 1], dtype=np.int8), (2, 8, 3, 3))
```
In other words, only the values of `b` vary, and only along the innermost loop. This gives `c = [88, 88]`, whereas the correct result is `c = [96, 96]`, since each entry is `(1 + 2 + 1) * 3 * 8 = 96`.
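For reference, the expected values can be reproduced on the CPU with a minimal sketch like the one below. It is not the code under test: the function name `reference` is hypothetical, the inputs are written as flat row-major lists instead of NumPy arrays, and it assumes `c` starts out zeroed.

```python
# CPU reference for the kernel above (a sketch, assuming c is zero-initialized).
def reference(a, b):
    """a, b: flat lists of 144 int8 values laid out as [2 x 8 x 3 x 3]."""
    out = [0, 0]
    for i in range(2):              # stands in for the two active threads
        for ch in range(8):
            for h in range(3):
                for w in range(3):
                    idx = i * 72 + ch * 9 + h * 3 + w
                    out[i] += a[idx] * b[idx]
    return out

a = [1] * 144        # same values as np.ones((2, 8, 3, 3))
b = [1, 2, 1] * 48   # [1, 2, 1] repeated along the innermost axis, row-major
expected = reference(a, b)
print(expected)      # [96, 96]
```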
Compiling with
```
llc -O3 module_test_dispatch_0_rocm_hsaco_fb.optimized.ll -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -amdgpu-load-store-vectorizer=false
```
(turning off the load/store vectorizer) or just with `-O1` instead of `-O3` gives the correct values. I am not sure exactly how the generated code is being miscompiled, and I don't know whether load/store vectorization is the root cause or whether disabling it is just masking the problem. Further information about the downstream issue is here: https://github.com/iree-org/iree/issues/18798
cc @MaheshRavishankar
</pre>