<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/101064">101064</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
do and while loops generate different assembly and are affected differently by [[likely]]/__builtin_expect()
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
pkasting
</td>
</tr>
</table>
<pre>
AFAIK, the following are semantically equivalent in C++ (as long as BODY contains no `break` or `continue`), even taking side effects, aliasing, etc. into account:
```
do {
BODY;
} while (COND);
```
vs.
```
BODY;
while (COND) {
BODY;
}
```
I don't know what the optimal x86 assembly for these would be, but I would expect both forms to compile to the same thing, at least when BODY and COND are simple enough to avoid confusing optimization heuristics. However, at least on x86 at -O3, clang generates very different code for the two.
Complicating matters, the first form is unresponsive to hints that the condition is unlikely to be true (and thus that the body will usually execute only once), while the second form does respond:
```
do {
BODY;
} while (__builtin_expect(COND, 0)); // Generates same code as without hinting

BODY;
while (__builtin_expect(COND, 0)) { // Changes codegen
BODY;
}
```
Note that when using the C++20+ `[[(un)likely]]` attributes, the behavior is similar:
```
do {
[[unlikely]]; // Unintuitive, but apparently correct, way to annotate this loop
BODY;
} while (COND); // Generates same code as without hinting

BODY;
while (COND) [[unlikely]] { // Changes codegen
BODY;
}
```
It seems like there might be two distinct bugs here, but I'm not sure, so filing as one for now:
1. `do` and `while` loops should probably agree on what assembly to generate
2. It should be possible, at least with `__builtin_expect()` if not also with `[[likely]]`, to affect the codegen of a `do` loop
For a test case, you can look at the generated assembly in https://godbolt.org/z/axhK7rP1c; `e1()`, `e2()`, and `e3()` are `do` loops that all output the same assembly regardless of hinting, while `f1()`, `f2()`, and `f3()` are `while` loops that output different assembly from the `do` loops and do respond to hints.
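For readers who cannot open the link, here is a minimal sketch of the kind of comparison described above. It is not the code behind the godbolt link: the function names, and the choice of BODY (`sum += *p++`) and COND (`*p != 0`), are hypothetical stand-ins. All four functions should return the same value for the same input, so any difference is only in the assembly clang emits.

```cpp
// Hypothetical stand-ins: BODY is "sum += *p++", COND is "*p != 0".
// The input is assumed to be an int array with a 0 terminator after the first element.

// do form, no hint.
int do_form(const int* p) {
  int sum = 0;
  do {
    sum += *p++;             // BODY
  } while (*p != 0);         // COND
  return sum;
}

// do form, hinting that COND is unlikely to be true.
int do_form_hinted(const int* p) {
  int sum = 0;
  do {
    sum += *p++;
  } while (__builtin_expect(*p != 0, 0));
  return sum;
}

// while form, no hint: BODY once up front, then loop while COND holds.
int while_form(const int* p) {
  int sum = 0;
  sum += *p++;
  while (*p != 0) {
    sum += *p++;
  }
  return sum;
}

// while form, hinting that COND is unlikely to be true.
int while_form_hinted(const int* p) {
  int sum = 0;
  sum += *p++;
  while (__builtin_expect(*p != 0, 0)) {
    sum += *p++;
  }
  return sum;
}
```

Compiling this at -O3 and diffing the assembly of the `do_form*` pair against the `while_form*` pair is one way to check both points listed above.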
</pre>