<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/80289>80289</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
WRONG code: LoopUnroll / SCEVExpander with i128 induction variable.
</td>
</tr>
<tr>
<th>Labels</th>
<td>
loopoptim
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
JonPsson1
</td>
</tr>
</table>
<pre>
This reduced (csmith) test case seems well-defined, and should print '6':
```
char C = 0;
__int128 IW = 0;
int *IPtr1, *IPtr2;
struct S2 { int f3; };
volatile struct S2 g_1100;
int main() {
  for (; C <= 5; C += 1)
    for (; IW <= 5; IW += 1) {
      IPtr1 = IPtr2;
      g_1100; /* volatile read */
    }
  int crc = IW;
  printf("checksum = %d\n", crc);
}
```
```
clang -target s390x-linux-gnu -march=z16 -O3 -mllvm -enable-load-pre=false -o ./a.out -mllvm -unroll-max-count=3; ./a.out
checksum = 7
clang -target s390x-linux-gnu -march=z16 -O3 -mllvm -enable-load-pre=false -o ./a.out -mllvm -unroll-max-count=2; ./a.out
checksum = 6
```
However, when unrolled 3 times (not 2 or 4), the LoopUnroller creates a prologue loop, which is supposed to run the extra (remainder) iterations. Their count is computed in the preheader (LoopUnrollRuntime.cpp:766):
```
for.body5.preheader: ; preds = %for.cond2thread-pre-split
%2 = sub i128 6, %.pr121517
%3 = freeze i128 %2
%4 = add i128 %3, 18446744073709551615
%5 = urem i128 %4, 3
%6 = add i128 %5, 1
%xtraiter = urem i128 %6, 3
%lcmp.mod = icmp ne i128 %xtraiter, 0
br i1 %lcmp.mod, label %for.body5.prol.preheader, label %for.body5.prol.loopexit
```
The constant added in %4 is supposed to be i128 '-1'. Instead it is 18446744073709551615, i.e. UINT64_MAX (i64 -1, zero-extended to i128), which doesn't make sense.
Comparing i128 (left) with i64 (right), after LoopUnroller:
```
for.body5.preheader: for.body5.preheader:
%2 = sub i128 6, %.pr121517 | %2 = sub i64 6, %.pr121517
%3 = freeze i128 %2 | %3 = freeze i64 %2
%4 = add i128 %3, 18446744073709551615 | %4 = add i64 %3, -1
%5 = urem i128 %4, 3 | %5 = urem i64 %4, 3
%6 = add i128 %5, 1 | %6 = add i64 %5, 1
%xtraiter = urem i128 %6, 3 | %xtraiter = urem i64 %6, 3
%lcmp.mod = icmp ne i128 %xtraiter, 0 | %lcmp.mod = icmp ne i64 %xtraiter, 0
br i1 %lcmp.mod, label %for.body5.prol.preh br i1 %lcmp.mod, label %for.body5.prol.preh
```
%4 is later optimized to a sub i128 with a folded constant of 18446744073709551621, which really should be '5'.
@nikic @boxu-zhang @xiangzh1 @preames @uweigand
</pre>