<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=http://email.email.llvm.org/c/eJzNV8ly4zYQ_RrqgpKKi9aDDpY9rpnKTHKYueTkAommCBsEWAAoL1-fblCiJVryOJlUJSwtFBto9PK6-yk34nn91ZgHxj3zlXTMybpRwAojIEqvmfSs4KpoFffgcAUw3dY5WMat5c8MFNSgvWM5-EcAzaJ5DPhmXAu6z_EdZVdRfBPFh098El5F0-yfpJnUhWoFsCi7LpwXAsoo-3S8y3nbFp7dGsOixaZ7xvAqKm7ZSzTbZNHsJsr2kmhxdL_fL8iO7KrxVsiyvPPo6Qu6uCyMdnvF6VVOPp88wSWr0yPPqLJo-A0DNmZ5fy4tjdJbfMkSb5a05prFpO7uLm-l8lLftdoCRx9yRaag7GS7Bd9azeyxX2wQxWMnrxXXW7YFDTakq5S6yyQrje3Si6e0TuIq3Dr-IyUFeCg6zZkzqvXSaJZzB4LhTW0EJt4yqXdgHVxKJHf1PshdRClyIYYYPlJ9-htdDM6duRAILJrGH1TThykkpc3Dt3WS1mJiTuW12fHcoZw_kXw8T6bLVTKdr5I0m8-SRRrPTjfIulVBYbeB9J7IMTfvZOKzeQSMWaigkn1hrS5MTZUSSmh_j0FWmCFa9CVKF4LBUwMFLcFixMITBpxGgSeU6y2EvRjskNIJ27T9qpBArLhiAADOMH1bqlYUurZpDGZWPTMFzjHTeFlz1Zf6OQRoo8cHFGBAvGyULLiXOziAghA1wAlWiFSol6MiYC2qCuq79iHkTjo6w4JDhcxVplWCVRxVvoA1-LzmUgsK3ur_jDhEFHsPID0i9_KLiCw-iMg9IMOGdy0RA7mrbC__Gxj-LmtJWS0QDKQXHTqGDMEM09Y0oN1rfwlgyv7TVr8zUiCCET_Lt636uMUeLS-N-ZdGgTszCl43UJzO2sUkbsNRuaGba9fdpJvwGhxKV-eeHA4MCkYflXOZ_VFB3x5El0bpQubeJq0vsi44_7jGDkX2MTUnbjatqwJQk-kFQf50QTDE-s8GBLZLfywflvM9kiK6Jl83m_gu-9Va_knzuFjrfS0Pi7yom16AGtML5yXT0IyT4Wazw_kS4nyugzyZ7lTIg3I4RL0LRvpO7g9XwECGpfrpB7WJLxoxyJB4NuwzcGr2qOIGGl_houSC7aLLzDDlyE9DsLqaUEZv3-CIC9Gp6Ox_431zCM7ZA-71cerTY9ezvmzeHOUaUrYcwNM0vSHnBUOkv9-jfzdeFh0xcLwObTrMdihLWUjiG3u6bspXrn4g9DiFJ8fKrpAkoCYbegInltJwKx3O6kfpK2KMHV_suAanBwQc_JowbCw0-AFq5DoaCjSC2-fQ64zFG2RBYT7QPwxTA3vAEU9GeSuLh47xEAei3fyRowmmpIMQd4E_1dK1xAmc552RyDU8uUcqwtFSd3MDnUI6tDpxbGtEbpBrINt6IKBV3jeh4QV6vpdOjEXg3L7Q-xv8mXz7rVqMxDoTq2zFR7z1lbHrLbzwMYao5nrUWrUeaMIwtfkEA4c_lNodvsaNNffI7Oi_gHMt8bXbWZYmi1G1TtPFKhF5miNw5rMSZllRprNiCcVMzJfTfKR4Dsqtce7h2Bv9-pFyncZpGifJLInTRbqcJNkiWSVlli3i2SLOObZqomBqQnooKiO7DirzdutQqKTz7lXInZNbDXCw0EuvYP39DFmg4dcYSpylduwtD-kinvm0nI_n01EwdR3s_AtlYhd7>53217</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Suboptimal code gen for pointer subtraction on x86-64
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
geza-herman
</td>
</tr>
</table>
<pre>
Look at this simple code, which calculates the number of array elements between `b` and `e`:
```cpp
#include <cstddef>
struct Foo {
char z[3];
};
std::ptrdiff_t size(const Foo *b, const Foo *e) {
std::ptrdiff_t r = e - b;
//if (r < 0) __builtin_unreachable();
return r;
}
```
With `-O2`, Clang generates good code for this, a solution based on the modular inverse:
```asm
size(Foo const*, Foo const*): # @size(Foo const*, Foo const*)
sub rsi, rdi
movabs rax, -6148914691236517205
imul rax, rsi
ret
```
However, if I uncomment the commented line, I'd expect the asm to stay the same. Instead, Clang generates larger and presumably less optimal code, a solution based on a non-modular multiplicative inverse (the modular inverse could still be used, since the division should have zero remainder):
```asm
size(Foo const*, Foo const*): # @size(Foo const*, Foo const*)
mov rax, rsi
sub rax, rdi
movabs rcx, -6148914691236517205
mul rcx
mov rax, rdx
shr rax
ret
```
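For reference, here is a minimal sketch of the two strategies in C++, assuming `sizeof(Foo) == 3` as above. The names `kInv3`, `div3_exact` and `div3_mulhi` are made up for illustration, and `unsigned __int128` is a GCC/Clang extension. This is only my reading of the two asm listings, not what the compiler does internally:
```cpp
#include <cstdint>

// Modular inverse of 3 modulo 2^64: 3 * kInv3 wraps around to 1.
// Read as a signed 64-bit value, this is -6148914691236517205 (the asm constant).
constexpr std::uint64_t kInv3 = 0xAAAAAAAAAAAAAAABull;

// Exact division (first listing): a single wrapping multiply.
// Only correct when byte_diff is a multiple of 3; negative values work too.
std::int64_t div3_exact(std::int64_t byte_diff) {
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(byte_diff) * kInv3);
}

// General unsigned division (second listing): take the high 64 bits of the
// 128-bit product, then shift right by one, i.e. (n * kInv3) >> 65.
std::uint64_t div3_mulhi(std::uint64_t n) {
    unsigned __int128 wide = static_cast<unsigned __int128>(n) * kInv3;
    return static_cast<std::uint64_t>(wide >> 64) >> 1;
}
```
For a non-negative multiple of 3 both functions return the same value (for example `div3_exact(9)` and `div3_mulhi(9)` are both 3), which is why I'd expect the single `imul` form to remain usable here.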
A similar case: suboptimal code is also generated for this code:
```cpp
#include <cstddef>
struct Foo {
char z[3];
};
void bar(std::ptrdiff_t);
void foo(const Foo *b, const Foo *e) {
std::ptrdiff_t s = e - b;
for (std::ptrdiff_t i=0; i<s; i++) {
bar(i);
}
}
```
The generated code is this:
```asm
foo(Foo const*, Foo const*): # @foo(Foo const*, Foo const*)
push r14
push rbx
push rax
sub rsi, rdi
test rsi, rsi
jle .LBB0_3
movabs rcx, -6148914691236517205
mov rax, rsi
mul rcx
shr rdx
cmp rdx, 2
mov r14d, 1
cmovge r14, rdx
xor ebx, ebx
.LBB0_2: # =>This Inner Loop Header: Depth=1
mov rdi, rbx
call bar(long)
add rbx, 1
cmp r14, rbx
jne .LBB0_2
.LBB0_3:
add rsp, 8
pop rbx
pop r14
ret
```
Notice the same, less efficient calculation of the number of elements.
There is also a comparison with `2` and a `cmov`. These seem unnecessary (sorry if this is some kind of trick that I'm unaware of, or if I misunderstand the intent of these instructions).
godbolt link: https://godbolt.org/z/zMeY1MKh7
</pre>