<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/64736>64736</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Negative effects of using shifts for arithmetic optimization
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
ZY546
</td>
</tr>
</table>
<pre>
Hello, I noticed that rewriting an addition as a shift can inhibit the optimization of other expressions.
Let's look at a simple example:
Example 1: https://godbolt.org/z/Wfno6jKEv
```c
int var_13; int var_14;
void test(int var_4, int var_5) {
    var_13 = var_4 + var_4 + var_5;
    var_14 = var_4 + var_5;
}
```
Clang16 -O3:
```
define dso_local void @test(int, int)(i32 noundef %var_4, i32 noundef %var_5) local_unnamed_addr {
entry:
%add = shl nsw i32 %var_4, 1
%add1 = add nsw i32 %add, %var_5
store i32 %add1, ptr @var_13, align 4
%add2 = add nsw i32 %var_5, %var_4
store i32 %add2, ptr @var_14, align 4
ret void
}
```
In this simple example, if `var_4 + var_5` were treated as an available expression, the whole computation would need only two additions, instead of the two additions and one shift shown in the result.
Here is a similar but slightly different example: in this case, `var_4 + var_4` is not supposed to be evaluated first, but the expression is rewritten by ReassociatePass.
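For reference, the two-addition form described above can be written by hand. This is only a sketch of the desired result, not what any pass currently produces; the function name `test_reuse` and the temporary `sum` are made up for illustration. It gives the same values as Example 1 for inputs where the original expression does not overflow.

```c
int var_13; int var_14;

/* Hypothetical hand-written version of Example 1: var_4 + var_5 is
   computed once and reused, so only two additions are needed. */
void test_reuse(int var_4, int var_5) {
    int sum = var_4 + var_5;  /* the shared (available) expression */
    var_14 = sum;
    var_13 = sum + var_4;     /* same value as var_4 + var_4 + var_5 */
}
```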
Example 2: https://godbolt.org/z/8YTevzqbT
**Let's look at an example where the negative impact is most obvious:**
Example 3: https://godbolt.org/z/crxTxPd6r
```c
extern int var_19;
extern int var_20;
extern int var_23;
void test(int var_1, int var_4, int var_6, int var_10, int var_12) {
    var_19 = var_1 + var_4 + var_10 + var_12;
    var_20 = var_10 + var_12;
    var_23 = var_4 + var_10 + var_6 + var_4;
}
```
Clang16 -O3:
```
define dso_local void @test(int, int, int, int, int)(i32 noundef %var_1, i32 noundef %var_4, i32 noundef %var_6, i32 noundef %var_10, i32 noundef %var_12) local_unnamed_addr {
entry:
%add = add i32 %var_10, %var_4
%add1 = add i32 %add, %var_1
%add2 = add i32 %add1, %var_12
store i32 %add2, ptr @var_19, align 4
%add3 = add nsw i32 %var_12, %var_10
store i32 %add3, ptr @var_20, align 4
%factor = shl i32 %var_4, 1
%add5 = add i32 %factor, %var_6
%add6 = add i32 %add5, %var_10
store i32 %add6, ptr @var_23, align 4
ret void
}
```
In this example, both `var_4 + var_10` and `var_10 + var_12` are available expressions. But because they share the operand `var_10`, they cannot both be reused at the same time. From the optimized IR, it seems that the compiler wants to treat `var_4 + var_10` as an available expression and forgo the reuse of `var_10 + var_12`. However, `var_4 + var_10` is not reused in `var_23` either, because `var_4 + var_4` has been turned into a shift. As a result, neither `var_4 + var_10` nor `var_10 + var_12` is reused, which does not look like a good result.
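To make the comparison concrete, here is a hand-written sketch of Example 3 that reuses `var_4 + var_10`. The function name `test_shared` and the temporary `a` are made up for illustration, and the globals are defined here instead of declared `extern` so the snippet is self-contained. It assumes the intermediate sums do not overflow. It uses six additions and no shift, compared with six additions plus one shift in the IR above.

```c
int var_19;
int var_20;
int var_23;

/* Hypothetical hand-written version of Example 3 that computes
   var_4 + var_10 once and reuses it in var_19 and var_23. */
void test_shared(int var_1, int var_4, int var_6, int var_10, int var_12) {
    int a = var_4 + var_10;       /* shared expression, 1 addition */
    var_19 = a + var_1 + var_12;  /* 2 additions */
    var_20 = var_10 + var_12;     /* 1 addition, not shared */
    var_23 = a + var_6 + var_4;   /* 2 additions, no shift */
}
```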
</pre>