<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/57166>57166</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
DAGCombine carry folding makes code worse in given testcase
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:ARM,
llvm:codegen
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
efriedma-quic
</td>
</tr>
</table>
<pre>
Testcase derived from boringssl: [boringssl-bcm-testcase.txt](https://github.com/llvm/llvm-project/files/9339270/boringssl-bcm-testcase.txt). The code is sort of complicated, but I'm not sure how to simplify it.
I've found that if I revert D64190 (e9aed963ce36) and D57302 (bddb8c359739), the generated code for the given testcase is significantly shorter on both 32-bit x86 and 32-bit ARM. The effect is a bit more obvious on ARM, where the mrs/msr operations to spill the carry bit are more distinct, but it's significant either way.
I think what's happening is that we do some combines that seem to improve the code locally, but end up forcing the scheduler to spill the carry bit, which then makes the code worse overall. Or maybe the scheduler just doesn't realize it should avoid spilling the carry bit? Not quite sure which.
64-bit targets don't have the same issue on the given testcase, but I assume that's because of the integer widths involved, not because 64-bit targets are somehow immune to the issue.
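For illustration only: the sketch below is NOT the boringssl testcase, and the function and type names are made up. It shows the general shape of code where this kind of problem can arise: two independent multi-word additions whose carry chains are interleaved in the source. On a 32-bit target each chain would typically be lowered to add/adc-style operations, and since the carry lives in a single flag, interleaving two chains may force one carry to be saved and restored around the other. Whether this exact snippet triggers the regression has not been checked.

```c
/* Hypothetical example; names are illustrative, not taken from boringssl. */
typedef unsigned int u32;        /* assumed to be 32 bits wide */
typedef unsigned long long u64;  /* assumed to be 64 bits wide */

/* Computes x = a + b and y = c + d over four 32-bit limbs each,
   least significant limb first. The two carry chains (cx and cy)
   are independent of each other but interleaved in program order. */
void add_two_chains(u32 x[4], u32 y[4],
                    const u32 a[4], const u32 b[4],
                    const u32 c[4], const u32 d[4]) {
    u64 cx = 0, cy = 0;
    for (int i = 0; i != 4; ++i) {
        u64 sx = (u64)a[i] + b[i] + cx;  /* chain 1: limb add plus carry-in */
        u64 sy = (u64)c[i] + d[i] + cy;  /* chain 2: limb add plus carry-in */
        x[i] = (u32)sx;
        cx = sx >> 32;                   /* carry-out of chain 1 */
        y[i] = (u32)sy;
        cy = sy >> 32;                   /* carry-out of chain 2 */
    }
}
```

Comparing the 32-bit ARM output for code of this shape with and without the two patches reverted would be one way to look for mrs/msr pairs around the carry-consuming instructions.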
On a related note, we might want to teach the ARM backend to use a different sequence to save/restore the carry bit, if we expect carry spills to show up in practical code. msr/mrs appear to be slower than other ways of accessing the carry bit.
CC @topperc @RKSimon @deadalnix
</pre>