<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/58936">58936</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Clang could use pre-indexed addressing on ARM, as GCC does
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
UmeshKalappa0
</td>
</tr>
</table>
<pre>
Clang emits:
.LBB0_1: @ =>This Inner Loop Header: Depth=1
add r2, r0, r1
subs r1, r1, #4
ldr r3, [r2, #12]
str r3, [r2, #20]
bne .LBB0_1
and GCC emits:
add r3, r0, #1040
str r2, [r0, #12] @ float
.L3:
ldr r2, [r3, #-12] @ float
str r2, [r3, #-4]! @ float
cmp r3, r1
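See: https://godbolt.org/z/qs87jsxda
The Clang code above recomputes the address with a separate add on every iteration, which costs extra cycles in a hot loop with a large iteration count.
We could add a peephole so that Clang uses pre-indexed addressing here, as GCC does.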
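
For readers less familiar with the "!" suffix, here is a rough C model of the two instructions in the GCC loop body. It is only a sketch, not the source behind the godbolt link: the function names are hypothetical, p stands in for r3, the elements are taken to be 4-byte floats (as the "@ float" annotations suggest), and the loop is assumed to branch back while r3 != r1, which the excerpt does not show.

```c
/* Hypothetical model of the GCC loop body; all names are illustrative. */

/* Models one iteration of:
 *     ldr r2, [r3, #-12]
 *     str r2, [r3, #-4]!
 * p plays the role of r3. With 4-byte floats, #-12 is p[-3]
 * and #-4 is p[-1]. */
static float *preindexed_step(float *p)
{
    float v = p[-3]; /* ldr with a plain offset: r3 is left unchanged */
    p -= 1;          /* "!" writeback: r3 = r3 - 4 is applied first   */
    *p = v;          /* the store then goes to the updated r3         */
    return p;        /* the updated r3 is carried to the next pass    */
}

/* Repeats the step until p reaches stop (assumed cmp r3, r1 + branch). */
static void copy_down(float *start, float *stop)
{
    float *p = start;
    do {
        p = preindexed_step(p);
    } while (p != stop);
}
```

The point of the model is that the store both writes memory and updates the base register, so no separate add is needed inside the loop.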
</pre>