<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/58936>58936</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Clang could use "pre-indexed" addressing on ARM, as GCC does
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          UmeshKalappa0
      </td>
    </tr>
</table>

<pre>
    Clang emits:

.LBB0_1: @ =>This Inner Loop Header: Depth=1
  add r2, r0, r1
  subs r1, r1, #4
  ldr r3, [r2, #12]
  str r3, [r2, #20]
  bne .LBB0_1

and GCC emits:

add     r3, r0, #1040
str     r2, [r0, #12]   @ float
.L3:
ldr     r2, [r3, #-12]  @ float
str     r2, [r3, #-4]!  @ float
cmp     r3, r1

See: https://godbolt.org/z/qs87jsxda

The Clang code above spends an extra add on every iteration, which wastes cycles in a loop with a high trip count.

A peephole optimization could rewrite this to use pre-indexed addressing, as GCC does.



</pre>
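<pre>
For illustration, here is a minimal C sketch of the two addressing styles.
It assumes 4-byte floats and a hypothetical loop of the form
a[i] = a[i - 2], walking downward; the real source is at the godbolt link
and may differ. The function names are made up for this example, and the
assembly in the comments shows only the rough correspondence.

```c
#include "stddef.h"

/* Indexed form: one counter plus fixed offsets, so the address has to be
   recomputed (base + index) on every iteration, as in the Clang loop. */
void shift_indexed(float *a, size_t n) {
    for (size_t i = n; i > 2; --i) {
        a[i - 1] = a[i - 3];
    }
}

/* Pre-indexed form: a single pointer is decremented as part of the store,
   so no separate add is needed inside the loop. */
void shift_preindexed(float *a, size_t n) {
    if (n > 2) {
        float *p = a + n;        /* one past the last element */
        float *stop = a + 2;     /* lowest element that gets written */
        while (p != stop) {
            float v = p[-3];     /* roughly: ldr r2, [r3, #-12]            */
            *--p = v;            /* roughly: str r2, [r3, #-4]! (writeback) */
        }
    }
}
```

Both functions should leave the array in the same state; only the way the
address is formed differs.
</pre>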