<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/59075>59075</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            CPU2000/172.mgrid performance regression after D137913
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          vzakhari
      </td>
    </tr>
</table>

<pre>
    CPU2000/172.mgrid slowed down by 10% on Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz after https://reviews.llvm.org/D137913

This is a Fortran benchmark compiled with Flang.  I am using a custom driver because flang-new is not working properly right now, so I am not sure how to share a reproducer.  I will try to give as much detail as I can.

The major difference is in the `resid_` routine.  The VTune profile shows 16.225 seconds before the change and 19.035 seconds after the change for this routine.

Two loops in the attached disassembly files show a major time difference.

[before.dis.gz](https://github.com/llvm/llvm-project/files/10045509/before.dis.gz)
[after.dis.gz](https://github.com/llvm/llvm-project/files/10045508/after.dis.gz)

Loop1 (14.155s after vs 12.715s before):
before.dis: 0x4be0 - 0x4d6d
after.dis: 0x4f90 - 0x5115

Loop2 (3.715s after vs 2.55s before):
before.dis: 0x4ec0 - 0x506f
after.dis: 0x5240 - 0x541b

Loop2, in particular, does not look better to me after the change.  The number of retired instructions increased from 23960M to 27600M.
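
For scale, the relative changes implied by the `resid_` timings and the retired-instruction counts can be worked out with a short Python sketch.  The helper name `relative_increase` is only illustrative and is not part of VTune or any tool used here; the inputs are the figures quoted in this report.

```python
# Minimal sketch: relative increase between a "before" and an "after" measurement.
# The helper name is hypothetical; the numbers are copied from this report.
def relative_increase(before, after):
    """Return the increase from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0

# VTune time attributed to resid_, in seconds (before and after D137913).
resid_time_pct = relative_increase(16.225, 19.035)

# Retired instructions, in millions (before and after D137913).
retired_instr_pct = relative_increase(23960.0, 27600.0)

print(f"resid_ time: +{resid_time_pct:.1f}%")
print(f"retired instructions: +{retired_instr_pct:.1f}%")
```

The percentage is taken relative to the "before" value, so it reads directly as a slowdown over the baseline.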

@LebedevRI, can you please take a look and check whether there is any obvious issue?  I will prepare LLVM IR for the related functions as a reproducer.
</pre>