<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/100936>100936</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [RISCV] Missing fold with cascade shifts
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:RISC-V,
            regression,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          dtcxzyw
      </td>
    </tr>
</table>

<pre>
    https://github.com/llvm/llvm-project/pull/89966 introduced a RISC-V codegen regression. Reproducer: https://godbolt.org/z/G3hjh4KTn
```
; bin/llc -mtriple=riscv64 -mattr=+zba,+m test.ll -o -
define i64 @test(i64 %0) {
entry:
  %1 = lshr i64 %0, 18
  %2 = and i64 %1, 4294967295
  %3 = mul i64 %2, 24
  ret i64 %3
}
```

Before:
```
test:                                   # @test
        srli    a0, a0, 18
        slli.uw a0, a0, 3
        sh1add  a0, a0, a0
        ret
```

After:
```
test:                                   # @test
        srli    a0, a0, 15
        srli    a0, a0, 3
        slli.uw a0, a0, 3
        sh1add  a0, a0, a0
        ret
```

The two cascaded `srli` instructions (by 15, then by 3) can be folded into a single `srli a0, a0, 18`, as in the "Before" output.
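
As a sanity check on the fold, the two shifts can be modelled on 64-bit values. This is an illustrative Python sketch, not LLVM code; the helper names (`srli`, `cascaded`, `folded`) are assumptions chosen to mirror the mnemonics.

```python
MASK64 = (1 << 64) - 1

def srli(x: int, shamt: int) -> int:
    """Logical right shift of a 64-bit register value (models RV64 srli)."""
    return (x & MASK64) >> shamt

def cascaded(x: int) -> int:
    # srli a0, a0, 15 ; srli a0, a0, 3  (the "After" output)
    return srli(srli(x, 15), 3)

def folded(x: int) -> int:
    # srli a0, a0, 18  (the "Before" output)
    return srli(x, 18)

# A few hand-picked values, including the all-ones and sign-bit cases.
samples = [0, 1, (1 << 18) - 1, 1 << 63, 0x123456789ABCDEF0, MASK64]
assert all(cascaded(x) == folded(x) for x in samples)
```

The fold is valid because both shifts are logical and the combined amount (18) is below XLEN.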

SDAG before ISel:
```
Optimized legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 12 nodes:
  t0: ch,glue = EntryToken
    t15: i64 = RISCVISD::SHL_ADD t18, Constant:i64<1>, t18
  t10: ch,glue = CopyToReg t0, Register:i64 $x10, t15
      t2: i64,ch = CopyFromReg t0, Register:i64 %0
    t20: i64 = srl t2, Constant:i64<15>
  t18: i64 = and t20, Constant:i64<34359738360> ; 34359738360 = 0xffffffff << 3
  t11: ch = RISCVISD::RET_GLUE t10, Register:i64 $x10, t10:1
```

`t18` is then selected as `srli` + `slli.uw`, which produces the second `srli`: https://github.com/llvm/llvm-project/blob/991192b211212aa366bf73b993ac444839c10bf5/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td#L685-L689
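
The equivalence behind that selection can be checked numerically. The sketch below is a Python model under the assumption that `slli.uw` zero-extends the low 32 bits of its source before shifting left (the Zba definition); the function names are hypothetical and only mirror the DAG and the two assembly listings.

```python
MASK64 = (1 << 64) - 1
SHIFTED_MASK = 0xFFFFFFFF << 3  # the AND constant on t18

def srli(x: int, shamt: int) -> int:
    """Logical right shift of a 64-bit value."""
    return (x & MASK64) >> shamt

def slli_uw(x: int, shamt: int) -> int:
    """Zero-extend the low 32 bits, then shift left (models Zba slli.uw)."""
    return ((x & 0xFFFFFFFF) << shamt) & MASK64

def dag_t18(x: int) -> int:
    # t20 = srl t2, 15 ; t18 = and t20, 34359738360
    return srli(x, 15) & SHIFTED_MASK

def selected_after(x: int) -> int:
    # srli 15 ; srli 3 ; slli.uw 3  (what ISel emits for t18 today)
    return slli_uw(srli(srli(x, 15), 3), 3)

def selected_before(x: int) -> int:
    # srli 18 ; slli.uw 3  (the output before the regression)
    return slli_uw(srli(x, 18), 3)

samples = [0, 1, 1 << 63, 0x123456789ABCDEF0, 0xDEADBEEFCAFEF00D, MASK64]
for x in samples:
    assert dag_t18(x) == selected_after(x) == selected_before(x)
```

All three forms keep bits 18 to 49 of the input and place them at bits 3 to 34 of the result, so only the instruction count differs.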

Two possible fixes:
1. Fold `and (srl X, C), Shifted32OnesMask` into `slli_uw (srli X, C+ShAmt), ShAmt` during selection.
2. Add a peephole cleanup in `RISCVDAGToDAGISel::PostprocessISelDAG` that folds cascaded slli/srli pairs (preferred).
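
To make the peephole idea concrete, here is a toy sketch in Python. It is not the LLVM API: the `(opcode, rd, rs1, shamt)` tuple form and the function `fold_cascaded_srli` are hypothetical, and a real implementation in `PostprocessISelDAG` would work on machine nodes and check use counts instead of register names.

```python
from typing import List, Tuple

# Toy instruction form (an assumption for illustration): (opcode, rd, rs1, shamt)
Inst = Tuple[str, str, str, int]

def fold_cascaded_srli(insts: List[Inst], xlen: int = 64) -> List[Inst]:
    """Merge `srli rd, rs, a ; srli rd, rd, b` into `srli rd, rs, a+b`."""
    out: List[Inst] = []
    for inst in insts:
        op, rd, rs1, imm = inst
        if out:
            pop, prd, prs1, pimm = out[-1]
            # Fold only when the second shift consumes and overwrites the
            # first result (so the intermediate value is dead) and the
            # combined amount is still a legal shift amount.
            if (op == "srli" and pop == "srli" and rs1 == prd
                    and rd == prd and pimm + imm < xlen):
                out[-1] = ("srli", rd, prs1, pimm + imm)
                continue
        out.append(inst)
    return out

# The "After" sequence from this report (sh1add omitted: it has three
# register operands and does not fit the toy tuple form).
after = [("srli", "a0", "a0", 15), ("srli", "a0", "a0", 3),
         ("slli.uw", "a0", "a0", 3)]
folded = fold_cascaded_srli(after)
```

On this input the two `srli` entries collapse into one shift by 18, matching the "Before" output. Pairs whose combined amount reaches XLEN are left alone here for simplicity.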

Any thoughts? @preames @topperc @wangpc-pp 

Related issues:
https://github.com/dtcxzyw/llvm-codegen-benchmark/pull/25
https://github.com/dtcxzyw/llvm-codegen-benchmark/issues/40
</pre>