[llvm] [RISCV] Move RISCVVMV0Elimination past pre-ra scheduling (PR #132057)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 26 01:51:44 PDT 2025
lukel97 wrote:
Oh, I think I see the underlying cause of the x264 regression: it's #107532. The machine scheduler is now freer to reschedule masked pseudos, which results in a lot of vector spills in x264_pixel_satd_16x16.
It happens under -flto -O3 **without a scheduling model**. Either applying #126608 or using `-mtune=generic-ooo` from #120712 fixes it, as they set MicroOpBufferSize=1 which gets the machine scheduler to account for register pressure.
Specifically, x264_pixel_satd_16x16 is completely inlined, and in it there are a few masked `vslidedown.vi v9, v8, 0x2, v0.t` instructions. Notably, they share the same mask, so previously the scheduler couldn't schedule them past each other.
These must have been acting as a barrier preventing the aggressive rescheduling + spilling:
```asm
2290: 3ed134d7 vslidedown.vi v9, v13, 0x2
2294: 3c8134d7 vslidedown.vi v9, v8, 0x2, v0.t
2298: c900f057 vsetivli zero, 0x1, e32, m1, tu, ma
229c: 5e068457 vmv.v.v v8, v13
22a0: cd027057 vsetivli zero, 0x4, e32, m1, ta, ma
22a4: 029406d7 vadd.vv v13, v9, v8
22a8: 0a848457 vsub.vv v8, v8, v9
22ac: 3a8136d7 vslideup.vi v13, v8, 0x2
22b0: 020a8407 vle8.v v8, (s5)
22b4: 020b0487 vle8.v v9, (s6)
22b8: 0c607057 vsetvli zero, zero, e8, mf4, ta, ma
22bc: ca84a7d7 vwsubu.vv v15, v8, v9
22c0: 0d007057 vsetvli zero, zero, e32, m1, ta, ma
22c4: 4af32457 vzext.vf2 v8, v15
22c8: 96883457 vsll.vi v8, v8, 0x10
22cc: 02850457 vadd.vv v8, v8, v10
22d0: cd817057 vsetivli zero, 0x2, e64, m1, ta, ma
22d4: a28544d7 vsrl.vx v9, v8, a0
22d8: 96854557 vsll.vx v10, v8, a0
22dc: 2aa484d7 vor.vv v9, v10, v9
22e0: c5027057 vsetivli zero, 0x4, e32, m1, ta, mu
22e4: 02940557 vadd.vv v10, v9, v8
22e8: 0a940457 vsub.vv v8, v9, v8
22ec: 3ea134d7 vslidedown.vi v9, v10, 0x2
22f0: 3c8134d7 vslidedown.vi v9, v8, 0x2, v0.t
```
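As a rough illustration of why losing that barrier can hurt, here's a toy sketch of how instruction order alone changes the peak number of live values. This is not LLVM code: the `peak_live` helper, the `(dest, srcs)` tuples and the value names are all made up for illustration, and it only counts live values rather than modelling real register classes or LMUL.

```python
def peak_live(schedule):
    """Return the peak number of simultaneously live values.

    `schedule` is a list of (dest, srcs) tuples in issue order.
    A value is live from its definition until its last use;
    values that are never used stay live to the end (treated as outputs).
    """
    # Record the index of the last instruction that reads each value.
    last_use = {}
    for idx, (_, srcs) in enumerate(schedule):
        for src in srcs:
            last_use[src] = idx

    live, peak = set(), 0
    for idx, (dest, srcs) in enumerate(schedule):
        # The destination is live alongside the sources at this point.
        live.add(dest)
        peak = max(peak, len(live))
        # Sources whose last use is this instruction die afterwards.
        for src in srcs:
            if last_use[src] == idx:
                live.discard(src)
    return peak


N = 4  # hypothetical number of load/load/sub groups

# Each group loads two values and immediately combines them.
interleaved = []
for i in range(N):
    interleaved += [
        (f"a{i}", []),
        (f"b{i}", []),
        (f"d{i}", [f"a{i}", f"b{i}"]),
    ]

# Same instructions, but with every load hoisted above every sub,
# as a scheduler that ignores register pressure might do.
hoisted = [ins for ins in interleaved if not ins[1]] + \
          [ins for ins in interleaved if ins[1]]

print(peak_live(interleaved))  # 6
print(peak_live(hoisted))      # 9
```

Both orderings respect the data dependencies, but the hoisted one keeps all the loaded values live at once. With only 32 vector registers, that kind of growth is what would turn into spills.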
I think before this can land we need to either enable MicroOpBufferSize=1 by relanding #126608 (it might need to land in tandem with this patch?), or choose a scheduling model by default (we might need to add a generic in-order model).
https://github.com/llvm/llvm-project/pull/132057