[llvm] [RISCV] Use slideup to lower build_vector when its last operand is an extraction (PR #154450)

Tue Aug 26 16:01:59 PDT 2025

mshockwave wrote:

> For the motivating sequence, have you considered using a normal slidedown sequence, and then a single vslideup (not slide1up) from the source register into the last destination?

I think this is a good idea, and to give even more context: my motivating example is a build_vector where _every_ operand comes from a reduction; that is, every operand is an (extract_element X, 0). With this patch, the following sequence
```
// v8[0], v9[0], and v10[0] hold build_vector's first, second, and third operands
vfmv.f.s a1, v9
vfmv.f.s a2, v10
vrgather.vi v8, v8, 0 // splat
vfslide1down v8, v8, a1
vfslide1down v8, v8, a2
```
is turned into
```
vfmv.f.s a0, v8
vfmv.f.s a1, v9
vfslide1up v9, v10, a1
vfslide1up v8, v9, a0
```
On its own, this patch saves only a single instruction (the splat), no matter how many operands there are. But if we also take the follow-up patch #154847 into account, that patch further eliminates all the vector-to-scalar moves:
```
vslideup v9, v10, 1
vslideup v8, v9, 1
```
And the number of eliminated moves is proportional to the number of build_vector operands. In other words, #154847 is the bigger win, and it more or less depends on this patch.
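
To make the element movement concrete, here is a rough Python model of the two slide-up forms (a sketch only: the helper names `vslide1up` / `vslideup` and the list-based registers are illustrative assumptions, not LLVM or intrinsic APIs, and masking and tail policy are ignored). It checks that the slide1up sequence and the move-free slideup sequence build the same vector:

```python
def vslide1up(vs2, scalar, vl):
    """Model: vd[0] = scalar, vd[i] = vs2[i - 1] for 0 < i < vl."""
    return [scalar] + vs2[:vl - 1]


def vslideup(vd, vs2, offset, vl):
    """Model: vd[i] = vs2[i - offset] for offset <= i < vl; vd[0:offset] is kept."""
    return vd[:offset] + vs2[:vl - offset]


VL = 3
# Each register holds one reduction result in element 0; "?" is don't-care.
v8 = ["a", "?", "?"]
v9 = ["b", "?", "?"]
v10 = ["c", "?", "?"]

# Sequence with this patch: two vector-to-scalar moves, two slide1ups.
a0, a1 = v8[0], v9[0]
tmp = vslide1up(v10, a1, VL)
with_moves = vslide1up(tmp, a0, VL)

# Sequence with the follow-up: element 0 of each destination is already
# in place, so sliding the source up by one needs no scalar move.
tmp2 = vslideup(v9, v10, 1, VL)
without_moves = vslideup(v8, tmp2, 1, VL)

print(with_moves, without_moves)
```

In this model each additional operand costs one slide in either sequence, while only the first sequence also needs a scalar move per operand.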

The thing is, I don't think we can implement #154847's move-elimination algorithm with vslidedown, because vslidedown reads past VL, unless we...concatenate v9 after v8 before sliding down, which I don't think would be profitable.
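
A small Python model of that problem, as I understand it (again a sketch: `vslidedown`, `VLMAX = 4`, and the list-based registers are illustrative assumptions). The top destination element is filled from the source register's own elements at or beyond VL, so it can never pick up element 0 of a different register:

```python
VLMAX = 4  # assumed number of elements per register for this model


def vslidedown(vs2, offset, vl):
    """Model: vd[i] = vs2[i + offset] if i + offset < VLMAX else 0, for i < vl."""
    return [vs2[i + offset] if i + offset < VLMAX else 0 for i in range(vl)]


VL = 3
v8 = ["a", "x1", "x2", "x3"]   # element 3 lies past VL
v9 = ["b", "y1", "y2", "y3"]   # the value we would like to shift in

slid = vslidedown(v8, 1, VL)
# slid[VL - 1] was read from v8[3], i.e. past VL in the same register.
print(slid)
```

Getting "b" into the top slot this way would require v9 to sit directly after v8 in one wider source, which is the concatenation mentioned above.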

That being said, I understand your concern about register pressure imposed by vslideup / vslide1up. Let me think about this.

https://github.com/llvm/llvm-project/pull/154450