[llvm] [LV] Use original trip-count as the vector-trip-count if use predicated EVL instructions for tail-folding. (PR #132675)

Mon Mar 31 07:20:52 PDT 2025

NexMing wrote:

> > > > I thought the consensus is that we want to keep canonical IV alive through the entire duration of loop vectorizer. And that's exactly why I chose to add a Pass right after LV to replace canonical IV with EVL IV in #131005
> > > 
> > > 
> > > I understand, but why should we keep the canonical IV? I made a similar change based on llvm-epi and haven't encountered any problems yet.
> > 
> > 
> > No bug, but that doesn’t mean it’s reasonable.
> > RVV specification defines: `ceil(AVL / 2) ≤ vl ≤ VLMAX` if `AVL < (2 * VLMAX)`. But currently, all RVV implementations simply use `vl = min(AVL, VLMAX)`. At least, I haven't encountered an RVV implementation that isn't done this way.
> 
> At least QEMU supports it via the `--rvv-vl-half-avl` option. Plus, the spec is the spec: we should generate correct code that is portable across any RISC-V hardware that follows the spec.

When EVL is used to fold the tail, it is reasonable to treat EVL as the step of the canonical induction variable.
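
To illustrate, here is a minimal Python model of an EVL tail-folded loop. It is only a sketch, not LLVM code: `vl_min`, `vl_half_avl`, and `evl_steps` are hypothetical names, and `vl_half_avl` is just one of the choices the quoted RVV rule permits when `VLMAX < AVL < 2 * VLMAX`. Under both policies the EVL-stepped IV lands exactly on the original trip count, whereas an IV stepped by `VLMAX` would overshoot to the next multiple.

```python
import math


def vl_min(avl, vlmax):
    # Policy most hardware is said to use: vl = min(AVL, VLMAX).
    return min(avl, vlmax)


def vl_half_avl(avl, vlmax):
    # Assumed spec-permitted alternative: split the last two iterations evenly.
    if avl <= vlmax:
        return avl
    if avl < 2 * vlmax:
        return math.ceil(avl / 2)  # satisfies ceil(AVL / 2) <= vl <= VLMAX
    return vlmax


def evl_steps(trip_count, vlmax, set_vl):
    # Model of the loop: the IV advances by the EVL computed in each iteration.
    iv = 0
    steps = []
    while iv < trip_count:
        evl = set_vl(trip_count - iv, vlmax)  # AVL = remaining elements
        steps.append(evl)
        iv += evl
    return steps


if __name__ == "__main__":
    for policy in (vl_min, vl_half_avl):
        steps = evl_steps(10, 8, policy)
        # The sum of the EVL steps is the final IV value.
        print(policy.__name__, steps, "final IV =", sum(steps))
```

With a trip count of 10 and `VLMAX = 8`, the two policies take different per-iteration steps, but the final IV equals 10 in both cases, while an IV stepped by 8 would end at 16.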

https://github.com/llvm/llvm-project/pull/132675