[PATCH] D141079: [SelectionDAG] Improve constant folding in the presence of SPLAT_VECTOR
Luke Lau via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 6 09:27:02 PST 2023
luke added inline comments.
================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14018
+ // fold it that way
+ if (N0.getOpcode() == ISD::SPLAT_VECTOR &&
+ DAG.isConstantValueOfAnyType(N0.getOperand(0))) {
----------------
reames wrote:
> luke wrote:
> > reames wrote:
> > > This looks to be assuming fixed-width splat_vectors. The primary use of splat_vector is for scalable vectors.
> > That makes sense, I was wondering what the difference was between a splat_vector and a splatted build_vector.
> > In this case then is it still possible to fold here?
> To my knowledge, we're a bit inconsistent about this. RISCV uses SPLAT_VECTOR only for scalable vectors. Hexagon (and per your other comment, WebAssembly) use it for both fixed and scalable vectors. I'm also unclear on when they use SPLAT_VECTOR vs BUILD_VECTOR.
>
> Longer term, I do think that having one canonical representation for a splat vector makes sense, and that it'll probably be SPLAT_VECTOR. We're just not there yet. In particular, DAGCombine has various weaknesses for SPLAT_VECTOR that need to be worked through.
This is a RISC-V test case I was able to throw together that shows the optimisation opportunity for scalable vectors:
```
define i32 @f(<vscale x 2 x i64> %a) {
  %v = insertelement <vscale x 2 x i64> %a, i64 0, i32 0
  %w = shufflevector <vscale x 2 x i64> %v, <vscale x 2 x i64> undef, <vscale x 2 x i32> zeroinitializer
  %x = bitcast <vscale x 2 x i64> %w to <vscale x 4 x i32>
  %y = extractelement <vscale x 4 x i32> %x, i32 0
  ret i32 %y
}
```
After the first DAG combine it looks like this:
```
Optimized lowered selection DAG: %bb.0 'f:'
SelectionDAG has 9 nodes:
t0: ch,glue = EntryToken
t7: nxv2i64 = splat_vector Constant:i64<0>
t8: nxv4i32 = bitcast t7
t9: i32 = extract_vector_elt t8, Constant:i32<0>
t11: ch,glue = CopyToReg t0, Register:i32 $x10, t9
t12: ch = RISCVISD::RET_FLAG t11, Register:i32 $x10, t11:1
```
If I'm not mistaken, it should be possible to constant fold the constant in `t7` into `t9`, but the lack of constant folding for `splat_vector`s in `bitcast`s prevents this.
I guess this is what I was trying to achieve with WebAssembly, except there it was with fixed-size vectors, so, as you pointed out, just making a splatted `build_vector` doesn't work.
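To make the intended fold concrete, here is a minimal standalone sketch of the arithmetic behind `extract_vector_elt (bitcast (splat_vector C)), Lane`. It is not LLVM code and not what the patch implements: `foldSplatBitcastLane` is a hypothetical name, and the sketch assumes integer elements, little-endian lane ordering, and a destination element width that evenly divides the source width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API). Models reading lane `Lane` of
//   bitcast (splat_vector SplatVal : SrcBits-wide elements)
//     to a vector of DstBits-wide elements,
// assuming little-endian layout. Returns false when the case is not handled.
bool foldSplatBitcastLane(uint64_t SplatVal, unsigned SrcBits,
                          unsigned DstBits, unsigned Lane,
                          uint64_t &Result) {
  assert(SrcBits >= 1 && SrcBits <= 64 && "source element must fit in 64 bits");
  assert(DstBits >= 1 && DstBits <= 64 && "dest element must fit in 64 bits");

  // Only the narrowing case, where each source element splits evenly.
  if (DstBits > SrcBits || SrcBits % DstBits != 0)
    return false;

  // Every source element is identical, so the destination lanes repeat
  // with period Ratio. The vector length (even a scalable one) is irrelevant.
  unsigned Ratio = SrcBits / DstBits;
  unsigned Sub = Lane % Ratio;

  // Little-endian: sub-lane 0 holds the least significant bits.
  uint64_t Mask = DstBits == 64 ? ~0ULL : ((1ULL << DstBits) - 1);
  Result = (SplatVal >> (Sub * DstBits)) & Mask;
  return true;
}
```

For the test case above, `foldSplatBitcastLane(0, 64, 32, 0, R)` succeeds with `R == 0`, which is the constant `t9` could be folded to. The result depends only on the lane index modulo the width ratio, which is why the fold should not need a fixed vector length.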
================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:3033
}
+ case ISD::SPLAT_VECTOR: {
+ SDValue Scl = Op.getOperand(0);
----------------
luke wrote:
> reames wrote:
> > You should be able to separate this into its own patch with test coverage.
> >
> > Note that this code is currently restricted to fixed length splat_vectors - which only hexagon currently uses. You could choose to generalize the routine to scalable vectors if that was helpful.
> WebAssembly now uses fixed length splat_vectors too, to aid in selecting splatted loads (D139871).
> Will take a look at generalising this.
Writing this down here before I forget:
I needed to provide this information in simplifyDemandedVecElts because it is used by `SimplifyDemandedBits`, which is in turn used in `DAGCombiner::visitSTORE`.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141079/new/
https://reviews.llvm.org/D141079