[PATCH] D115691: [PowerPC] Update P10 vector insert patterns to utilize the refactored load/store infrastructure.
Nemanja Ivanovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 20 07:23:08 PST 2022
nemanjai added a comment.
A minor nit about the description - you talk about implementing a DAG combine but you implement a custom legalization.
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:10756-10762
+ // A f32 load feeding into a v4f32 insert_vector_elt is handled through a
+ // DAG Combine on P10 in order to allow this specific insert_vector_elt
+ // load patern to utilize the refactored load and store infrastructure.
+ // First, we convert the f32 load into an i32 load. This is done in
+ // order to produce a v4i32 insert_vector_elt. The v4f32 insert_vector_elt
+ // is then bitcasted back into an v4f32.
+ // VT represents the v4f32 insert_vector_elt, and V2 is the f32 load.
----------------
This does a very good job explaining *what* is being done, which I don't think is particularly useful because the code is rather nicely self-documenting. What is missing is *why* this is being done.
We only do this with ISA 3.1 because on previous architectures, it is cheaper to do an `lxsiwzx` + a permute.
Note, this is also useful on Power9 because direct moves are cheaper than the alternative. So please pull this out and guard with `Subtarget.hasP9Vector()` (and of course change the respective tests).
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:10766
+ SDValue ConvertedLD =
+ DAG.getLoad(MVT::i32, dl, LDN->getChain(), LDN->getBasePtr(),
+ LDN->getPointerInfo(), LDN->getAlignment());
----------------
This isn't necessary. Let the DAG handle this for you.
- Bitcast the vector
- Bitcast the scalar
- Produce a `v4i32` insert
- Bitcast and return the result
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:10769
+ SDValue ConvertedInsVecElt =
+ DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, MVT::v4i32, V1, ConvertedLD,
+ Op.getOperand(2));
----------------
It is not a good idea to insert an `i32` into a `v4f32` vector. You should bitcast `V1` as well.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115691/new/
https://reviews.llvm.org/D115691
More information about the llvm-commits
mailing list