[llvm] [X86] combineEXTRACT_SUBVECTOR - fold extract_subvector(subv_broadcast_load(ptr),0) -> load(ptr) (PR #126523)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 11 08:32:46 PST 2025
================
@@ -58477,10 +58477,26 @@ static SDValue combineEXTRACT_SUBVECTOR(SDNode *N, SelectionDAG &DAG,
       DAG.isSplatValue(InVec, /*AllowUndefs*/ false)))
     return extractSubVector(InVec, 0, DAG, DL, SizeInBits);
-  // If we're extracting a broadcasted subvector, just use the lowest subvector.
-  if (IdxVal != 0 && InVec.getOpcode() == X86ISD::SUBV_BROADCAST_LOAD &&
-      cast<MemIntrinsicSDNode>(InVec)->getMemoryVT() == VT)
-    return extractSubVector(InVec, 0, DAG, DL, SizeInBits);
+  // Check if we're extracting a whole broadcasted subvector.
+  if (InVec.getOpcode() == X86ISD::SUBV_BROADCAST_LOAD) {
+    auto *MemIntr = cast<MemIntrinsicSDNode>(InVec);
+    EVT MemVT = MemIntr->getMemoryVT();
+    if (MemVT == VT) {
+      // Just use the lowest subvector.
+      if (IdxVal != 0)
+        return extractSubVector(InVec, 0, DAG, DL, SizeInBits);
+      // If this is the only use, we can replace with a regular load (this may
+      // have been missed by SimplifyDemandedVectorElts due to extra uses of the
+      // memory chain).
----------------
RKSimon wrote:
The actual vector load size is the same as for the broadcastf128, and we already do this in other places, so it should be OK?
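
For readers following along, the equivalence behind the fold in the subject line can be sketched with a toy model in plain C++. This is not LLVM code: it assumes a 128-bit subvector of four i32 elements broadcast into a 256-bit vector, and the names `load128`, `subvBroadcastLoad` and `extractSubvector` are hypothetical stand-ins for the DAG nodes, chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy model only (assumption): 128-bit subvector = 4 x i32, 256-bit vector = 8 x i32.
using V4 = std::array<std::int32_t, 4>;
using V8 = std::array<std::int32_t, 8>;

// Hypothetical stand-in for a regular 128-bit vector load.
V4 load128(const std::int32_t *ptr) {
  V4 v;
  std::memcpy(v.data(), ptr, sizeof(v));
  return v;
}

// Hypothetical stand-in for a subvector broadcast load: it reads the same
// 128 bits from memory once and repeats them in every 128-bit lane.
V8 subvBroadcastLoad(const std::int32_t *ptr) {
  const V4 lane = load128(ptr);
  V8 v;
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = lane[i % lane.size()];
  return v;
}

// Hypothetical stand-in for extract_subvector(vec, idx), with idx counted in elements.
V4 extractSubvector(const V8 &vec, std::size_t idx) {
  V4 v;
  assert(idx + v.size() <= vec.size() && "subvector out of range");
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = vec[idx + i];
  return v;
}
```

In this model, extracting at index 0 yields exactly what `load128` returns, and extracting the upper lane yields the same value as the lowest one, which mirrors the two cases handled in the hunk above. The model deliberately ignores the use-count and memory-chain conditions that the real combine has to check.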
https://github.com/llvm/llvm-project/pull/126523