[llvm] [DAGCombiner] Fix subvector extraction index for big-endian STLF (PR #180795)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 11 09:34:14 PST 2026
================
@@ -20831,8 +20831,16 @@ SDValue DAGCombiner::ForwardStoreValueToDirectLoad(LoadSDNode *LD) {
if (!TLI.isOperationLegalOrCustom(ISD::EXTRACT_SUBVECTOR, InterVT))
break;
----------------
Michael-Chen-NJU wrote:
> We entered an infinite loop in one of our downstream test cases and needed to add this:
>
> ```
> if (!TLI.isOperationLegalOrCustom(ISD::EXTRACT_SUBVECTOR, InterVT))
> break;
>
> + // Avoid infinite loop: Don't transform loads from fixed stack objects,
> + // as legalization expands extract_subvector to such loads.
> + SDValue LDBase = LD->getBasePtr();
> + if (LDBase.getOpcode() == ISD::ADD)
> + LDBase = LDBase.getOperand(0);
> + if (LDBase.getOpcode() == ISD::FrameIndex)
> + break;
> +
> // In case of big-endian the offset is normalized to zero, denoting
> // the last bit. For big-endian we need to transform the extraction
> // to the last sub-vector.
> unsigned ExtIdx = 0;
> ```
Hi @KennethHilmersson,
Thanks for the feedback! I've been testing the FrameIndex check and found it to be a double-edged sword. It does prevent the infinite loop you mentioned, but it also blocks some highly beneficial STLF (store-to-load forwarding) optimizations on X86 (e.g., in shuffle_chained_v16bf16), where we could previously eliminate stack spills and reloads entirely.
To find a more surgical way to break the loop without sacrificing these optimizations, could you share the specific testcase (or a reduced version) that triggers the infinite loop on your target? Perhaps we can refine the check. Or do you think this performance regression on X86 is an acceptable trade-off for guarding against infinite loops on all targets?
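For reference, the index selection described in the code comment quoted above ("for big-endian we need to transform the extraction to the last sub-vector") can be modeled outside LLVM roughly as follows. This is only a minimal sketch under stated assumptions: `subvectorExtractIndex` is a hypothetical helper, not an LLVM API, it counts in elements rather than bits, and it assumes the load offset has already been normalized to zero and that the stored element count is a multiple of the loaded one. It is not the code in the patch.

```cpp
#include <cassert>

// Hypothetical helper (not part of LLVM): choose the element index at which a
// sub-vector of SubElts elements is extracted from a stored vector of
// StoredElts elements, once the load offset has been normalized to zero.
unsigned subvectorExtractIndex(unsigned StoredElts, unsigned SubElts,
                               bool IsBigEndian) {
  // Assumption: the sub-vector is non-empty and evenly divides the stored
  // vector, so the returned index is a multiple of SubElts.
  assert(SubElts != 0 && StoredElts % SubElts == 0 &&
         "sub-vector must evenly divide the stored vector");
  // Little-endian: offset zero selects the first sub-vector.
  // Big-endian: offset zero denotes the last bits, so select the last
  // sub-vector instead.
  return IsBigEndian ? StoredElts - SubElts : 0;
}
```

Under these assumptions, a v8 load forwarded from a v16 store would extract at index 0 on a little-endian target and at index 8 on a big-endian one.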
https://github.com/llvm/llvm-project/pull/180795