[llvm] [SLP] Make getSameOpcode support interchangeable instructions. (PR #127450)

Jordan Rupprecht via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 24 22:40:56 PDT 2025


rupprecht wrote:

> > This commit causes a crash when building highway tests, e.g. https://github.com/google/highway/blob/master/hwy/tests/arithmetic_test.cc
> > Some tests (e.g. [unroller_test.cc](https://github.com/google/highway/blob/master/hwy/tests/unroller_test.cc)) fail with an assertion:
> > ```
> > assert.h assertion failed at llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:18508 in void llvm::slpvectorizer::BoUpSLP::BlockScheduling::cancelScheduling(ArrayRef<Value *>, Value *): Bundle->isSchedulingEntity() && (Bundle->isPartOfBundle() || needToScheduleSingleInstruction(VL)) && "tried to unbundle something which is not a bundle"
> > *** Check failure stack trace: ***
> >     @     0x555d3e1577b0  llvm::slpvectorizer::BoUpSLP::BlockScheduling::cancelScheduling()
> >     @     0x555d3e11d390  llvm::slpvectorizer::BoUpSLP::BlockScheduling::tryScheduleBundle()
> >     @     0x555d3e1107f8  llvm::slpvectorizer::BoUpSLP::buildTree_rec()
> >     @     0x555d3e11259f  llvm::slpvectorizer::BoUpSLP::buildTree_rec()
> >     @     0x555d3e111620  llvm::slpvectorizer::BoUpSLP::buildTree_rec()
> >     @     0x555d3e167823  llvm::SLPVectorizerPass::tryToVectorizeList()
> >     @     0x555d3e168afa  llvm::SLPVectorizerPass::tryToVectorize()
> >     @     0x555d3e16ba81  llvm::SLPVectorizerPass::vectorizeRootInstruction()
> >     @     0x555d3e16146f  llvm::SLPVectorizerPass::vectorizeChainsInBlock()
> >     @     0x555d3e15e54e  llvm::SLPVectorizerPass::runImpl()
> >     @     0x555d3e15dc5c  llvm::SLPVectorizerPass::run()
> > ...
> > 3.	Running pass "function<eager-inv>(float2int,lower-constant-intrinsics,chr,loop(loop-rotate<header-duplication;no-prepare-for-lto>,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>,infer-alignment,loop-load-elim,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,slp-vectorizer,vector-combine,instcombine<max-iterations=1;no-verify-fixpoint>,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>)" on module "hwy/contrib/unroller/unroller_test.cc"
> > ```
> > 
> > Other tests (e.g. [arithmetic_test.cc](https://github.com/google/highway/blob/master/hwy/tests/arithmetic_test.cc)) also crash with an identical stack, but with no assertion.
> 
> Any build command?

Hopefully better: a direct IR repro (this one is reduced from [zstd_opt.c](https://github.com/facebook/zstd/blob/dev/lib/compress/zstd_opt.c)):

```llvm
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.ZSTD_optimal_t = type { i32, i32, i32, i32, [3 x i32] }

define i64 @ZSTD_compressBlock_opt_generic(ptr %ms, i32 %0, i32 %lastStretch.sroa.3.0.copyload) {
entry:
  %1 = load ptr, ptr %ms, align 8
  %sub506 = sub i32 %0, %lastStretch.sroa.3.0.copyload
  %add537 = add i32 %sub506, 1
  %idxprom556 = zext i32 %add537 to i64
  %arrayidx5573 = getelementptr %struct.ZSTD_optimal_t, ptr %1, i64 %idxprom556
  store i32 0, ptr %arrayidx5573, align 4
  %idxprom563.phi.trans.insert = zext i32 %sub506 to i64
  %arrayidx564.phi.trans.insert = getelementptr %struct.ZSTD_optimal_t, ptr %1, i64 %idxprom563.phi.trans.insert
  %.pre = load i8, ptr %arrayidx564.phi.trans.insert, align 1
  br label %while.body562

while.body562:                                    ; preds = %while.body562, %entry
  store i8 %.pre, ptr %ms, align 1
  br label %while.body562
}
```

Reproduces with `opt -S --passes=slp-vectorizer`.

https://github.com/llvm/llvm-project/pull/127450

