dtcxzyw wrote:

> as introduced by the loop vectorizer.

I guess it is intended to keep the pipelines full? Imagine a CPU that has multiple ports/pipelines executing the same kind of instruction (load/fadd/fmul).

https://github.com/llvm/llvm-project/pull/143878
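
To illustrate the point about multiple ports, here is a minimal hand-written sketch, assuming the quoted remark is about a reduction split into several independent partial results. The function names and the factor of 4 are hypothetical and chosen only for illustration; this is not the vectorizer's actual output. With a single accumulator every add waits on the previous one, while independent accumulators give the CPU several dependency chains it can issue to different ports at once.

```cpp
#include <cstddef>
#include <vector>

// One accumulator: each add depends on the result of the previous add,
// so the loop forms a single serial dependency chain.
float sum_single(const std::vector<float>& v) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < v.size(); ++i)
        acc += v[i];
    return acc;
}

// Four independent accumulators (hypothetical interleave factor of 4):
// the four adds in one iteration do not depend on each other, so a CPU
// with several fadd ports can execute them in parallel.
float sum_interleaved(const std::vector<float>& v) {
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    // Combine the partial sums, then handle the leftover elements.
    float acc = (a0 + a1) + (a2 + a3);
    for (; i < n; ++i)
        acc += v[i];
    return acc;
}
```

Note that the second form adds the elements in a different order, so for floating-point data the two results can differ by rounding; a compiler may only make this transformation itself when reassociation is permitted.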