[PATCH] D148855: [SLP]Improve tryToGatherExtractElements by using per-register analysis.

Martin Storsjö via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 2 03:07:57 PDT 2023


mstorsjo added a comment.

This seems to have caused a misoptimization in ffmpeg for aarch64.

To reproduce, follow these steps on aarch64 Linux:

  $ git clone https://github.com/ffmpeg/ffmpeg
  $ mkdir ffmpeg-build
  $ cd ffmpeg-build
  $ ../ffmpeg/configure --cc=clang --samples=$(pwd)/../fate-samples
  $ make fate-rsync
  $ make -j$(nproc) fate-vp9-00-quantizer-18

The misoptimized object file is `libavcodec/vp9dsp_8bpp.o`.

The standalone preprocessed input for that object file is available at https://martin.st/temp/vp9dsp_8bpp-preproc.c. You can reproduce the misoptimization with `clang -target aarch64-linux-gnu -c -O3 vp9dsp_8bpp-preproc.c -o vp9dsp_8bpp.o`.

Can you look into this, and possibly revert if fixing takes some time?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148855/new/

https://reviews.llvm.org/D148855


