[PATCH] D148855: [SLP]Improve tryToGatherExtractElements by using per-register analysis.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 2 03:55:34 PDT 2023
ABataev added a comment.
In D148855#4655968 <https://reviews.llvm.org/D148855#4655968>, @mstorsjo wrote:
> This seems to have caused a misoptimization in ffmpeg for aarch64.
>
> To reproduce, you can follow these steps, on aarch64 Linux:
>
> $ git clone https://github.com/ffmpeg/ffmpeg
> $ mkdir ffmpeg-build
> $ cd ffmpeg-build
> $ ../ffmpeg/configure --cc=clang --samples=$(pwd)/../fate-samples
> $ make fate-rsync
> $ make -j$(nproc) fate-vp9-00-quantizer-18
>
> The misoptimized object file is `libavcodec/vp9dsp_8bpp.o`.
>
> The standalone preprocessed input for that object file is available at https://martin.st/temp/vp9dsp_8bpp-preproc.c, you can reproduce the misoptimization with `clang -target aarch64-linux-gnu -c -O3 vp9dsp_8bpp-preproc.c -o vp9dsp_8bpp.o`.
>
> Can you look into this, and possibly revert if fixing takes some time?
Hi, thanks for the report. Generally speaking, this change does the same thing that InstCombiner does with extractelement/insertelement sequences. I'll check what the cause is. I don't know the reason yet, but it is most probably either a TTI cost model problem or a codegen (lowering) problem.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D148855/new/
https://reviews.llvm.org/D148855