[llvm] [VPlan] Fix mutating whilst iterating over users in EVL transform (PR #122885)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 15 01:59:44 PST 2025
lukel97 wrote:
> I'm more curious about how you extracted the miscompilation from 525.x264_r
I first started with 502.gcc_r because it exits much more quickly when there's a miscompile. Then I used `llvm/utils/rsp_bisect.py` comparing a non-EVL build and an EVL build:
```bash
#!/bin/bash
./build/bin/clang --target=riscv64-linux-gnu -march=rva22u64_v -O3 -fuse-ld=lld -o 502.gcc_r @502.gcc_r.rsp
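# Hacky: a good build runs long enough to hit the timeout, whereas a
# miscompiled build exits within 15 seconds.
timeout 15 qemu-riscv64 -cpu rv64,v=on,vext_spec=v1.0 ./502.gcc_r 200.c -O3 -finline-limit=50000 -o /tmp/200.c.o
# timeout(1) exits with status 124 when the command timed out.
if [ $? -eq 124 ]
then
  exit 0  # good build
else
  exit 1  # miscompile
fi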
# bisect with ./llvm/utils/rsp_bisect.py --test=502.gcc_r-rsp-bisect.sh --rsp=502.gcc_r.rsp --other-rel-path=../build.rva22u64_v-evl-O3
```
This reduced the miscompile down to the changes in one file, reload1.c.
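For readers unfamiliar with response-file bisection, here is a rough sketch of the general idea, not how `rsp_bisect.py` is actually implemented. It assumes the response file lists one path per line, and the helper name `mix_rsp` is hypothetical. Each bisection step links a mix of objects from the two build trees: some entries are kept as-is and the rest are redirected to the other build directory.

```bash
#!/bin/bash
# Hypothetical helper (not part of LLVM): print a response file in which the
# first KEEP entries are left untouched and every remaining entry is prefixed
# with OTHER, so that it is taken from the other build tree instead.
# Assumes one path per line in the response file.
mix_rsp() {
  local rsp=$1 other=$2 keep=$3
  awk -v other="$other" -v keep="$keep" '
    NR <= keep { print; next }   # entries from the current build
    { print other "/" $0 }       # entries from the other build
  ' "$rsp"
}
```

A bisection driver would repeatedly vary which entries are redirected, relink, and run the test script on the result until a single differing file remains.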
From there I diffed the output assembly and noticed that there were some loops with what looked to be reversed pointer vectors, so I changed `VPlanTransforms::tryAddExplicitVectorLength` to bail if it encountered any VPReverseVectorPointerRecipe.
Removing VPReverseVectorPointerRecipe support fixed 502.gcc_r, so I knew the bug lay somewhere in there.
But the diff was still quite big in 502.gcc_r, so I diffed 525.x264_r with and without VPReverseVectorPointerRecipes, which was much easier to follow. There was only one diff and it was in a loop in the function `quant_trellis_cabac`.
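A minimal sketch of how such an assembly diff can be kept readable, assuming both builds emit `.s` files (for example via `-S` or `-save-temps`). The helper name `normalize_asm` and the choice of which directives to drop are assumptions for illustration, not something taken from this investigation.

```bash
#!/bin/bash
# Hypothetical helper: strip noise from compiler-emitted RISC-V assembly so
# that a diff between two builds shows mostly real code changes.
# Caveat: this naively treats every '#' as the start of a comment, so it
# would also truncate string literals that contain '#'.
normalize_asm() {
  sed -E \
    -e 's/[[:space:]]*#.*$//' \
    -e '/^[[:space:]]*\.(file|ident|loc|cfi_[a-z_]+)([[:space:]]|$)/d' \
    -e '/^[[:space:]]*$/d' \
    "$1"
}
```

With the function defined, something like `diff -u <(normalize_asm without/foo.s) <(normalize_asm with/foo.s)` compares the two outputs, where `without/` and `with/` are placeholder directories for the two builds.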
There I was able to extract the bit of C from the sources that generated that loop, and after staring at the output for a while I noticed the error.
It took me a while to find this, but I hope that helps! It's much easier to debug miscompiles if they can be git-bisected back, but this wasn't the case unfortunately :)
https://github.com/llvm/llvm-project/pull/122885