[PATCH] D93967: [SLP]Need shrink the load vector after reordering.
Martin Storsjö via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 8 00:35:04 PST 2021
mstorsjo added a comment.
This caused misoptimizations for armv7, where code that previously worked correctly now produces different results. (The code is clean under ubsan, so it shouldn't be relying on anything undefined.)
The issue appears with https://martin.st/temp/interplayvideo-preproc.c, compiled with `clang -target armv7-linux-gnueabihf -O2`.
The diff in generated code, before/after, looks like this:
vmov.32 d16[0], lr
vmov.32 d16[1], r2
.LBB27_3: @ %if.end
@ in Loop: Header=BB27_4 Depth=1
vmov.32 r2, d16[1]
add r3, r3, #1
vmov.32 r5, d16[0]
cmp r3, #8
+ vdup.32 d16, d16[0]
vmov.16 d17[1], r2
vmov.16 d18[0], r5
vdup.16 d21, d17[1]
vdup.16 d20, d18[0]
vst1.16 {d20, d21}, [r1], r12
beq .LBB27_9
.LBB27_4: @ %for.body
@ =>This Inner Loop Header: Depth=1
tst r3, #3
bne .LBB27_3
If it loops back to `.LBB27_3`, the vector element `d16[1]` no longer has the value it was expected to have, because the added `vdup.32 d16, d16[0]` has overwritten it with the value of `d16[0]`.
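To make the lane clobbering concrete, here is a minimal Python sketch that models `d16` as a two-element list and runs the `.LBB27_3` block several times in a row without reloading it. The function names (`run_block`, `simulate`) and the lane values are hypothetical, chosen only for illustration. The model ignores the 16-bit moves and the store, and assumes nothing else writes `d16` between iterations, which the excerpt above does not show either way.

```python
def run_block(d16, with_vdup):
    """Model one pass through .LBB27_3 for a two-lane d16 register."""
    r2 = d16[1]          # vmov.32 r2, d16[1]
    r5 = d16[0]          # vmov.32 r5, d16[0]
    if with_vdup:
        d16[1] = d16[0]  # vdup.32 d16, d16[0] copies lane 0 into lane 1
    return r5, r2


def simulate(with_vdup, iterations=3):
    """Run the block repeatedly, as when `bne .LBB27_3` is taken each time."""
    d16 = [0x1111, 0x2222]  # hypothetical initial values for lr and r2
    return [run_block(d16, with_vdup) for _ in range(iterations)]


before = simulate(with_vdup=False)
after = simulate(with_vdup=True)
print("before:", [(hex(a), hex(b)) for a, b in before])
print("after: ", [(hex(a), hex(b)) for a, b in after])
```

Under these assumptions, the version without the `vdup.32` reads the same pair of lane values on every pass, while the version with it reads the original pair only on the first pass and two copies of lane 0 afterwards.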
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93967/new/
https://reviews.llvm.org/D93967