[PATCH] D83135: [VectorCombine] Narrow ZExt that feed binop followed by trunc.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 3 09:06:50 PDT 2020
fhahn created this revision.
fhahn added reviewers: spatel, RKSimon, lebedev.ri, xbolva00.
Herald added subscribers: hiraditya, kristof.beyls.
Herald added a project: LLVM.
fhahn updated this revision to Diff 275408.
fhahn added a comment.
Move llvm/test/Transforms/VectorCombine/AArch64/lit.local.cfg to NFC test patch.
In the pattern below, the trunc can be eliminated by narrowing the
zexts, provided zexts are still needed after narrowing.
  trunc (binop (zext), (zext)) to ty -> binop (zext to ty), (zext to ty)
Initially limited to add/sub.
This transform is only performed if the shortened zexts are free (can be
folded into the binary op).
I am not entirely sure VectorCombine is the right place to do the
transform, but I think we want to limit it to cases where we know the
shorter zexts are free/legal on the target. I am not sure if we have an
easy way to check the latter though.
Alive proof sketches (scalar versions so we do not run into timeouts):
- add: https://alive2.llvm.org/ce/z/DgABb-
- add nuw: https://alive2.llvm.org/ce/z/yx5Vag
- add nsw: https://alive2.llvm.org/ce/z/yyVoRU
- sub: https://alive2.llvm.org/ce/z/bKj22_
- sub nuw: https://alive2.llvm.org/ce/z/H8soWR
- sub nsw: https://alive2.llvm.org/ce/z/QLVNDK
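As a hand-written illustration of what such a scalar check looks like (not copied from the links above; the i8/i32/i16 widths and the function names src and tgt are assumptions chosen for brevity), the plain sub case could be stated as a source/target pair:

```llvm
; Before: widen both operands to i32, subtract, then truncate to i16.
define i16 @src(i8 %a, i8 %b) {
  %ext.a = zext i8 %a to i32
  %ext.b = zext i8 %b to i32
  %sub = sub i32 %ext.a, %ext.b
  %t = trunc i32 %sub to i16
  ret i16 %t
}

; After: widen only to i16 and subtract there; no trunc is needed.
define i16 @tgt(i8 %a, i8 %b) {
  %ext.a = zext i8 %a to i16
  %ext.b = zext i8 %b to i16
  %sub = sub i16 %ext.a, %ext.b
  ret i16 %sub
}
```

The two functions agree because the low 16 bits of a subtraction depend only on the low 16 bits of its operands.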
On AArch64, codegen for the following input can be improved (this is
from hot code in SPEC2006/h264):
  define <8 x i32> @test(<8 x i16>* %p1, <8 x i16>* %p2) {
    %l.1 = load <8 x i16>, <8 x i16>* %p1, align 2
    %ext.1 = zext <8 x i16> %l.1 to <8 x i64>
    %l.2 = load <8 x i16>, <8 x i16>* %p2, align 2
    %ext.2 = zext <8 x i16> %l.2 to <8 x i64>
    %sub = sub nsw <8 x i64> %ext.1, %ext.2
    %t = trunc <8 x i64> %sub to <8 x i32>
    ret <8 x i32> %t
  }
Without patch:
    ldr     q0, [x0]
    ldr     q1, [x1]
    ushll2  v2.4s, v0.8h, #0
    ushll   v0.4s, v0.4h, #0
    ushll2  v3.4s, v1.8h, #0
    ushll   v1.4s, v1.4h, #0
    usubl2  v4.2d, v0.4s, v1.4s
    usubl   v0.2d, v0.2s, v1.2s
    usubl   v1.2d, v2.2s, v3.2s
    usubl2  v5.2d, v2.4s, v3.4s
    xtn     v1.2s, v1.2d
    xtn     v0.2s, v0.2d
    xtn2    v1.4s, v5.2d
    xtn2    v0.4s, v4.2d
    ret
With patch:
    ldr     q0, [x0]
    ldr     q2, [x1]
    usubl2  v1.4s, v0.8h, v2.8h
    usubl   v0.4s, v0.4h, v2.4h
    ret
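For reference, applying the trunc/zext pattern by hand to the @test input should give IR along the following lines. This is a sketch, not output produced by the patch. The nsw flag is dropped here to stay conservative; whether the patch preserves it is not shown in this message.

```llvm
; Hand-narrowed version of @test: the zexts go straight to <8 x i32>,
; so the <8 x i64> intermediate and the trunc both disappear.
define <8 x i32> @test(<8 x i16>* %p1, <8 x i16>* %p2) {
  %l.1 = load <8 x i16>, <8 x i16>* %p1, align 2
  %ext.1 = zext <8 x i16> %l.1 to <8 x i32>
  %l.2 = load <8 x i16>, <8 x i16>* %p2, align 2
  %ext.2 = zext <8 x i16> %l.2 to <8 x i32>
  %sub = sub <8 x i32> %ext.1, %ext.2
  ret <8 x i32> %sub
}
```

In this form each half of the subtraction maps onto a single widening usubl/usubl2, which matches the shorter sequence shown with the patch applied.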
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D83135
Files:
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
llvm/test/Transforms/VectorCombine/AArch64/shorten-extend-if-free.ll