[PATCH] D118979: [AArch64] Set maximum VF with shouldMaximizeVectorBandwidth
JinGu Kang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 4 06:32:48 PDT 2022
jaykang10 added inline comments.
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-illegal-type.ll:90
; CHECK: store i1 %[[EXTRACT1]], i1* %dst
-; CHECK: %[[EXTRACT2:.*]] = extractelement <2 x i1> %[[ICMP]], i32 1
+; CHECK: %[[EXTRACT2:.*]] = extractelement <64 x i1> %[[ICMP]], i32 1
; CHECK: store i1 %[[EXTRACT2]], i1* %dst
----------------
jaykang10 wrote:
> dmgreen wrote:
> > This is worrying - should it be vectorizing 64x for an i1 type? (And are there a lot of other extracts now?)
> When I checked it, it looked like the DAG combiner combines the 64 i1 extract_vector_elt and store nodes into one 64-bit store node.
> Let me check it again.
In this test, `%dst` is passed as a parameter, so it does not change in the loop. Therefore, only the last element of the <64 x i1> vector needs to be stored. It looks like the DAG combiner catches this and optimizes the nodes well. The assembly output of the `vector.body` block from llc is below. It looks OK.
```
.LBB0_3:                                // %vector.body
                                        // =>This Inner Loop Header: Depth=1
        dup     v2.2d, x12
        add     x12, x12, #512
        subs    x11, x11, #64
        add     v2.2d, v2.2d, v0.2d
        cmeq    v2.2d, v2.2d, v1.2d
        xtn2    v2.4s, v2.2d
        xtn2    v2.8h, v2.4s
        xtn     v2.8b, v2.8h
        umov    w13, v2.b[7]
        and     w13, w13, #0x1
        strb    w13, [x0]
        b.ne    .LBB0_3
```
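To make the reasoning concrete, here is a minimal Python sketch of why a single store is enough when every lane is written to the same loop-invariant address. It is only a model, not the LLVM implementation: the helper names and the predicate `i == 127` are hypothetical and do not come from the test, and only the width of 64 is taken from the CHECK line.

```python
from typing import Callable, List

VF = 64  # vectorization factor, taken from the <64 x i1> CHECK line


def vector_compare(base: int, pred: Callable[[int], bool]) -> List[bool]:
    """Model one vector iteration: evaluate the i1 compare for VF lanes."""
    return [pred(base + lane) for lane in range(VF)]


def store_every_lane(mask: List[bool]) -> bool:
    """Model VF extractelement/store pairs that all target the same address."""
    dst = False  # stands in for the byte at the loop-invariant %dst
    for lane_value in mask:
        dst = lane_value  # each store overwrites the previous one
    return dst


def store_last_lane(mask: List[bool]) -> bool:
    """Model the optimized form: extract only the final lane, store once."""
    return mask[-1]


def is_hit(i: int) -> bool:
    # Hypothetical predicate, chosen only for illustration.
    return i == 127


for base in range(0, 256, VF):
    mask = vector_compare(base, is_hit)
    assert store_every_lane(mask) == store_last_lane(mask)
```

Since both forms leave the same value at `%dst` after each vector iteration, the earlier 63 stores are dead, which is consistent with the single `umov`/`strb` pair in the loop body above.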
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D118979/new/
https://reviews.llvm.org/D118979