[PATCH] D118979: [AArch64] Set maximum VF with shouldMaximizeVectorBandwidth

JinGu Kang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 4 06:32:48 PDT 2022


jaykang10 added inline comments.


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-illegal-type.ll:90
 ; CHECK: store i1 %[[EXTRACT1]], i1* %dst
-; CHECK: %[[EXTRACT2:.*]] = extractelement <2 x i1> %[[ICMP]], i32 1
+; CHECK: %[[EXTRACT2:.*]] = extractelement <64 x i1> %[[ICMP]], i32 1
 ; CHECK: store i1 %[[EXTRACT2]], i1* %dst
----------------
jaykang10 wrote:
> dmgreen wrote:
> > This is worrying - should it be vectorizing 64x for in i1 type! (and are there a lot of other extracts now)?
> When I checked it, it looked the dagcombiner combines the 64 times i1 extract_vector_elt and store nodes to one 64 bit store node.
> Let me check it again.
In this test, `%dst` is passed as a parameter, so it does not change inside the loop. Therefore, only the last element of the <64 x i1> vector needs to be stored. It looks like the DAGCombiner catches this and optimizes the nodes well. The assembly output of the `vector.body` block from llc is shown below. It looks OK.
```
.LBB0_3:                                // %vector.body
                                        // =>This Inner Loop Header: Depth=1
	dup	v2.2d, x12
	add	x12, x12, #512
	subs	x11, x11, #64
	add	v2.2d, v2.2d, v0.2d
	cmeq	v2.2d, v2.2d, v1.2d
	xtn2	v2.4s, v2.2d
	xtn2	v2.8h, v2.4s
	xtn	v2.8b, v2.8h
	umov	w13, v2.b[7]
	and	w13, w13, #0x1
	strb	w13, [x0]
	b.ne	.LBB0_3
```
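
To illustrate the reasoning about the loop-invariant `%dst`, here is a minimal C++ sketch. It is a simplified model under stated assumptions, not the test's actual IR or the DAGCombiner's implementation. All function names are hypothetical, and the compare in `compare_lanes` is only a stand-in for the `cmeq` in the loop. It shows that 64 successive stores to the same address leave the same value as a single store of the last lane, which is consistent with the single `strb` per iteration in the listing above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one vectorized iteration: 64 i1 lanes, each
// extracted and stored to the same loop-invariant address.
inline void store_all_lanes(const std::array<bool, 64>& mask, bool* dst) {
    assert(dst != nullptr);
    for (std::size_t lane = 0; lane < mask.size(); ++lane)
        *dst = mask[lane];  // every store overwrites the previous one
}

// What is left once the overwritten stores are dropped: one store of the last lane.
inline void store_last_lane(const std::array<bool, 64>& mask, bool* dst) {
    assert(dst != nullptr);
    *dst = mask.back();
}

// Assumed stand-in for the vector compare: lane i is true iff base + i == target.
inline std::array<bool, 64> compare_lanes(std::uint64_t base, std::uint64_t target) {
    std::array<bool, 64> mask{};
    for (std::size_t lane = 0; lane < mask.size(); ++lane)
        mask[lane] = (base + lane == target);
    return mask;
}

// Returns true when both strategies leave the same value in memory.
inline bool same_final_value(std::uint64_t base, std::uint64_t target) {
    const std::array<bool, 64> mask = compare_lanes(base, target);
    bool all = false;
    bool last = false;
    store_all_lanes(mask, &all);
    store_last_lane(mask, &last);
    return all == last;
}
```

Because both functions always end with the value of lane 63 in `*dst`, `same_final_value` returns true for any `base` and `target`. The number of live stores does not grow with the vectorization factor in this model.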


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118979/new/

https://reviews.llvm.org/D118979


