[PATCH] D24955: [ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement.
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 27 10:26:53 PDT 2016
spatel added a comment.
The idea looks good to me, but we should increase the depth for the recursive call. The reason that depth is not incremented for the ExtractValue case is that computeKnownBitsAddSub / computeKnownBitsMul are helper functions (they are not recursing on themselves), so the depth is incremented internally when calling computeKnownBits.
================
Comment at: test/Analysis/ValueTracking/signbits-extract-elt.ll:1
@@ +1,2 @@
+; RUN: opt < %s -instcombine -S | FileCheck %s
+
----------------
Use -instsimplify instead.
================
Comment at: test/Analysis/ValueTracking/signbits-extract-elt.ll:7-9
@@ +6,5 @@
+; the addition is nsw.
+; (test case may seem overly complicated, but without two extractelement + add
+; this would be scalarized without testing computeKnownBits which is the
+; purpopse with the test).
+define i1 @test1(<4 x i16>* %in) {
----------------
The test case is overly complicated because something in -instcombine is able to reduce a simplified version of the test. If you change the test to use -instsimplify, then this test will prove that your ValueTracking change is firing:
define i1 @computeKnownBits_look_through_extractelt(<2 x i8> %vecin) {
%vec = zext <2 x i8> %vecin to <2 x i32>
%elt1 = extractelement <2 x i32> %vec, i32 1
%bool = icmp slt i32 %elt1, 0
ret i1 %bool
}
================
Comment at: test/Analysis/ValueTracking/signbits-extract-elt.ll:23-38
@@ +22,17 @@
+
+; This is to verify that computeKnownSignBits is doing a simple look-thru for
+; extractelement. It is detected as the shift of %elt0 becoming "nsw".
+define i32 @test2(<4 x i16>* %in) {
+; CHECK-LABEL: @test2(
+; CHECK: %tmp1 = shl nsw i32 %elt0, 18
+; CHECK: %tmp2 = shl i32 %elt1, 19
+ %vec2 = load <4 x i16>, <4 x i16>* %in, align 1
+ %vec3 = ashr <4 x i16> %vec2, <i16 2, i16 2, i16 2, i16 2>
+ %vec4 = sext <4 x i16> %vec3 to <4 x i32>
+ %elt0 = extractelement <4 x i32> %vec4, i32 0
+ %elt1 = extractelement <4 x i32> %vec4, i32 1
+ %tmp1 = shl i32 %elt0, 18
+ %tmp2 = shl i32 %elt1, 19
+ %r1 = add i32 %tmp1, %tmp2
+ ret i32 %r1
+}
----------------
We have to dig deeper to find an instsimplify fold that works based on ComputeNumSignBits, but I think this will do it:
define i32 @computeNumSignBits_look_through_extractelt(<2 x i1> %vec) {
%vec4 = sext <2 x i1> %vec to <2 x i32>
%elt0 = extractelement <2 x i32> %vec4, i32 0
%ashr = ashr i32 %elt0, 5 <--- this will disappear after this patch is applied
ret i32 %ashr
}
https://reviews.llvm.org/D24955
More information about the llvm-commits
mailing list