[PATCH] D24955: [ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement.

Tue Sep 27 10:26:53 PDT 2016

spatel added a comment.

The idea looks good to me, but the recursive call should pass an incremented depth. The depth is not incremented in the ExtractValue case only because computeKnownBitsAddSub and computeKnownBitsMul are helper functions (they do not recurse on themselves); they increment the depth internally when they call computeKnownBits.
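
To illustrate the depth point, here is a minimal toy sketch of a depth-limited known-bits walk. It is not LLVM's code: Node, Op, knownZero and the MaxDepth value are stand-ins made up for this example. The only thing it is meant to show is that a case which recurses directly must pass Depth + 1, otherwise the recursion limit is never reached.

```cpp
#include <cstdint>

// Toy stand-ins for illustration only; these are NOT LLVM's classes or
// signatures.
constexpr unsigned MaxDepth = 6; // assumed recursion limit for this sketch

enum class Op { Leaf, ZExt8To32, ExtractElement };

struct Node {
  Op Kind;
  const Node *Operand; // nullptr for Leaf
};

// Returns a mask of the bits known to be zero in a 32-bit result.
uint32_t knownZero(const Node *N, unsigned Depth) {
  if (Depth == MaxDepth)
    return 0; // limit reached: report that nothing is known

  switch (N->Kind) {
  case Op::Leaf:
    return 0; // nothing known about an arbitrary value
  case Op::ZExt8To32:
    return 0xFFFFFF00u; // the upper 24 bits are zero after zext i8 -> i32
  case Op::ExtractElement:
    // Direct self-recursion: this call is responsible for bumping the depth.
    return knownZero(N->Operand, Depth + 1);
  }
  return 0;
}
```

With an extractelement node on top of a zext node, a query at depth 0 reports the upper 24 bits as zero, while the same query started at MaxDepth - 1 gives up because the look-through consumes the last level.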

================
Comment at: test/Analysis/ValueTracking/signbits-extract-elt.ll:1
@@ +1,2 @@
+; RUN: opt < %s -instcombine -S | FileCheck %s
+
----------------
Use -instsimplify instead.

================
Comment at: test/Analysis/ValueTracking/signbits-extract-elt.ll:7-9
@@ +6,5 @@
+; the addition is nsw.
+; (test case may seem overly complicated, but without two extractelement + add
+; this would be scalarized without testing computeKnownBits which is the
+; purpopse with the test).
+define i1 @test1(<4 x i16>* %in) {
----------------
The test case is overly complicated because something in -instcombine is able to reduce a simplified version of the test. If you change the test to use -instsimplify, then this test will prove that your ValueTracking change is firing:

  define i1 @computeKnownBits_look_through_extractelt(<2 x i8> %vecin) {
    %vec = zext <2 x i8> %vecin to <2 x i32>
    %elt1 = extractelement <2 x i32> %vec, i32 1
    %bool = icmp slt i32 %elt1, 0
    ret i1 %bool
  }
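
For reference, the reasoning behind that test can be checked with a small standalone sketch. The helper names below are hypothetical and only model the IR above: each lane is zero-extended from 8 to 32 bits, so its sign bit is always clear and the signed compare against zero can never be true.

```cpp
#include <cstdint>

// Hypothetical model of: zext <2 x i8> to <2 x i32>, then extractelement of
// one lane. Every lane is treated the same way, so one scalar is enough.
int32_t zextLane(uint8_t Lane) {
  return static_cast<int32_t>(static_cast<uint32_t>(Lane));
}

// Hypothetical model of: icmp slt i32 %elt1, 0
bool isNegative(int32_t V) { return V < 0; }

// Exhaustively checks all 256 possible lane values.
bool anyLaneNegative() {
  for (unsigned B = 0; B <= 0xFF; ++B)
    if (isNegative(zextLane(static_cast<uint8_t>(B))))
      return true;
  return false;
}
```

Since no input makes the compare true, the function should fold to a constant false once computeKnownBits can see through the extractelement.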

================
Comment at: test/Analysis/ValueTracking/signbits-extract-elt.ll:23-38
@@ +22,17 @@
+
+; This is to verify that computeKnownSignBits is doing a simple look-thru for
+; extractelement. It is detected as the shift of %elt0 becoming "nsw".
+define i32 @test2(<4 x i16>* %in) {
+; CHECK-LABEL: @test2(
+; CHECK:    %tmp1 = shl nsw i32 %elt0, 18
+; CHECK:    %tmp2 = shl i32 %elt1, 19
+  %vec2 = load <4 x i16>, <4 x i16>* %in, align 1
+  %vec3 = ashr <4 x i16> %vec2, <i16 2, i16 2, i16 2, i16 2>
+  %vec4 = sext <4 x i16> %vec3 to <4 x i32>
+  %elt0 = extractelement <4 x i32> %vec4, i32 0
+  %elt1 = extractelement <4 x i32> %vec4, i32 1
+  %tmp1 = shl i32 %elt0, 18
+  %tmp2 = shl i32 %elt1, 19
+  %r1 = add i32 %tmp1, %tmp2
+  ret i32 %r1
+}
----------------
We have to dig deeper to find an instsimplify fold that works based on ComputeNumSignBits, but I think this will do it:
  define i32 @computeNumSignBits_look_through_extractelt(<2 x i1> %vec) {
    %vec4 = sext <2 x i1> %vec to <2 x i32>
    %elt0 = extractelement <2 x i32> %vec4, i32 0
    %ashr = ashr i32 %elt0, 5  ; this will disappear after this patch is applied
    ret i32 %ashr
  }
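
The arithmetic behind this one can be sketched as follows. The helpers are hypothetical and are not LLVM APIs: a sign-extended i1 is either 0 or -1, so all 32 bits are copies of the sign bit, and an arithmetic shift right of such a value returns it unchanged.

```cpp
#include <cstdint>

// Hypothetical model of: sext i1 to i32 (one lane of the vector).
int32_t sextI1(bool B) { return B ? -1 : 0; }

// Number of leading bits equal to the sign bit, counting the sign bit itself.
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 1;
  for (int Bit = 30; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// Arithmetic shift right written on unsigned values so that it does not
// depend on how the compiler shifts negative signed integers.
int32_t ashr(int32_t V, unsigned Amt) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t R = V < 0 ? ~(~U >> Amt) : (U >> Amt);
  return static_cast<int32_t>(R);
}
```

Both possible inputs have 32 sign bits and are unchanged by a shift of 5, which is why the ashr should be removable once ComputeNumSignBits looks through the extractelement.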

https://reviews.llvm.org/D24955