[llvm] 5bf0023 - [GISel][KnownBits] Update a comment regarding the effect of cache on PHIs

Tue Feb 25 15:56:32 PST 2020

Author: Quentin Colombet
Date: 2020-02-25T15:56:15-08:00
New Revision: 5bf0023b0d706bb8817f1e58145996cccc2f2c58

URL: https://github.com/llvm/llvm-project/commit/5bf0023b0d706bb8817f1e58145996cccc2f2c58
DIFF: https://github.com/llvm/llvm-project/commit/5bf0023b0d706bb8817f1e58145996cccc2f2c58.diff

LOG: [GISel][KnownBits] Update a comment regarding the effect of cache on PHIs

Unlike what I claimed in my previous commit. The caching is
actually not NFC on PHIs.

When we put a big enough max depth, we end up simulating loops.
The cache is effectively cutting the simulation short and we
get less information as a result.
E.g.,
```
v0 = G_CONSTANT i8 0xC0
jump
v1 = G_PHI i8 v0, v2
v2 = G_LSHR i8 v1, 1
```

Let say we want the known bits of v1.
- With cache:
Set v1 cache to we know nothing
v1 is v0 & v2
v0 gives us 0xC0
v2 gives us known bits of v1 >> 1
v1 is in the cache
=> v1 is 0, thus v2 is 0x80
Finally v1 is v0 & v2 => 0x80

- Without cache and enough depth to do two iteration of the loop:
v1 is v0 & v2
v0 gives us 0xC0
v2 gives us known bits of v1 >> 1
v1 is v0 & v2
v0 is 0xC0
v2 is v1 >> 1
Reach the max depth for v1...
unwinding
v1 is know nothing
v2 is 0x80
v0 is 0xC0
v1 is 0x80
v2 is 0xC0
v0 is 0xC0
v1 is 0xC0

Thus now v1 is 0xC0 instead of 0x80.

I've added a unittest demonstrating that.

NFC

Added: 
    

Modified: 
    llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp
    llvm/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp b/llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp
index a2b251328c59..8e369fe9e31d 100644

--- a/llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp
@@ -162,10 +162,12 @@ void GISelKnownBits::computeKnownBitsImpl(Register R, KnownBits &Known,
     // Record in the cache that we know nothing for MI.
     // This will get updated later and in the meantime, if we reach that
     // phi again, because of a loop, we will cut the search thanks to this
-    // cache entry. When this happens this cache entry is actually accurate,
-    // thus we are not losing anything by doing that, because right now,
-    // the main analysis will reach the maximum depth without being able
-    // to fully analyze the phi.
+    // cache entry.
+    // We could actually build up more information on the phi by not cutting
+    // the search, but that additional information is more a side effect
+    // than an intended choice.
+    // Therefore, for now, save on compile time until we derive a proper way
+    // to derive known bits for PHIs within loops.
     ComputeKnownBitsCache[R] = KnownBits(BitWidth);
     // PHI's operand are a mix of registers and basic blocks interleaved.
     // We only care about the register ones.

diff  --git a/llvm/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp b/llvm/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp
index f75de5c22593..a5372511b051 100644
--- a/llvm/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp
+++ b/llvm/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp
@@ -198,6 +198,51 @@ TEST_F(GISelMITest, TestKnownBitsCstPHIWithLoop) {
   EXPECT_EQ(Res.Zero.getZExtValue(), Res2.Zero.getZExtValue());
 }
 
+// Check that we don't try to analysis PHIs progression.
+// Setting a deep enough max depth would allow to effectively simulate
+// what happens in the loop.
+// Thus, with a deep enough depth, we could actually figure out
+// that %14's zero known bits are actually at least what we know
+// for %10, right shifted by one.
+// However, this process is super expensive compile-time wise and
+// we don't want to reach that conclusion while playing with max depth.
+// For now, the analysis just stops and assumes it knows nothing
+// on PHIs, but eventually we could teach it how to properly track
+// phis that loop back without relying on the luck effect of max
+// depth.
+TEST_F(GISelMITest, TestKnownBitsDecreasingCstPHIWithLoop) {
+  StringRef MIRString = "  bb.10:\n"
+                        "  %10:_(s8) = G_CONSTANT i8 5\n"
+                        "  %11:_(s8) = G_CONSTANT i8 1\n"
+                        "\n"
+                        "  bb.12:\n"
+                        "  %13:_(s8) = PHI %10(s8), %bb.10, %14(s8), %bb.12\n"
+                        "  %14:_(s8) = G_LSHR %13, %11\n"
+                        "  %15:_(s8) = COPY %14\n"
+                        "  G_BR %bb.12\n";
+  setUp(MIRString);
+  if (!TM)
+    return;
+  Register CopyReg = Copies[Copies.size() - 1];
+  MachineInstr *FinalCopy = MRI->getVRegDef(CopyReg);
+  Register SrcReg = FinalCopy->getOperand(1).getReg();
+  Register DstReg = FinalCopy->getOperand(0).getReg();
+  GISelKnownBits Info(*MF, /*MaxDepth=*/24);
+  KnownBits Res = Info.getKnownBits(SrcReg);
+  EXPECT_EQ((uint64_t)0, Res.One.getZExtValue());
+  // A single iteration on the PHI (%13) gives:
+  // %10 has known zero of 0xFA
+  // %12 has known zero of 0x80 (we shift right by one so high bit is zero)
+  // Therefore, %14's known zero are 0x80 shifted by one 0xC0.
+  // If we had simulated the loop we could have more zero bits, basically
+  // up to 0xFC (count leading zero of 5, + 1).
+  EXPECT_EQ((uint64_t)0xC0, Res.Zero.getZExtValue());
+
+  KnownBits Res2 = Info.getKnownBits(DstReg);
+  EXPECT_EQ(Res.One.getZExtValue(), Res2.One.getZExtValue());
+  EXPECT_EQ(Res.Zero.getZExtValue(), Res2.Zero.getZExtValue());
+}
+
 TEST_F(GISelMITest, TestKnownBitsPtrToIntViceVersa) {
   StringRef MIRString = "  %3:_(s16) = G_CONSTANT i16 256\n"
                         "  %4:_(p0) = G_INTTOPTR %3\n"