[PATCH] Introduce bitset metadata format and bitset lowering pass.

Thu Feb 19 14:59:25 PST 2015

================
Comment at: lib/Transforms/IPO/LowerBitSets.cpp:236
@@ +235,3 @@
+                                      Value *BitOffset) {
+  if (BSI.Bits.size() <= 8) {
+    // If the bit set is sufficiently small, we can avoid a load by bit testing
----------------
jfb wrote:
> pcc wrote:
> > jfb wrote:
> > > pcc wrote:
> > > > jfb wrote:
> > > > > I'm mildly disappointed that the optimizer doesn't do this by taking into account ISA-specific sizes (and then removing the dead global because its address isn't taken).
> > > > This should in principle be possible, but this pass runs late so in any case it seems best to directly generate the IR we need.
> > > Does it need to run that late? 8 may not be the right number on all architectures, and you should see a good binary size reduction by GC'ing unreferenced globals (especially if CFI+devirtualization occurs).
> > It seemed better to run the pass later because it splits basic blocks, which could pessimize things in other passes, and bitset creation essentially locks in the set of virtual tables that appear in the binary, preventing them from being GCd.
> > 
> > I'm not sure what the best way to deal with this might be. We might later want to split this pass into an early pass and a late pass, where the early pass uses bitset metadata to do devirtualization, a later globalopt pass GCs unused vtables and the late pass builds the actual bitsets.
> Maybe I have a naive view, but shouldn't bitsets be GC'able if no test refers to them? This doesn't need to be an optimization that's specific to your code, LLVM can do this in general when a global doesn't escape and isn't address-taken (and in your case, is read-only). If this is correct, then I don't think you need to split up this pass, though I agree that you may want to do devirtualization earlier to expose more optimization opportunities.
> 
> Under the current setup, do redundant tests in the same function get eliminated and control flow merged?
> 
> This may be something that we can leave open for later changes: I think the current code is good in that it does what's required and is pretty efficient at it. I don't think the design will change substantially, but I do think there are further optimization opportunities here. WDYT?

> Maybe I have a naive view, but shouldn't bitsets be GC'able if no test refers to them?

They could be, but the globals that the bitsets map onto (i.e. the vtables) cannot be GC'd because we lay them out in a specific order in this pass.

We only build bitset constants for bitsets that are referred to by tests. The loop near the start of `LowerBitSets::buildBitSets` identifies all such bitsets by looking through the uses of the `llvm.bitset.test` intrinsic. If a particular test is dead, LLVM should equally be able to remove the dead test (as it is a readonly intrinsic) or remove a dead load from a bitset as part of DCE (in fact the former would probably be easier because of the simpler control flow).

Another advantage of doing this late is that, by allowing the earlier passes to eliminate dead tests, we potentially reduce the number of equivalence classes we need to create, which could result in smaller disjoint sets of classes and therefore smaller bitsets.
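
A rough way to see the effect on disjoint sets is the following standalone sketch. It is not the code in this patch: it assumes a simplified model in which a bitset is just a list of global (vtable) indices, and any two globals that share a live bitset must end up in the same disjoint set. `UnionFind`, `BitSetMembers` and `largestDisjointSet` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Hypothetical model: a bitset is the list of global indices it contains.
using BitSetMembers = std::vector<std::size_t>;

struct UnionFind {
  std::vector<std::size_t> Parent;

  explicit UnionFind(std::size_t N) : Parent(N) {
    std::iota(Parent.begin(), Parent.end(), 0); // every global starts alone
  }

  std::size_t find(std::size_t X) {
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]]; // path halving
      X = Parent[X];
    }
    return X;
  }

  void unite(std::size_t A, std::size_t B) { Parent[find(A)] = find(B); }
};

// Returns the size of the largest disjoint set among the globals that are
// members of at least one live (i.e. still tested) bitset.
std::size_t largestDisjointSet(std::size_t NumGlobals,
                               const std::vector<BitSetMembers> &LiveBitSets) {
  UnionFind UF(NumGlobals);
  std::vector<bool> Used(NumGlobals, false);

  // All members of one bitset belong to the same disjoint set.
  for (const BitSetMembers &BS : LiveBitSets)
    for (std::size_t I = 0; I < BS.size(); ++I) {
      Used[BS[I]] = true;
      UF.unite(BS[0], BS[I]);
    }

  // Count the members of each disjoint set and keep the maximum.
  std::map<std::size_t, std::size_t> Sizes;
  std::size_t Largest = 0;
  for (std::size_t G = 0; G < NumGlobals; ++G)
    if (Used[G]) {
      std::size_t S = ++Sizes[UF.find(G)];
      if (S > Largest)
        Largest = S;
    }
  return Largest;
}
```

In this model, with five globals and live bitsets `{0,1}`, `{1,2,3}` and `{3,4}`, everything collapses into one set of five. If the test that refers to `{1,2,3}` is eliminated first, only `{0,1}` and `{3,4}` remain and the largest set has two members.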

> Under the current setup, do redundant tests in the same function get eliminated and control flow merged?

Are you referring to cases where a virtual call happens twice through the same pointer?

```
struct S {
  virtual void f();
};

[...]
S *p = ...;
p->f();
p->f();
```

The problem is that redundant tests are difficult to remove because of the semantics of C++. In this case `f` could overwrite the memory region that `p` points to with an object of a different derived class without invoking undefined behavior. We might want a flag, though, that lets a user promise that such things will never happen.
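
A minimal sketch of that hazard, under assumptions that are not in the patch: the derived classes `A` and `B` and the function `callTwice` are hypothetical, and `S` is given a virtual destructor and an `int` return value so that the effect can be observed. `A::f` ends the lifetime of the object it was called on and constructs a `B` in the same storage, so a vtable check done before the first call says nothing about the object at the second call. The second call goes through a pointer re-derived with `std::launder` (C++17); that is a conservative choice to keep the sketch itself clearly well-defined.

```cpp
#include <cstddef>
#include <new>

struct S {
  virtual ~S() = default;
  virtual int f() = 0;
};

struct B final : S {
  int f() override { return 2; }
};

struct A final : S {
  int f() override {
    // End the lifetime of *this and reuse its storage for a B.
    // Nothing after the destructor call touches members of the old A.
    void *Storage = this;
    this->~A();
    new (Storage) B;
    return 1;
  }
};

int callTwice() {
  // Storage that is large enough and suitably aligned for either class.
  constexpr std::size_t Size = sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B);
  alignas(A) alignas(B) unsigned char Buf[Size];

  S *P = new (Buf) A;
  int First = P->f(); // dispatches to A::f; the storage now holds a B

  // Re-derive a pointer to the object that currently lives in Buf.
  S *Q = std::launder(reinterpret_cast<B *>(Buf));
  int Second = Q->f(); // dispatches to B::f

  Q->~S(); // virtual destructor, destroys the B
  return First * 10 + Second;
}
```

Here `callTwice()` returns 12: the first call runs `A::f` and the second runs `B::f`, even though both calls target the same address.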

> I don't think the design will change substantially, but I do think there are further optimization opportunities here. WDYT?

At a high level I do agree that there are optimization opportunities to pursue here. (I could elaborate, but this probably isn't the place.)

http://reviews.llvm.org/D7288
