[PATCH] D124878: [Bitcode] Include indirect users of BlockAddresses in bitcode

Duncan P. N. Exon Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 4 15:27:00 PDT 2022


dexonsmith added a comment.

At a high-level, the design for writing bitcode is a two-pass traversal algorithm:

- First pass in ValueEnumerator indexes everything
- Second pass in BitcodeWriter writes things

It feels like this new logic could be put into the first pass in ValueEnumerator. (Probably the previous patch, which this builds on, should have done that as well, to avoid iterating over `users()` in BitcodeWriter, but I guess I didn't notice!)

I think you add some bookkeeping to `ValueEnumerator::EnumerateValue()`. It currently has:

  if (const Constant *C = dyn_cast<Constant>(V)) {
    if (isa<GlobalValue>(C)) {
      // Initializers for globals are handled explicitly elsewhere.
    } else if (C->getNumOperands()) {
      for (User::const_op_iterator I = C->op_begin(), E = C->op_end();
           I != E; ++I)
        if (!isa<BasicBlock>(*I)) // Don't enumerate BB operand to BlockAddress.
          EnumerateValue(*I);

you could change this to something like:

  if (const Constant *C = dyn_cast<Constant>(V)) {
    if (isa<GlobalValue>(C)) {
      // Initializers for globals are handled explicitly elsewhere.
    } else if (C->getNumOperands()) {
      SmallSetVector<const BasicBlock *, 4> FoundBBs;
      for (User::const_op_iterator I = C->op_begin(), E = C->op_end();
           I != E; ++I) {
        // Don't enumerate BB operand to BlockAddress, but do track them.
        if (const BasicBlock *BB = dyn_cast<BasicBlock>(&*I)) {
          FoundBBs.insert(BB);
          continue;
        }
        EnumerateValue(*I);
        auto I = BlockAddressesForConstant.find(&cast<Constant>(*I));
        if (I != BlockAddressesForConstant())
          FoundBBs.insert(I->begin(), I->end());
      }
      if (!BlockAddresses.empty())
        BlockAddressesForConstant[C].assign(FoundBBs.begin(), FoundBBs.end());

where ValueEnumerator has a new thing:

  SmallDenseMap<const Constant *, SmallVector<const BasicBlock *, 0>> BlockAddressesForConstant;

Then you can change `incorporateFunction`, which currently has:

  // Add all function-level constants to the value table.
  for (const BasicBlock &BB : F) {
    for (const Instruction &I : BB) {
      for (const Use &OI : I.operands()) {
        if ((isa<Constant>(OI) && !isa<GlobalValue>(OI)) || isa<InlineAsm>(OI))
          EnumerateValue(OI);
      }
      if (auto *SVI = dyn_cast<ShuffleVectorInst>(&I))
        EnumerateValue(SVI->getShuffleMaskForBitcode());
    }
    BasicBlocks.push_back(&BB);
    ValueMap[&BB] = BasicBlocks.size();
  }

to be:

  // Add all function-level constants to the value table.
  SmallSetVector<const BasicBlock *> FoundBlockAddresses;
  for (const BasicBlock &BB : F) {
    for (const Instruction &I : BB) {
      for (const Use &OI : I.operands()) {
        if (isa<GlobalValue>(OI))
          continue;
        auto *C = dyn_cast<Constant>(OI);
        if (!C && !isa<InlineAsm>(OI))
          continue;
        EnumerateValue(OI);
        if (!C)
          continue;
        auto BBs = BlockAddressesForConstant.find(C);
        if (BBs != BlockAddressesForConstant.end())
          FoundBlockAddresses.append(BBs->begin(), BBs->end());
      }
      if (auto *SVI = dyn_cast<ShuffleVectorInst>(&I))
        EnumerateValue(SVI->getShuffleMaskForBitcode());
    }
    BasicBlocks.push_back(&BB);
    ValueMap[&BB] = BasicBlocks.size();
  }
  
  if (!FoundBlockAddresses.empty()) {
    // Filter out ones that are local.
    SmallVector<const BasicBlock *> ForeignBlockAddresses;
    llvm::copy_if(FoundBlockAddresses, [&](const BasicBlock *BB) { return BB->getOwner() != &F; },
                  std::back_inserter(ForeignBlockAddresses));
    if (!ForeignBlockAddresses.empty())
      FunctionBlockAddresses[&F] = std::move(ForeignBlockAddresses);
  }

where ValueEnumerator has:

  SmallDenseMap<const Function *, SmallVector<const BasicBlock *, 0>> FunctionBlockAddresses;

Then `BitcodeWriter` has no need to walk upwards through `users()`. It can just look at `FunctionBlockAddresses`.



================
Comment at: llvm/lib/Bitcode/Reader/BitcodeReader.cpp:3095
     }
-    case bitc::CST_CODE_BLOCKADDRESS:{
+    case bitc::CST_CODE_BLOCKADDRESS: {
       if (Record.size() < 3)
----------------
Nit: please commit any unrelated whitespace changes in separate NFC patches.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124878/new/

https://reviews.llvm.org/D124878



More information about the llvm-commits mailing list