[PATCH] D12457: [Bitcode][Asm] Teach LLVM to read and write operand bundles.

Duncan P. N. Exon Smith via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 10 17:05:29 PDT 2015


> On 2015-Sep-08, at 21:56, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> 
> sanjoy updated this revision to Diff 34299.
> sanjoy added a comment.
> 
> - add some tests to compatibility.ll
> 
>  Note: these do not yet pass but as far as I can tell, the issue is unrelated to operand bundles -- see PR24755.

I think you can disable the `verify-uselistorder` run lines for now.
Something like:
--
RUN-PR24755: ...
--
makes sense to me (and then please document this in the PR so that we
remember to enable the tests when the PR gets fixed).

> 
> 
> http://reviews.llvm.org/D12457
> 
> Files:
>  docs/LangRef.rst
>  include/llvm/Bitcode/LLVMBitCodes.h
>  lib/AsmParser/LLParser.cpp
>  lib/AsmParser/LLParser.h
>  lib/Bitcode/Reader/BitcodeReader.cpp
>  lib/Bitcode/Writer/BitcodeWriter.cpp
>  lib/IR/AsmWriter.cpp
>  test/Bitcode/compatibility.ll
>  test/Bitcode/operand-bundles.ll
>  test/Verifier/operand-bundles.ll
> 
> <D12457.34299.patch>

This uses a lot of new bitcode records.  I think you can do it with just
one new function record, and simplify some of the logic in the process.

Firstly, you can avoid the new forms of CALL/INVOKE by emitting the
BUNDLE records *before* the instructions instead of after.  The reader
logic would be something like this:
--
SmallVector<Bundle, 8> Bundles;
while (1) {
  Bitcode = ...;

  // Emit an error if bundles attached to invalid instruction.
  if (!Bundles.empty() &&
      Bitcode != BUNDLE &&
      Bitcode != CALL &&
      Bitcode != INVOKE)
    return Error("Bundles attached to invalid instruction");

  // Old switch statement to read records.
  switch (Bitcode) {
  case BUNDLE:
    Bundles.emplace_back(...);
    break;
  case CALL:
    // ...
    ... = CreateCall(..., Bundles);
    break;
  case ...
  }
}
--
In the writer, you just emit the bundles before emitting the rest of the
instruction to accomplish this.

Secondly, now that the bundles precede the instruction they're being
attached to, you don't need the BUNDLES_HEADER record.  The number of
bundles is recorded implicitly.

Thirdly, you can merge the BUNDLE_TAG and BUNDLE_CONTENTS records by
adding a field for the number of values.
--
// BUNDLE: [ V, V x vals, L x char ]
--

Even better: I assume a module is only likely to have a small-ish set of
bundle tags that are reused?  If so, you can add a BUNDLE_BLOCK (or
some such block) that defines the tags, and emit it at the top of the
module:
--
// BUNDLE_TAG: [ n x strchr ]
--
This implicitly numbers the tags by the order of the records.

One benefit is that each tag is only serialized once, but this also
simplifies (and shrinks) the BUNDLE records, which can refer to the tags
by ID.
--
// BUNDLE: [ tag, n x vals ]
--

A few other comments below (mostly just pointing out how logic could be
simplified, and a few nitpicks in the LangRef).

> Index: lib/Bitcode/Writer/BitcodeWriter.cpp
> ===================================================================
> --- lib/Bitcode/Writer/BitcodeWriter.cpp
> +++ lib/Bitcode/Writer/BitcodeWriter.cpp
> @@ -16,6 +16,7 @@
>  #include "llvm/ADT/Triple.h"
>  #include "llvm/Bitcode/BitstreamWriter.h"
>  #include "llvm/Bitcode/LLVMBitCodes.h"
> +#include "llvm/IR/CallSite.h"
>  #include "llvm/IR/Constants.h"
>  #include "llvm/IR/DebugInfoMetadata.h"
>  #include "llvm/IR/DerivedTypes.h"
> @@ -1682,12 +1683,36 @@
>    emitSignedInt64(Vals, diff);
>  }
>  
> +static void EmitOperandBundles(BitstreamWriter &Stream, ImmutableCallSite CS,
> +                               unsigned InstID, ValueEnumerator &VE,
> +                               unsigned Abbrev) {

It doesn't make sense to use the same abbrevation for all of these
unless the abbreviation is 0.  Might as well not pass anything in at
all in cases like this.  At least it should assert(0).

> +  SmallVector<unsigned, 64> Record;
> +
> +  Record.push_back(CS.getNumOperandBundles());
> +  Stream.EmitRecord(bitc::FUNC_CODE_OPERAND_BUNDLES_HEADER, Record, Abbrev);
> +  Record.clear();
> +
> +  for (unsigned i = 0, e = CS.getNumOperandBundles(); i != e; ++i) {
> +    const auto &Bundle = CS.getOperandBundle(i);
> +
> +    WriteStringRecord(bitc::FUNC_CODE_OPERAND_BUNDLE_TAG, Bundle.Tag,
> +                      Abbrev, Stream);
> +
> +    for (auto &Input : Bundle.Inputs)
> +      PushValueAndType(Input, InstID, Record, VE);
> +
> +    Stream.EmitRecord(bitc::FUNC_CODE_OPERAND_BUNDLE_CONTENTS, Record, Abbrev);
> +    Record.clear();
> +  }
> +}
> +
>  /// WriteInstruction - Emit an instruction to the specified stream.
>  static void WriteInstruction(const Instruction &I, unsigned InstID,
>                               ValueEnumerator &VE, BitstreamWriter &Stream,
>                               SmallVectorImpl<unsigned> &Vals) {
>    unsigned Code = 0;
>    unsigned AbbrevToUse = 0;
> +  bool RecordEmissionPending = true;
>    VE.setInstructionID(&I);
>    switch (I.getOpcode()) {
>    default:
> @@ -1827,7 +1852,8 @@
>      const InvokeInst *II = cast<InvokeInst>(&I);
>      const Value *Callee = II->getCalledValue();
>      FunctionType *FTy = II->getFunctionType();
> -    Code = bitc::FUNC_CODE_INST_INVOKE;
> +    Code = II->hasOperandBundles() ? bitc::FUNC_CODE_INST_INVOKE_WITH_OP_BUNDLE
> +                                   : bitc::FUNC_CODE_INST_INVOKE;

If you emit bundles first, you just need bitc::FUNC_CODE_INST_INVOKE.

>  
>      Vals.push_back(VE.getAttributeID(II->getAttributes()));
>      Vals.push_back(II->getCallingConv() | 1 << 13);
> @@ -1846,6 +1872,12 @@
>             i != e; ++i)
>          PushValueAndType(I.getOperand(i), InstID, Vals, VE); // vararg
>      }
> +
> +    Stream.EmitRecord(Code, Vals, AbbrevToUse);
> +    RecordEmissionPending = false;
> +
> +    if (II->hasOperandBundles())
> +      EmitOperandBundles(Stream, II, InstID, VE, AbbrevToUse);

If you emit the bundles first, then you don't need to emit the record
inside the case, and you don't need `RecordEmissionPending`.

>      break;
>    }
>    case Instruction::Resume:
> @@ -2036,7 +2068,8 @@
>      const CallInst &CI = cast<CallInst>(I);
>      FunctionType *FTy = CI.getFunctionType();
>  
> -    Code = bitc::FUNC_CODE_INST_CALL;
> +    Code = CI.hasOperandBundles() ? bitc::FUNC_CODE_INST_CALL_WITH_OP_BUNDLE
> +                                  : bitc::FUNC_CODE_INST_CALL;
>  
>      Vals.push_back(VE.getAttributeID(CI.getAttributes()));
>      Vals.push_back((CI.getCallingConv() << 1) | unsigned(CI.isTailCall()) |
> @@ -2059,6 +2092,11 @@
>             i != e; ++i)
>          PushValueAndType(CI.getArgOperand(i), InstID, Vals, VE);  // varargs
>      }
> +
> +    Stream.EmitRecord(Code, Vals, AbbrevToUse);
> +    RecordEmissionPending = false;
> +    if (CI.hasOperandBundles())
> +      EmitOperandBundles(Stream, &CI, InstID, VE, AbbrevToUse);
>      break;
>    }
>    case Instruction::VAArg:
> @@ -2069,7 +2107,9 @@
>      break;
>    }
>  
> -  Stream.EmitRecord(Code, Vals, AbbrevToUse);
> +  if (RecordEmissionPending)
> +    Stream.EmitRecord(Code, Vals, AbbrevToUse);
> +

You could leave this logic as is if the bundles were first.

>    Vals.clear();
>  }
>  
> Index: lib/Bitcode/Reader/BitcodeReader.cpp
> ===================================================================
> --- lib/Bitcode/Reader/BitcodeReader.cpp
> +++ lib/Bitcode/Reader/BitcodeReader.cpp
> @@ -388,6 +388,14 @@
>    std::error_code findFunctionInStream(
>        Function *F,
>        DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator);
> +
> +  struct OperandBundleInfo {
> +    std::string Tag;
> +    SmallVector<Value *, 64> Inputs;
> +  };
> +
> +  std::error_code parseBundleInfos(SmallVectorImpl<OperandBundleInfo> &,
> +                                   unsigned AbbrevID, unsigned InstNum);
>  };
>  } // namespace
>  
> @@ -4127,8 +4135,11 @@
>        break;
>      }
>  
> -    case bitc::FUNC_CODE_INST_INVOKE: {
> +    case bitc::FUNC_CODE_INST_INVOKE_WITH_OP_BUNDLE: {
> +    case bitc::FUNC_CODE_INST_INVOKE:

(It's a little strange to open the `{` between two case statements.)

>        // INVOKE: [attrs, cc, normBB, unwindBB, fnty, op0,op1,op2, ...]
> +      // INVOKE_WITH_OP_BUNDLE: [paramattrs, cc, fnty, fnid, arg0, arg1...]
> +      //                 [ OP bundle header ][ OP bundle 0 ] [ OP bundle 1 ] ...
>        if (Record.size() < 4)
>          return error("Invalid record");
>        unsigned OpNum = 0;
> @@ -4180,7 +4191,19 @@
>          }
>        }
>  
> -      I = InvokeInst::Create(Callee, NormalBB, UnwindBB, Ops);
> +      SmallVector<OperandBundleInfo, 2> BundleInfos;
> +      SmallVector<OperandBundleSetDef, 2> Bundles;
> +
> +      if (BitCode == bitc::FUNC_CODE_INST_INVOKE_WITH_OP_BUNDLE) {
> +        if (std::error_code EC =
> +                parseBundleInfos(BundleInfos, Entry.ID, NextValueNo))
> +          return EC;
> +
> +        for (auto &ParsedBundle : BundleInfos)
> +          Bundles.emplace_back(ParsedBundle.Tag, ParsedBundle.Inputs);
> +      }
> +
> +      I = InvokeInst::Create(Callee, NormalBB, UnwindBB, Ops, Bundles);
>        InstructionList.push_back(I);
>        cast<InvokeInst>(I)
>            ->setCallingConv(static_cast<CallingConv::ID>(~(1U << 13) & CCInfo));
> @@ -4507,8 +4530,11 @@
>        InstructionList.push_back(I);
>        break;
>      }
> -    case bitc::FUNC_CODE_INST_CALL: {
> +    case bitc::FUNC_CODE_INST_CALL_WITH_OP_BUNDLE: {
> +    case bitc::FUNC_CODE_INST_CALL:
>        // CALL: [paramattrs, cc, fnty, fnid, arg0, arg1...]
> +      // CALL_WITH_OP_BUNDLE: [paramattrs, cc, fnty, fnid, arg0, arg1...]
> +      //                [ OP bundle header ] [ OP bundle 0 ] [ OP bundle 1 ] ...
>        if (Record.size() < 3)
>          return error("Invalid record");
>  
> @@ -4563,7 +4589,19 @@
>          }
>        }
>  
> -      I = CallInst::Create(FTy, Callee, Args);
> +      SmallVector<OperandBundleInfo, 2> BundleInfos;
> +      SmallVector<OperandBundleSetDef, 2> Bundles;
> +
> +      if (BitCode == bitc::FUNC_CODE_INST_CALL_WITH_OP_BUNDLE) {
> +        if (std::error_code EC =
> +                parseBundleInfos(BundleInfos, Entry.ID, NextValueNo))
> +          return EC;
> +
> +        for (auto &ParsedBundle : BundleInfos)
> +          Bundles.emplace_back(ParsedBundle.Tag, ParsedBundle.Inputs);
> +      }
> +
> +      I = CallInst::Create(FTy, Callee, Args, Bundles);
>        InstructionList.push_back(I);
>        cast<CallInst>(I)->setCallingConv(
>            static_cast<CallingConv::ID>((~(1U << 14) & CCInfo) >> 1));
> @@ -4775,6 +4813,63 @@
>    return std::error_code();
>  }
>  
> +std::error_code
> +BitcodeReader::parseBundleInfos(SmallVectorImpl<OperandBundleInfo> &Output,
> +                                unsigned AbbrevID, unsigned InstNum) {
> +  SmallVector<uint64_t, 64> Record;
> +
> +  auto ParseNextRecord = [&](unsigned RecordKind, const char *ErrorMsg) {

These could just be normal records (in the main `while (1)`), if the
bundles are emitted before the main instruction records.

> +    Record.clear();
> +
> +    BitstreamEntry Entry = Stream.advance();
> +    if (Entry.Kind != BitstreamEntry::Record)
> +      return error(ErrorMsg);
> +
> +    if (Stream.readRecord(AbbrevID, Record) != RecordKind)
> +      return error("Expected operand bundle header");
> +
> +    return std::error_code();
> +  };
> +
> +  if (std::error_code EC =
> +          ParseNextRecord(bitc::FUNC_CODE_OPERAND_BUNDLES_HEADER,
> +                          "Expected operand bundle header"))
> +    return EC;
> +
> +  unsigned OpBundleCount = (unsigned)Record[0];
> +  Output.reserve(OpBundleCount);
> +
> +  for (unsigned i = 0; i < OpBundleCount; i++) {

This loop can be implicit, gathering bundles implicitly.

> +    if (std::error_code EC = ParseNextRecord(bitc::FUNC_CODE_OPERAND_BUNDLE_TAG,
> +                                             "Expected operand bundle tag"))
> +      return EC;
> +
> +    std::string Tag;
> +    convertToString(Record, 0, Tag);
> +
> +    if (std::error_code EC =
> +            ParseNextRecord(bitc::FUNC_CODE_OPERAND_BUNDLE_CONTENTS,
> +                            "Expected operand bundle contents"))
> +      return EC;
> +
> +    SmallVector<Value *, 16> Inputs;
> +
> +    unsigned OpNum = 0;
> +    while (OpNum != Record.size()) {
> +      Value *Op;
> +      if (getValueTypePair(Record, OpNum, InstNum, Op))
> +        return error("Invalid record");
> +      Inputs.push_back(Op);
> +    }
> +
> +    Output.emplace_back();
> +    Output.back().Tag = std::move(Tag);
> +    Output.back().Inputs = std::move(Inputs);
> +  }
> +
> +  return std::error_code();
> +}
> +
>  std::vector<StructType *> BitcodeReader::getIdentifiedStructTypes() const {
>    return IdentifiedStructTypes;
>  }
> Index: include/llvm/Bitcode/LLVMBitCodes.h
> ===================================================================
> --- include/llvm/Bitcode/LLVMBitCodes.h
> +++ include/llvm/Bitcode/LLVMBitCodes.h
> @@ -363,6 +363,13 @@
>      FUNC_CODE_INST_CLEANUPPAD = 52, // CLEANUPPAD: [num,args...]
>      FUNC_CODE_INST_CATCHENDPAD = 53, // CATCHENDPAD: [] or [bb#]
>      FUNC_CODE_INST_CLEANUPENDPAD = 54, // CLEANUPENDPAD: [val] or [val,bb#]
> +
> +    FUNC_CODE_INST_CALL_WITH_OP_BUNDLE = 55, // CALL_WITH_OP_BUNDLE: CALL, [operand bundle]
> +    FUNC_CODE_INST_INVOKE_WITH_OP_BUNDLE = 56, // INVOKE_WITH_OP_BUNDLE: INVOKE, [operand bundle]
> +
> +    FUNC_CODE_OPERAND_BUNDLES_HEADER = 57, // BUNDLES_HEADER: [num]
> +    FUNC_CODE_OPERAND_BUNDLE_TAG = 58, // BUNDLE_TAG: [strchr x N]
> +    FUNC_CODE_OPERAND_BUNDLE_CONTENTS = 59, // BUNDLE_CONTENTS: [ values ...]

I think you just need one new FUNC_CODE, `FUNC_CODE_OPERAND_BUNDLE`.

>    };
>  
>    enum UseListCodes {
> Index: docs/LangRef.rst
> ===================================================================
> --- docs/LangRef.rst
> +++ docs/LangRef.rst
> @@ -1438,7 +1438,32 @@
>      the ELF x86-64 abi, but it can be disabled for some compilation
>      units.
>  
> -.. _moduleasm:

Did you mean to delete this anchor?

> +
> +.. _opbundles:
> +
> +Operand Bundles
> +---------------
> +
> +Operand bundles are tagged sets of SSA values that can be associated
> +with certain LLVM instructions (currently only ``call``s and
> +``invoke``s).  In a way they are like metadata, but dropping them is
> +incorrect and will change program semantics.
> +
> +Syntax::
> +    operand bundle set ::= '[' operand bundle ']'
> +    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
> +    bundle operand ::= SSA value
> +    tag ::= string constant
> +
> +Operand bundles are **not** part of a function's signature, and a
> +given function may be called from multiple places with different kinds
> +of operand bundles.  This reflects the fact that the operand bundles
> +are conceptually a part of the ``call`` (or ``invoke``), not the
> +callee being dispatched to.
> +
> +Operand bundles are a generic mechanism intended to support
> +runtime-introspection like functionality for managed languages.  The

I would parse this better if "like" were hyphenated, as in "hyphen-like
syntax".

> +exact semantics of an operand bundle is dependent on the bundle tag.

(I'm assuming you'll describe the tags here as they're added?)

>  
>  Module-Level Inline Assembly
>  ----------------------------
> @@ -5017,7 +5042,7 @@
>  ::
>  
>        <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs]
> -                    to label <normal label> unwind label <exception label>
> +                    [ operand bundles ] to label <normal label> unwind label <exception label>
>  
>  Overview:
>  """""""""
> @@ -5071,6 +5096,7 @@
>  #. The optional :ref:`function attributes <fnattrs>` list. Only
>     '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``'
>     attributes are valid here.
> +#. The optional :ref:`operand bundles <opbundles>` list.
>  
>  Semantics:
>  """"""""""
> @@ -8259,6 +8285,7 @@
>  ::
>  
>        <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs]
> +                   [ operand bundles ]

Style seems to be "[operand bundles]", without spaces inside the
brackets.

>  
>  Overview:
>  """""""""
> @@ -8338,6 +8365,7 @@
>  #. The optional :ref:`function attributes <fnattrs>` list. Only
>     '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``'
>     attributes are valid here.
> +#. The optional :ref:`operand bundles <opbundles>` list.
>  
>  Semantics:
>  """"""""""
> 



More information about the llvm-commits mailing list