[PATCH] Implement function prefix data as an IR feature.

Peter Collingbourne peter at pcc.me.uk
Wed Sep 4 21:33:51 PDT 2013


Ping^2.

On Wed, Aug 14, 2013 at 03:49:51PM -0700, Peter Collingbourne wrote:
> Ping.
> 
> On Fri, Aug 02, 2013 at 03:21:32PM -0700, Peter Collingbourne wrote:
> >     - Move prefix data storage into a side table.
> > 
> > http://llvm-reviews.chandlerc.com/D1191
> > 
> > CHANGE SINCE LAST DIFF
> >   http://llvm-reviews.chandlerc.com/D1191?vs=2928&id=3173#toc
> > 
> > Files:
> >   docs/BitCodeFormat.rst
> >   docs/LangRef.rst
> >   include/llvm/IR/Function.h
> >   lib/AsmParser/LLLexer.cpp
> >   lib/AsmParser/LLParser.cpp
> >   lib/AsmParser/LLToken.h
> >   lib/Bitcode/Reader/BitcodeReader.cpp
> >   lib/Bitcode/Reader/BitcodeReader.h
> >   lib/Bitcode/Writer/BitcodeWriter.cpp
> >   lib/Bitcode/Writer/ValueEnumerator.cpp
> >   lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> >   lib/IR/AsmWriter.cpp
> >   lib/IR/Function.cpp
> >   lib/IR/LLVMContextImpl.h
> >   lib/IR/TypeFinder.cpp
> >   lib/Transforms/IPO/GlobalDCE.cpp
> >   test/CodeGen/X86/prefixdata.ll
> >   test/Feature/prefixdata.ll
> 
> > Index: docs/BitCodeFormat.rst
> > ===================================================================
> > --- docs/BitCodeFormat.rst
> > +++ docs/BitCodeFormat.rst
> > @@ -718,7 +718,7 @@
> >  MODULE_CODE_FUNCTION Record
> >  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >  
> > -``[FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]``
> > +``[FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc, unnamed_addr, prefix]``
> >  
> >  The ``FUNCTION`` record (code 8) marks the declaration or definition of a
> >  function. The operand fields are:
> > @@ -757,6 +757,9 @@
> >  * *unnamed_addr*: If present and non-zero, indicates that the function has
> >    ``unnamed_addr``
> >  
> > +* *prefix*: If non-zero, the value index of the prefix data for this function,
> > +  plus 1.
> > +
> >  MODULE_CODE_ALIAS Record
> >  ^^^^^^^^^^^^^^^^^^^^^^^^
> >  
> > Index: docs/LangRef.rst
> > ===================================================================
> > --- docs/LangRef.rst
> > +++ docs/LangRef.rst
> > @@ -552,16 +552,16 @@
> >  name, a (possibly empty) argument list (each with optional :ref:`parameter
> >  attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
> >  an optional section, an optional alignment, an optional :ref:`garbage
> > -collector name <gc>`, an opening curly brace, a list of basic blocks,
> > -and a closing curly brace.
> > +collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
> > +curly brace, a list of basic blocks, and a closing curly brace.
> >  
> >  LLVM function declarations consist of the "``declare``" keyword, an
> >  optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
> >  style <visibility>`, an optional :ref:`calling convention <callingconv>`,
> >  an optional ``unnamed_addr`` attribute, a return type, an optional
> >  :ref:`parameter attribute <paramattrs>` for the return type, a function
> > -name, a possibly empty list of arguments, an optional alignment, and an
> > -optional :ref:`garbage collector name <gc>`.
> > +name, a possibly empty list of arguments, an optional alignment, an optional
> > +:ref:`garbage collector name <gc>`, and an optional :ref:`prefix <prefixdata>`.
> >  
> >  A function definition contains a list of basic blocks, forming the CFG
> >  (Control Flow Graph) for the function. Each basic block may optionally
> > @@ -598,7 +598,7 @@
> >             [cconv] [ret attrs]
> >             <ResultType> @<FunctionName> ([argument list])
> >             [fn Attrs] [section "name"] [align N]
> > -           [gc] { ... }
> > +           [gc] [prefix Constant] { ... }
> >  
> >  .. _langref_aliases:
> >  
> > @@ -757,6 +757,50 @@
> >  collector which will cause the compiler to alter its output in order to
> >  support the named garbage collection algorithm.
> >  
> > +.. _prefixdata:
> > +
> > +Prefix Data
> > +-----------
> > +
> > +Prefix data is data associated with a function which the code generator
> > +will emit immediately before the function body.  The purpose of this feature
> > +is to allow frontends to associate language-specific runtime metadata with
> > +specific functions and make it available through the function pointer while
> > +still allowing the function pointer to be called.  To access the data for a
> > +given function, a program may bitcast the function pointer to a pointer to
> > +the constant's type.  This implies that the IR symbol points to the start
> > +of the prefix data.
> > +
> > +To maintain the semantics of ordinary function calls, the prefix data must
> > +have a particular format.  Specifically, it must begin with a sequence of
> > +bytes which decode to a sequence of machine instructions, valid for the
> > +module's target, which transfer control to the point immediately succeeding
> > +the prefix data, without performing any other visible action.  This allows
> > +the inliner and other passes to reason about the semantics of the function
> > +definition without needing to reason about the prefix data.  Obviously this
> > +makes the format of the prefix data highly target dependent.
> > +
> > +A trivial example of valid prefix data for the x86 architecture is ``i8 144``,
> > +which encodes the ``nop`` instruction:
> > +
> > +.. code-block:: llvm
> > +
> > +    define void @f() prefix i8 144 { ... }
> > +
> > +Generally prefix data can be formed by encoding a relative branch instruction
> > +which skips the metadata, as in this example of valid prefix data for the
> > +x86_64 architecture, where the first two bytes encode ``jmp .+10``:
> > +
> > +.. code-block:: llvm
> > +
> > +    %0 = type <{ i8, i8, i8* }>
> > +
> > +    define void @f() prefix %0 <{ i8 235, i8 8, i8* @md}> { ... }
> > +
> > +A function may have prefix data but no body.  This has similar semantics
> > +to the ``available_externally`` linkage in that the data may be used by the
> > +optimizers but will not be emitted in the object file.
> > +
> >  .. _attrgrp:
> >  
> >  Attribute Groups
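To make the ``jmp .+10`` example in the Prefix Data text above concrete, here is a minimal sketch of how a frontend might lay out such bytes for x86. The helper name makeX86PrefixData is hypothetical and not part of this patch; it only assumes the short-jump encoding (opcode 0xEB followed by an 8-bit displacement measured from the end of the jump), which is why 8 bytes of metadata give the byte pair 235, 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the patch: build x86 prefix data as a
// two-byte short jump (0xEB, rel8) followed by the metadata bytes.
// rel8 is relative to the end of the jump, so it equals the metadata size;
// in assembler notation this is "jmp .+N" with N = metadata size + 2.
std::vector<uint8_t> makeX86PrefixData(const std::vector<uint8_t> &Metadata) {
  // rel8 is signed, so a forward skip is limited to 127 bytes.
  assert(Metadata.size() <= 127 && "metadata too large for a short jump");
  std::vector<uint8_t> Prefix;
  Prefix.push_back(0xEB); // 235: jmp rel8
  Prefix.push_back(static_cast<uint8_t>(Metadata.size()));
  Prefix.insert(Prefix.end(), Metadata.begin(), Metadata.end());
  return Prefix;
}
```

With an 8-byte pointer as metadata, the result is 10 bytes long, matching the packed ``<{ i8, i8, i8* }>`` struct in the LangRef example on x86_64. Larger metadata would need a near jump with a 32-bit displacement, which this sketch does not attempt.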
> > Index: include/llvm/IR/Function.h
> > ===================================================================
> > --- include/llvm/IR/Function.h
> > +++ include/llvm/IR/Function.h
> > @@ -159,11 +159,11 @@
> >    /// calling convention of this function.  The enum values for the known
> >    /// calling conventions are defined in CallingConv.h.
> >    CallingConv::ID getCallingConv() const {
> > -    return static_cast<CallingConv::ID>(getSubclassDataFromValue() >> 1);
> > +    return static_cast<CallingConv::ID>(getSubclassDataFromValue() >> 2);
> >    }
> >    void setCallingConv(CallingConv::ID CC) {
> > -    setValueSubclassData((getSubclassDataFromValue() & 1) |
> > -                         (static_cast<unsigned>(CC) << 1));
> > +    setValueSubclassData((getSubclassDataFromValue() & 3) |
> > +                         (static_cast<unsigned>(CC) << 2));
> >    }
> >  
> >    /// @brief Return the attribute list for this Function.
> > @@ -427,6 +427,13 @@
> >    size_t arg_size() const;
> >    bool arg_empty() const;
> >  
> > +  bool hasPrefixData() const {
> > +    return getSubclassDataFromValue() & 2;
> > +  }
> > +
> > +  Constant *getPrefixData() const;
> > +  void setPrefixData(Constant *PrefixData);
> > +
> >    /// viewCFG - This function is meant for use from the debugger.  You can just
> >    /// say 'call F->viewCFG()' and a ghostview window should pop up from the
> >    /// program, displaying the CFG of the current function with the code for each
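For reviewers checking the shifts in the Function.h hunk above: the change moves the calling convention up by one bit so that bit value 2 can carry the "has prefix data" flag, while the lowest bit keeps whatever meaning it already had. A stand-alone model of that layout might look as follows. SubclassBits is a hypothetical stand-in for the Value subclass data and is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the layout after this patch:
//   bit 0    : existing flag, untouched by the patch
//   bit 1    : has prefix data (mask 2)
//   bits 2.. : calling convention
struct SubclassBits {
  uint16_t Data = 0;

  bool hasPrefixData() const { return (Data & 2) != 0; }
  void setHasPrefixData(bool V) {
    Data = static_cast<uint16_t>(V ? (Data | 2) : (Data & ~2));
  }

  unsigned getCallingConv() const { return Data >> 2; }
  void setCallingConv(unsigned CC) {
    // Preserve the two low flag bits, as the "& 3" in the patch does.
    Data = static_cast<uint16_t>((Data & 3) | (CC << 2));
  }
};
```

The point of the ``& 3`` mask is that changing the calling convention must not clear the prefix data flag, and toggling the flag must not disturb the calling convention.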
> > Index: lib/AsmParser/LLLexer.cpp
> > ===================================================================
> > --- lib/AsmParser/LLLexer.cpp
> > +++ lib/AsmParser/LLLexer.cpp
> > @@ -540,6 +540,7 @@
> >    KEYWORD(alignstack);
> >    KEYWORD(inteldialect);
> >    KEYWORD(gc);
> > +  KEYWORD(prefix);
> >  
> >    KEYWORD(ccc);
> >    KEYWORD(fastcc);
> > Index: lib/AsmParser/LLParser.cpp
> > ===================================================================
> > --- lib/AsmParser/LLParser.cpp
> > +++ lib/AsmParser/LLParser.cpp
> > @@ -2919,7 +2919,7 @@
> >  /// FunctionHeader
> >  ///   ::= OptionalLinkage OptionalVisibility OptionalCallingConv OptRetAttrs
> >  ///       OptUnnamedAddr Type GlobalName '(' ArgList ')' OptFuncAttrs OptSection
> > -///       OptionalAlign OptGC
> > +///       OptionalAlign OptGC OptionalPrefix
> >  bool LLParser::ParseFunctionHeader(Function *&Fn, bool isDefine) {
> >    // Parse the linkage.
> >    LocTy LinkageLoc = Lex.getLoc();
> > @@ -2998,6 +2998,7 @@
> >    std::string GC;
> >    bool UnnamedAddr;
> >    LocTy UnnamedAddrLoc;
> > +  Constant *Prefix = 0;
> >  
> >    if (ParseArgumentList(ArgList, isVarArg) ||
> >        ParseOptionalToken(lltok::kw_unnamed_addr, UnnamedAddr,
> > @@ -3008,7 +3009,9 @@
> >         ParseStringConstant(Section)) ||
> >        ParseOptionalAlignment(Alignment) ||
> >        (EatIfPresent(lltok::kw_gc) &&
> > -       ParseStringConstant(GC)))
> > +       ParseStringConstant(GC)) ||
> > +      (EatIfPresent(lltok::kw_prefix) &&
> > +       ParseGlobalTypeAndValue(Prefix)))
> >      return true;
> >  
> >    if (FuncAttrs.contains(Attribute::Builtin))
> > @@ -3106,6 +3109,7 @@
> >    Fn->setAlignment(Alignment);
> >    Fn->setSection(Section);
> >    if (!GC.empty()) Fn->setGC(GC.c_str());
> > +  Fn->setPrefixData(Prefix);
> >    ForwardRefAttrGroups[Fn] = FwdRefAttrGrps;
> >  
> >    // Add all of the arguments we parsed to the function.
> > Index: lib/AsmParser/LLToken.h
> > ===================================================================
> > --- lib/AsmParser/LLToken.h
> > +++ lib/AsmParser/LLToken.h
> > @@ -81,6 +81,7 @@
> >      kw_alignstack,
> >      kw_inteldialect,
> >      kw_gc,
> > +    kw_prefix,
> >      kw_c,
> >  
> >      kw_cc, kw_ccc, kw_fastcc, kw_coldcc,
> > Index: lib/Bitcode/Reader/BitcodeReader.cpp
> > ===================================================================
> > --- lib/Bitcode/Reader/BitcodeReader.cpp
> > +++ lib/Bitcode/Reader/BitcodeReader.cpp
> > @@ -1103,9 +1103,11 @@
> >  bool BitcodeReader::ResolveGlobalAndAliasInits() {
> >    std::vector<std::pair<GlobalVariable*, unsigned> > GlobalInitWorklist;
> >    std::vector<std::pair<GlobalAlias*, unsigned> > AliasInitWorklist;
> > +  std::vector<std::pair<Function*, unsigned> > FunctionPrefixWorklist;
> >  
> >    GlobalInitWorklist.swap(GlobalInits);
> >    AliasInitWorklist.swap(AliasInits);
> > +  FunctionPrefixWorklist.swap(FunctionPrefixes);
> >  
> >    while (!GlobalInitWorklist.empty()) {
> >      unsigned ValID = GlobalInitWorklist.back().second;
> > @@ -1133,6 +1135,20 @@
> >      }
> >      AliasInitWorklist.pop_back();
> >    }
> > +
> > +  while (!FunctionPrefixWorklist.empty()) {
> > +    unsigned ValID = FunctionPrefixWorklist.back().second;
> > +    if (ValID >= ValueList.size()) {
> > +      FunctionPrefixes.push_back(FunctionPrefixWorklist.back());
> > +    } else {
> > +      if (Constant *C = dyn_cast<Constant>(ValueList[ValID]))
> > +        FunctionPrefixWorklist.back().first->setPrefixData(C);
> > +      else
> > +        return Error("Function prefix is not a constant!");
> > +    }
> > +    FunctionPrefixWorklist.pop_back();
> > +  }
> > +
> >    return false;
> >  }
> >  
> > @@ -1869,6 +1885,8 @@
> >        if (Record.size() > 9)
> >          UnnamedAddr = Record[9];
> >        Func->setUnnamedAddr(UnnamedAddr);
> > +      if (Record.size() > 10 && Record[10] != 0)
> > +        FunctionPrefixes.push_back(std::make_pair(Func, Record[10]-1));
> >        ValueList.push_back(Func);
> >  
> >        // If this is a function with a body, remember the prototype we are
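The "value index plus 1" encoding used by the reader hunk above (and by the matching writer change) can be summarised in a few lines. This is only a sketch under the assumption that the prefix field is the eleventh operand (index 10) of the FUNCTION record, as in the patch; the function names encodePrefixField and decodePrefixField are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Writer side: 0 means "no prefix data", anything else is value ID + 1.
uint64_t encodePrefixField(bool HasPrefix, unsigned ValueID) {
  return HasPrefix ? static_cast<uint64_t>(ValueID) + 1 : 0;
}

// Reader side: returns true and sets ValueID when the record carries prefix
// data. Records written before this change have at most 10 operands and are
// treated as having none.
bool decodePrefixField(const std::vector<uint64_t> &Record, unsigned &ValueID) {
  if (Record.size() > 10 && Record[10] != 0) {
    ValueID = static_cast<unsigned>(Record[10] - 1);
    return true;
  }
  return false;
}
```

Reserving 0 for "absent" is what lets value ID 0 remain usable as prefix data, and the size check keeps older bitcode readable.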
> > Index: lib/Bitcode/Reader/BitcodeReader.h
> > ===================================================================
> > --- lib/Bitcode/Reader/BitcodeReader.h
> > +++ lib/Bitcode/Reader/BitcodeReader.h
> > @@ -142,6 +142,7 @@
> >  
> >    std::vector<std::pair<GlobalVariable*, unsigned> > GlobalInits;
> >    std::vector<std::pair<GlobalAlias*, unsigned> > AliasInits;
> > +  std::vector<std::pair<Function*, unsigned> > FunctionPrefixes;
> >  
> >    /// MAttributes - The set of attributes by index.  Index zero in the
> >    /// file is for null, and is thus not represented here.  As such all indices
> > Index: lib/Bitcode/Writer/BitcodeWriter.cpp
> > ===================================================================
> > --- lib/Bitcode/Writer/BitcodeWriter.cpp
> > +++ lib/Bitcode/Writer/BitcodeWriter.cpp
> > @@ -633,7 +633,7 @@
> >    // Emit the function proto information.
> >    for (Module::const_iterator F = M->begin(), E = M->end(); F != E; ++F) {
> >      // FUNCTION:  [type, callingconv, isproto, linkage, paramattrs, alignment,
> > -    //             section, visibility, gc, unnamed_addr]
> > +    //             section, visibility, gc, unnamed_addr, prefix]
> >      Vals.push_back(VE.getTypeID(F->getType()));
> >      Vals.push_back(F->getCallingConv());
> >      Vals.push_back(F->isDeclaration());
> > @@ -644,6 +644,8 @@
> >      Vals.push_back(getEncodedVisibility(F));
> >      Vals.push_back(F->hasGC() ? GCMap[F->getGC()] : 0);
> >      Vals.push_back(F->hasUnnamedAddr());
> > +    Vals.push_back(F->hasPrefixData() ? (VE.getValueID(F->getPrefixData()) + 1)
> > +                                      : 0);
> >  
> >      unsigned AbbrevToUse = 0;
> >      Stream.EmitRecord(bitc::MODULE_CODE_FUNCTION, Vals, AbbrevToUse);
> > @@ -1930,6 +1932,8 @@
> >      WriteUseList(FI, VE, Stream);
> >      if (!FI->isDeclaration())
> >        WriteFunctionUseList(FI, VE, Stream);
> > +    if (FI->hasPrefixData())
> > +      WriteUseList(FI->getPrefixData(), VE, Stream);
> >    }
> >  
> >    // Write the aliases.
> > Index: lib/Bitcode/Writer/ValueEnumerator.cpp
> > ===================================================================
> > --- lib/Bitcode/Writer/ValueEnumerator.cpp
> > +++ lib/Bitcode/Writer/ValueEnumerator.cpp
> > @@ -60,6 +60,11 @@
> >         I != E; ++I)
> >      EnumerateValue(I->getAliasee());
> >  
> > +  // Enumerate the prefix data constants.
> > +  for (Module::const_iterator I = M->begin(), E = M->end(); I != E; ++I)
> > +    if (I->hasPrefixData())
> > +      EnumerateValue(I->getPrefixData());
> > +
> >    // Insert constants and metadata that are named at module level into the slot
> >    // pool so that the module symbol table can refer to them...
> >    EnumerateValueSymbolTable(M->getValueSymbolTable());
> > Index: lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> > ===================================================================
> > --- lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> > +++ lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> > @@ -468,6 +468,10 @@
> >      OutStreamer.EmitLabel(FakeStub);
> >    }
> >  
> > +  // Emit the prefix data.
> > +  if (F->hasPrefixData())
> > +    EmitGlobalConstant(F->getPrefixData());
> > +
> >    // Emit pre-function debug and/or EH information.
> >    if (DE) {
> >      NamedRegionTimer T(EHTimerName, DWARFGroupName, TimePassesIsEnabled);
> > Index: lib/IR/AsmWriter.cpp
> > ===================================================================
> > --- lib/IR/AsmWriter.cpp
> > +++ lib/IR/AsmWriter.cpp
> > @@ -1647,6 +1647,10 @@
> >      Out << " align " << F->getAlignment();
> >    if (F->hasGC())
> >      Out << " gc \"" << F->getGC() << '"';
> > +  if (F->hasPrefixData()) {
> > +    Out << " prefix ";
> > +    writeOperand(F->getPrefixData(), true);
> > +  }
> >    if (F->isDeclaration()) {
> >      Out << '\n';
> >    } else {
> > Index: lib/IR/Function.cpp
> > ===================================================================
> > --- lib/IR/Function.cpp
> > +++ lib/IR/Function.cpp
> > @@ -276,6 +276,9 @@
> >    // blockaddresses, but BasicBlock's destructor takes care of those.
> >    while (!BasicBlocks.empty())
> >      BasicBlocks.begin()->eraseFromParent();
> > +
> > +  // Prefix data is stored in a side table.
> > +  setPrefixData(0);
> >  }
> >  
> >  void Function::addAttribute(unsigned i, Attribute::AttrKind attr) {
> > @@ -351,6 +354,10 @@
> >      setGC(SrcF->getGC());
> >    else
> >      clearGC();
> > +  if (SrcF->hasPrefixData())
> > +    setPrefixData(SrcF->getPrefixData());
> > +  else
> > +    setPrefixData(0);
> >  }
> >  
> >  /// getIntrinsicID - This method returns the ID number of the specified
> > @@ -720,3 +727,32 @@
> >  
> >    return false;
> >  }
> > +
> > +Constant *Function::getPrefixData() const {
> > +  assert(hasPrefixData());
> > +  const LLVMContextImpl::PrefixDataMapTy &PDMap =
> > +      getContext().pImpl->PrefixDataMap;
> > +  assert(PDMap.find(this) != PDMap.end());
> > +  return cast<Constant>(PDMap.find(this)->second->getReturnValue());
> > +}
> > +
> > +void Function::setPrefixData(Constant *PrefixData) {
> > +  if (!PrefixData && !hasPrefixData())
> > +    return;
> > +
> > +  unsigned SCData = getSubclassDataFromValue();
> > +  LLVMContextImpl::PrefixDataMapTy &PDMap = getContext().pImpl->PrefixDataMap;
> > +  ReturnInst *&PDHolder = PDMap[this];
> > +  if (PrefixData) {
> > +    if (PDHolder)
> > +      PDHolder->setOperand(0, PrefixData);
> > +    else
> > +      PDHolder = ReturnInst::Create(getContext(), PrefixData);
> > +    SCData |= 2;
> > +  } else {
> > +    delete PDHolder;
> > +    PDMap.erase(this);
> > +    SCData &= ~2;
> > +  }
> > +  setValueSubclassData(SCData);
> > +}
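The side-table scheme in the Function.cpp hunk above pairs a flag bit on the function with a map entry owned by the context. A reduced model, using std::unordered_map and a string in place of DenseMap and Constant, is sketched below. Fn, PrefixTable, setPrefix and getPrefix are all hypothetical names, and the sketch leaves out the unparented ReturnInst that the patch uses so that the constant has a Use.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Fn {           // hypothetical stand-in for llvm::Function
  unsigned Flags = 0; // bit value 2 mirrors hasPrefixData()
};

using PrefixTable = std::unordered_map<const Fn *, std::string>;

// Passing a null pointer clears the entry and the flag, like setPrefixData(0).
void setPrefix(PrefixTable &Table, Fn &F, const std::string *Data) {
  if (!Data && !(F.Flags & 2))
    return; // nothing stored and nothing to clear: avoid touching the map
  if (Data) {
    Table[&F] = *Data;
    F.Flags |= 2;
  } else {
    Table.erase(&F);
    F.Flags &= ~2u;
  }
}

const std::string &getPrefix(const PrefixTable &Table, const Fn &F) {
  assert((F.Flags & 2) && "caller must check the flag first");
  return Table.at(&F);
}
```

The early return matters for the common case: functions without prefix data never create a map entry, so only functions that use the feature pay for the table.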
> > Index: lib/IR/LLVMContextImpl.h
> > ===================================================================
> > --- lib/IR/LLVMContextImpl.h
> > +++ lib/IR/LLVMContextImpl.h
> > @@ -355,6 +355,11 @@
> >    typedef DenseMap<const Function*, unsigned> IntrinsicIDCacheTy;
> >    IntrinsicIDCacheTy IntrinsicIDCache;
> >  
> > +  /// \brief Mapping from a function to its prefix data, which is stored as the
> > +  /// operand of an unparented ReturnInst so that the prefix data has a Use.
> > +  typedef DenseMap<const Function *, ReturnInst *> PrefixDataMapTy;
> > +  PrefixDataMapTy PrefixDataMap;
> > +
> >    int getOrAddScopeRecordIdxEntry(MDNode *N, int ExistingIdx);
> >    int getOrAddScopeInlinedAtIdxEntry(MDNode *Scope, MDNode *IA,int ExistingIdx);
> >    
> > Index: lib/IR/TypeFinder.cpp
> > ===================================================================
> > --- lib/IR/TypeFinder.cpp
> > +++ lib/IR/TypeFinder.cpp
> > @@ -44,6 +44,9 @@
> >    for (Module::const_iterator FI = M.begin(), E = M.end(); FI != E; ++FI) {
> >      incorporateType(FI->getType());
> >  
> > +    if (FI->hasPrefixData())
> > +      incorporateValue(FI->getPrefixData());
> > +
> >      // First incorporate the arguments.
> >      for (Function::const_arg_iterator AI = FI->arg_begin(),
> >             AE = FI->arg_end(); AI != AE; ++AI)
> > Index: lib/Transforms/IPO/GlobalDCE.cpp
> > ===================================================================
> > --- lib/Transforms/IPO/GlobalDCE.cpp
> > +++ lib/Transforms/IPO/GlobalDCE.cpp
> > @@ -179,6 +179,9 @@
> >      // any globals used will be marked as needed.
> >      Function *F = cast<Function>(G);
> >  
> > +    if (F->hasPrefixData())
> > +      MarkUsedGlobalsAsNeeded(F->getPrefixData());
> > +
> >      for (Function::iterator BB = F->begin(), E = F->end(); BB != E; ++BB)
> >        for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I)
> >          for (User::op_iterator U = I->op_begin(), E = I->op_end(); U != E; ++U)
> > Index: test/CodeGen/X86/prefixdata.ll
> > ===================================================================
> > --- /dev/null
> > +++ test/CodeGen/X86/prefixdata.ll
> > @@ -0,0 +1,15 @@
> > +; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s
> > +
> > +@i = linkonce_odr global i32 1
> > +
> > +; CHECK: f:
> > +; CHECK-NEXT: .long	1
> > +define void @f() prefix i32 1 {
> > +  ret void
> > +}
> > +
> > +; CHECK: g:
> > +; CHECK-NEXT: .quad	i
> > +define void @g() prefix i32* @i {
> > +  ret void
> > +}
> > Index: test/Feature/prefixdata.ll
> > ===================================================================
> > --- /dev/null
> > +++ test/Feature/prefixdata.ll
> > @@ -0,0 +1,18 @@
> > +; RUN: llvm-as < %s | llvm-dis > %t1.ll
> > +; RUN: FileCheck %s < %t1.ll
> > +; RUN: llvm-as < %t1.ll | llvm-dis > %t2.ll
> > +; RUN: diff %t1.ll %t2.ll
> > +; RUN: opt -O3 -S < %t1.ll | FileCheck %s
> > +
> > +; CHECK: @i
> > +@i = linkonce_odr global i32 1
> > +
> > +; CHECK: f(){{.*}}prefix i32 1
> > +define void @f() prefix i32 1 {
> > +  ret void
> > +}
> > +
> > +; CHECK: g(){{.*}}prefix i32* @i
> > +define void @g() prefix i32* @i {
> > +  ret void
> > +}
> 
> 
> -- 
> Peter

-- 
Peter


