[llvm-commits] [llvm] r72700 - in /llvm/trunk/tools/lto: LTOModule.cpp LTOModule.h

Nick Kledzik kledzik at apple.com
Mon Jun 1 14:43:20 PDT 2009


On Jun 1, 2009, at 2:11 PM, Nick Lewycky wrote:

> 2009/6/1 Nick Kledzik <kledzik at apple.com>
> Author: kledzik
> Date: Mon Jun  1 15:33:09 2009
> New Revision: 72700
>
> URL: http://llvm.org/viewvc/llvm-project?rev=72700&view=rev
> Log:
> <rdar://problem/6927148> libLTO needs to handle i386 magic objc  
> class symbols
> Parse __OBJC data structures and synthesize magic .objc_ symbols.
> Also, alter mangler so that objc method names are readable.
>
> Hi Nick, could you please elaborate on why exactly libLTO needs to  
> change? While I don't doubt that this fixes a bug, I can't see how  
> this language-specific logic could possibly belong in libLTO.
It is not just language specific.  It is only for Objective-C for ppc
or i386 on Darwin.  My thinking was that, because the extra logic is
only executed if the GlobalVariable has a custom section with a specific
name, it would not impact other languages or architectures.
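
As a rough illustration of that gating (a minimal sketch, not the patch
itself): the enum and the function name below are hypothetical, while the
three section prefixes and their lengths are the ones the patch compares
against.  Any section name that does not start with one of these prefixes
falls through untouched.

```cpp
#include <string>

// Hypothetical classification of a global's section name.
enum class ObjCSectionKind { None, Class, Category, ClassRefs };

// Returns which (if any) old-ABI ObjC section the name belongs to.
// std::string::compare(pos, len, s) compares at most `len` characters of
// the section name against the whole prefix, so shorter names never match.
ObjCSectionKind classifyObjCSection(const std::string& section) {
    if (section.compare(0, 15, "__OBJC,__class,") == 0)
        return ObjCSectionKind::Class;
    if (section.compare(0, 18, "__OBJC,__category,") == 0)
        return ObjCSectionKind::Category;
    if (section.compare(0, 18, "__OBJC,__cls_refs,") == 0)
        return ObjCSectionKind::ClassRefs;
    return ObjCSectionKind::None;   // other languages and sections land here
}
```

The trailing comma in each prefix matters: a section whose name merely
begins with "__OBJC,__class" but continues with other characters is not
treated as a class definition.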

The issue is that the old ObjC object format did some strange
contortions to avoid real linker symbols.  For instance, the ObjC class
data structure is allocated statically in the executable that defines
it.  That data structure contains a pointer to its superclass.  But
instead of just initializing that part of the struct to the address of
its superclass, and letting the static and dynamic linkers do the
rest, the runtime works by having that field, on disk, point to a
C-string that is the name of the superclass.  At runtime the objc
initialization swaps that pointer out to point to the actual
superclass.  As far as the linkers know it is just a pointer to a string.
But someone wanted the linker to issue errors at build time if the  
superclass was not found.  So they figured out a way in mach-o object  
format to use an absolute symbol (.objc_class_name_Foo = 0) and a  
floating reference ( .reference .objc_class_name_Bar) to trick the  
linker into erroring when a class was missing.   This patch emulates  
that same behavior.
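
To make the trick concrete, here is a minimal sketch of the check the
linker ends up performing.  It is only a model: ObjectFile and
findMissingClasses are hypothetical names, and a real linker does far
more than compare two sets of strings.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one object file's magic ObjC symbols.
struct ObjectFile {
    std::set<std::string> defined;     // e.g. .objc_class_name_Foo = 0
    std::set<std::string> referenced;  // e.g. .reference .objc_class_name_Bar
};

// Returns every referenced class symbol that no object file defines,
// which is the condition under which the linker reports an error.
std::vector<std::string> findMissingClasses(const std::vector<ObjectFile>& objects) {
    std::set<std::string> allDefined;
    for (const ObjectFile& obj : objects)
        allDefined.insert(obj.defined.begin(), obj.defined.end());

    std::set<std::string> reported;    // avoid listing a symbol twice
    std::vector<std::string> missing;
    for (const ObjectFile& obj : objects)
        for (const std::string& ref : obj.referenced)
            if (allDefined.count(ref) == 0 && reported.insert(ref).second)
                missing.push_back(ref);
    return missing;
}
```

The symbol values never matter here; only the presence or absence of a
definition does, which is why an absolute symbol with value 0 suffices.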

In addition, when processing mach-o files, the current Darwin linker  
ignores those absolute and floating reference symbols, and instead  
parses the __OBJC,__class data structures and infers those symbols.   
The libLTO.dylib interface does not allow the linker to view the
contents of a bitcode file, just its symbols.  So having libLTO
synthesize those symbols fits well with the Darwin linker.
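
Roughly, the synthesis amounts to the following (a simplified sketch,
not the committed code).  ClassRecord and SymbolTable are hypothetical
stand-ins for the __OBJC,__class initializer and for the module's
defined/undefined symbol tables, and an empty superclass name is an
assumption used here to model a class with no superclass string.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the two name slots of an __OBJC,__class struct.
struct ClassRecord {
    std::string superclassName;  // second slot; empty if there is none
    std::string className;       // third slot
};

// Hypothetical stand-in for the symbols a module reports to the linker.
struct SymbolTable {
    std::set<std::string> defines;
    std::set<std::string> undefines;
};

// Synthesizes the magic symbols: a definition for the class itself and an
// undefined reference to its superclass.
void addObjCClassSymbols(const ClassRecord& rec, SymbolTable& table) {
    const std::string prefix = ".objc_class_name_";
    if (!rec.superclassName.empty())
        table.undefines.insert(prefix + rec.superclassName);
    table.defines.insert(prefix + rec.className);
}
```

With symbols reported this way, the linker can apply its usual
undefined-symbol check without ever looking inside the bitcode.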

Would you prefer if the changes were wrapped in an isTargetDarwin() test?

-Nick



>
>
>
>
> Modified:
>    llvm/trunk/tools/lto/LTOModule.cpp
>    llvm/trunk/tools/lto/LTOModule.h
>
> Modified: llvm/trunk/tools/lto/LTOModule.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.cpp?rev=72700&r1=72699&r2=72700&view=diff
>
> ==============================================================================
> --- llvm/trunk/tools/lto/LTOModule.cpp (original)
> +++ llvm/trunk/tools/lto/LTOModule.cpp Mon Jun  1 15:33:09 2009
> @@ -14,6 +14,7 @@
>
>  #include "LTOModule.h"
>
> +#include "llvm/Constants.h"
>  #include "llvm/Module.h"
>  #include "llvm/ModuleProvider.h"
>  #include "llvm/ADT/OwningPtr.h"
> @@ -176,11 +177,123 @@
>     }
>  }
>
> -void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler &mangler)
> +// get string that data pointer points to
> +bool LTOModule::objcClassNameFromExpression(Constant* c, std::string& name)
> +{
> +    if (ConstantExpr* ce = dyn_cast<ConstantExpr>(c)) {
>
> Is there any reason not to just assert on this? Will it ever really  
> be called on constants that aren't known to be objcClassNames?
>
>
> +        Constant* op = ce->getOperand(0);
> +        if (GlobalVariable* gvn = dyn_cast<GlobalVariable>(op)) {
> +            Constant* cn = gvn->getInitializer();
> +            if (ConstantArray* ca = dyn_cast<ConstantArray>(cn)) {
> +                if ( ca->isCString() ) {
> +                    name = ".objc_class_name_" + ca->getAsString();
> +                    return true;
> +                }
> +            }
> +        }
> +    }
> +    return false;
> +}
> +
> +// parse i386/ppc ObjC class data structure
> +void LTOModule::addObjCClass(GlobalVariable* clgv)
> +{
> +    if (ConstantStruct* c = dyn_cast<ConstantStruct>(clgv->getInitializer())) {
>
> Here again, perhaps it should just assert that this is true?
>
>
> +        // second slot in __OBJC,__class is pointer to superclass name
> +        std::string superclassName;
> +        if ( objcClassNameFromExpression(c->getOperand(1), superclassName) ) {
> +            NameAndAttributes info;
> +            if ( _undefines.find(superclassName.c_str()) == _undefines.end() ) {
> +                const char* symbolName = ::strdup(superclassName.c_str());
> +                info.name = ::strdup(symbolName);
> +                info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
> +                // string is owned by _undefines
> +                _undefines[info.name] = info;
>
> Perhaps the non-libLTO side of the Apple linker should ignore  
> symbols it knows are objC specific? Or perhaps you could run a  
> domain-specific LLVM pass which would delete these globals as  
> needed? If somehow you need them to exist but still be marked  
> undefined, perhaps libLTO should expose an interface for marking  
> symbols undefined instead of an interface specific to Apple's  
> implementation of the objC language?
>
> It just seems like this is breaking encapsulation, the linker has  
> linking rules that don't much care what the source language was.
>
> Nick
>
>
> +            }
> +        }
> +        // third slot in __OBJC,__class is pointer to class name
> +        std::string className;
> +         if ( objcClassNameFromExpression(c->getOperand(2), className) ) {
> +            const char* symbolName = ::strdup(className.c_str());
> +            NameAndAttributes info;
> +            info.name = symbolName;
> +            info.attributes = (lto_symbol_attributes)
> +                (LTO_SYMBOL_PERMISSIONS_DATA |
> +                 LTO_SYMBOL_DEFINITION_REGULAR |
> +                 LTO_SYMBOL_SCOPE_DEFAULT);
> +            _symbols.push_back(info);
> +            _defines[info.name] = 1;
> +         }
> +    }
> +}
> +
> +
> +// parse i386/ppc ObjC category data structure
> +void LTOModule::addObjCCategory(GlobalVariable* clgv)
> +{
> +    if (ConstantStruct* c = dyn_cast<ConstantStruct>(clgv->getInitializer())) {
> +        // second slot in __OBJC,__category is pointer to target class name
> +        std::string targetclassName;
> +        if ( objcClassNameFromExpression(c->getOperand(1), targetclassName) ) {
> +            NameAndAttributes info;
> +            if ( _undefines.find(targetclassName.c_str()) == _undefines.end() ){
> +                const char* symbolName = ::strdup(targetclassName.c_str());
> +                info.name = ::strdup(symbolName);
> +                info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
> +                // string is owned by _undefines
> +               _undefines[info.name] = info;
> +            }
> +        }
> +    }
> +}
> +
> +
> +// parse i386/ppc ObjC class list data structure
> +void LTOModule::addObjCClassRef(GlobalVariable* clgv)
> +{
> +    std::string targetclassName;
> +    if ( objcClassNameFromExpression(clgv->getInitializer(), targetclassName) ){
> +        NameAndAttributes info;
> +        if ( _undefines.find(targetclassName.c_str()) == _undefines.end() ) {
> +            const char* symbolName = ::strdup(targetclassName.c_str());
> +            info.name = ::strdup(symbolName);
> +            info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
> +            // string is owned by _undefines
> +            _undefines[info.name] = info;
> +        }
> +    }
> +}
> +
> +
> +void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler& mangler)
>  {
>     // add to list of defined symbols
>     addDefinedSymbol(v, mangler, false);
>
> +    // special case i386/ppc ObjC data structures in magic sections
> +    if ( v->hasSection() ) {
> +        // special case if this data blob is an ObjC class definition
> +        if ( v->getSection().compare(0, 15, "__OBJC,__class,") == 0 ) {
> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
> +                addObjCClass(gv);
> +            }
> +        }
> +
> +        // special case if this data blob is an ObjC category definition
> +        else if ( v->getSection().compare(0, 18, "__OBJC,__category,") == 0 ) {
> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
> +                addObjCCategory(gv);
> +            }
> +        }
> +
> +        // special case if this data blob is the list of referenced classes
> +        else if ( v->getSection().compare(0, 18, "__OBJC,__cls_refs,") == 0 ) {
> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
> +                addObjCClassRef(gv);
> +            }
> +        }
> +    }
> +
>     // add external symbols referenced by this data.
>     for (unsigned count = 0, total = v->getNumOperands();
>                                                 count != total; ++count) {
> @@ -192,9 +305,13 @@
>  void LTOModule::addDefinedSymbol(GlobalValue* def, Mangler &mangler,
>                                 bool isFunction)
>  {
> +    // ignore all llvm.* symbols
> +    if ( strncmp(def->getNameStart(), "llvm.", 5) == 0 )
> +        return;
> +
>     // string is owned by _defines
>     const char* symbolName = ::strdup(mangler.getValueName(def).c_str());
> -
> +
>     // set alignment part log2() can have rounding errors
>     uint32_t align = def->getAlignment();
>     uint32_t attr = align ? CountTrailingZeros_32(def->getAlignment()) : 0;
> @@ -241,25 +358,28 @@
>  }
>
>  void LTOModule::addAsmGlobalSymbol(const char *name) {
> -  // string is owned by _defines
> -  const char *symbolName = ::strdup(name);
> -  uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
> -  attr |= LTO_SYMBOL_SCOPE_DEFAULT;
> -
> -  // add to table of symbols
> -  NameAndAttributes info;
> -  info.name = symbolName;
> -  info.attributes = (lto_symbol_attributes)attr;
> -  _symbols.push_back(info);
> -  _defines[info.name] = 1;
> +    // only add new define if not already defined
> +    if ( _defines.count(name, &name[strlen(name)+1]) == 0 )
> +        return;
> +
> +    // string is owned by _defines
> +    const char *symbolName = ::strdup(name);
> +    uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
> +    attr |= LTO_SYMBOL_SCOPE_DEFAULT;
> +    NameAndAttributes info;
> +    info.name = symbolName;
> +    info.attributes = (lto_symbol_attributes)attr;
> +    _symbols.push_back(info);
> +    _defines[info.name] = 1;
>  }
>
>  void LTOModule::addPotentialUndefinedSymbol(GlobalValue* decl, Mangler &mangler)
>  {
> -   const char* name = mangler.getValueName(decl).c_str();
>     // ignore all llvm.* symbols
> -    if ( strncmp(name, "llvm.", 5) == 0 )
> -      return;
> +    if ( strncmp(decl->getNameStart(), "llvm.", 5) == 0 )
> +        return;
> +
> +    const char* name = mangler.getValueName(decl).c_str();
>
>     // we already have the symbol
>     if (_undefines.find(name) != _undefines.end())
> @@ -306,6 +426,14 @@
>
>         // Use mangler to add GlobalPrefix to names to match linker names.
>         Mangler mangler(*_module, _target->getTargetAsmInfo()->getGlobalPrefix());
> +        // add chars used in ObjC method names so method names aren't mangled
> +        mangler.markCharAcceptable('[');
> +        mangler.markCharAcceptable(']');
> +        mangler.markCharAcceptable('(');
> +        mangler.markCharAcceptable(')');
> +        mangler.markCharAcceptable('-');
> +        mangler.markCharAcceptable('+');
> +        mangler.markCharAcceptable(' ');
>
>         // add functions
>         for (Module::iterator f = _module->begin(); f != _module->end(); ++f) {
>
> Modified: llvm/trunk/tools/lto/LTOModule.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.h?rev=72700&r1=72699&r2=72700&view=diff
>
> ==============================================================================
> --- llvm/trunk/tools/lto/LTOModule.h (original)
> +++ llvm/trunk/tools/lto/LTOModule.h Mon Jun  1 15:33:09 2009
> @@ -77,13 +77,19 @@
>     void                    addDefinedDataSymbol(llvm::GlobalValue* v,
>                                                          llvm::Mangler &mangler);
>     void                    addAsmGlobalSymbol(const char *);
> +    void                    addObjCClass(llvm::GlobalVariable* clgv);
> +    void                    addObjCCategory(llvm::GlobalVariable* clgv);
> +    void                    addObjCClassRef(llvm::GlobalVariable* clgv);
> +    bool                    objcClassNameFromExpression(llvm::Constant* c,
> +                                                    std::string& name);
> +
>     static bool             isTargetMatch(llvm::MemoryBuffer* memBuffer,
>                                                     const char* triplePrefix);
> -
> +
>     static LTOModule*       makeLTOModule(llvm::MemoryBuffer* buffer,
>                                                         std::string& errMsg);
>     static llvm::MemoryBuffer* makeBuffer(const void* mem, size_t length);
> -
> +
>     typedef llvm::StringMap<uint8_t> StringSet;
>
>     struct NameAndAttributes {
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
