[llvm-commits] [llvm] r72700 - in /llvm/trunk/tools/lto: LTOModule.cpp LTOModule.h

Nick Kledzik kledzik at apple.com
Mon Jun 1 16:46:15 PDT 2009


On Jun 1, 2009, at 4:20 PM, Nick Lewycky wrote:

> 2009/6/1 Nick Kledzik <kledzik at apple.com>
>
> On Jun 1, 2009, at 2:11 PM, Nick Lewycky wrote:
>
>> 2009/6/1 Nick Kledzik <kledzik at apple.com>
>> Author: kledzik
>> Date: Mon Jun  1 15:33:09 2009
>> New Revision: 72700
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=72700&view=rev
>> Log:
>> <rdar://problem/6927148> libLTO needs to handle i386 magic objc  
>> class symbols
>> Parse __OBJC data structures and synthesize magic .objc_ symbols.
>> Also, alter mangler so that objc method names are readable.
>>
>> Hi Nick, could you please elaborate on why exactly libLTO needs to  
>> change? While I don't doubt that this fixes a bug, I can't see how  
>> this language-specific logic could possibly belong in libLTO.
>
> It is not just language specific.  It is only for Objective-C for
> ppc or i386 on Darwin.  My thinking was that, because the extra logic
> is only executed if the GlobalVariable has a custom section with a
> specific name, it would not impact other languages or
> architectures.
>
> That's a good policy.
>
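To make that gate concrete, here is a minimal standalone sketch of the
section-name test.  It is not the libLTO code itself: the enum and the
function names (ObjCSectionKind, startsWith, classifySection) are made
up for illustration, and only the three section prefixes come from the
patch.

```cpp
#include <string>

// Hypothetical classification of a global's section string.
enum ObjCSectionKind { NotObjC, ObjCClass, ObjCCategory, ObjCClassRefs };

// True if `section` begins with `prefix`.  A section shorter than the
// prefix simply compares unequal.
static bool startsWith(const std::string& section, const std::string& prefix) {
    return section.compare(0, prefix.size(), prefix) == 0;
}

// Only the three old-ABI ObjC sections get special handling; everything
// else falls through untouched.
ObjCSectionKind classifySection(const std::string& section) {
    if (startsWith(section, "__OBJC,__class,"))    return ObjCClass;
    if (startsWith(section, "__OBJC,__category,")) return ObjCCategory;
    if (startsWith(section, "__OBJC,__cls_refs,")) return ObjCClassRefs;
    return NotObjC;
}
```

Note that the trailing comma in each prefix matters: a section such as
"__OBJC,__class_vars,..." does not match "__OBJC,__class,", so it is
left alone.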
> The issue is that the old ObjC object format did some strange
> contortions to avoid real linker symbols.  For instance, the ObjC
> class data structure is allocated statically in the executable that
> defines it.  That data structure contains a pointer to its
> superclass.  But instead of just initializing that part of the
> struct to the address of its superclass, and letting the static and
> dynamic linkers do the rest, the runtime works by having that field
> point to a C-string that is the name of the superclass on disk.  At
> runtime the objc initialization swaps that pointer out to point to
> the actual superclass.  As far as the linkers know it is just a
> pointer to a string.  But someone wanted the linker to issue errors
> at build time if the superclass was not found.  So they figured out
> a way in the mach-o object format to use an absolute symbol
> (.objc_class_name_Foo = 0) and a floating reference
> (.reference .objc_class_name_Bar) to trick the linker into erroring
> when a class was missing.  This patch emulates that same behavior.
>
> Tricky!
>
> I haven't thought it through fully but it sounds like you may be  
> able to simulate some of this with the private linkage type which is  
> described in LangRef as "This doesn't show up in any symbol table in  
> the object file."
We might be able to have the front end name the class data structure  
with a .objc_class_name name, but the real problem is the contents of  
the various data structures that need to both point to a string and  
(for error checking and archive loading) reference a .objc_class_name  
symbol.
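
To illustrate what each data structure has to contribute, here is a
rough, self-contained sketch of the idea.  It models the symbol
bookkeeping only, with plain std::set instead of the real libLTO
tables; ClassRecord, SymbolTable, synthesize and missingSymbols are
hypothetical names, and the "link" step is a toy check rather than
what ld actually does.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one __OBJC,__class record: the two names
// that the old runtime stores as pointers to C-strings.
struct ClassRecord {
    std::string className;
    std::string superclassName;  // empty for a root class
};

struct SymbolTable {
    std::set<std::string> defines;
    std::set<std::string> undefines;
};

// Synthesize the magic symbols: a class defines .objc_class_name_<self>
// and references .objc_class_name_<super>.
SymbolTable synthesize(const std::vector<ClassRecord>& classes) {
    const std::string prefix = ".objc_class_name_";
    SymbolTable table;
    for (std::vector<ClassRecord>::size_type i = 0; i < classes.size(); ++i) {
        table.defines.insert(prefix + classes[i].className);
        if (!classes[i].superclassName.empty())
            table.undefines.insert(prefix + classes[i].superclassName);
    }
    return table;
}

// Toy link step: report every referenced name that no unit defines.
std::vector<std::string> missingSymbols(const std::vector<SymbolTable>& units) {
    std::set<std::string> allDefines;
    for (std::vector<SymbolTable>::size_type i = 0; i < units.size(); ++i)
        allDefines.insert(units[i].defines.begin(), units[i].defines.end());

    std::vector<std::string> missing;
    for (std::vector<SymbolTable>::size_type i = 0; i < units.size(); ++i) {
        const std::set<std::string>& undefs = units[i].undefines;
        for (std::set<std::string>::const_iterator it = undefs.begin();
             it != undefs.end(); ++it) {
            if (allDefines.count(*it) == 0)
                missing.push_back(*it);
        }
    }
    return missing;
}
```

With a unit that defines Foo (superclass Bar) and nothing else,
missingSymbols reports .objc_class_name_Bar; once a unit defining Bar
is added, the list is empty.  That is the build-time error the
absolute symbol and floating reference were invented to produce.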

>
>
> At the very least, I'd appreciate a comment block in libLTO  
> explaining why this stuff is there. Pretty much the same thing you  
> wrote in the paragraph above.
I just committed that as a comment.   Thanks for nudging me to do that.

-Nick

>
>
> In addition, when processing mach-o files, the current Darwin linker  
> ignores those absolute and floating reference symbols, and instead  
> parses the __OBJC,__class data structures and infers those symbols.   
> The libLTO.dylib interface does not allow the linker to view the  
> contents of a bitcode file, just its symbols.  So having libLTO  
> synthesize those symbols fits well with the Darwin linker.
>
> Would you prefer if the changes were wrapped in an isTargetDarwin()  
> test?
>
> It doesn't much matter because nobody else will call these new add*  
> methods. (Right?) The mangler change has me a little concerned but I  
> can wrap that in an isTargetDarwin() myself if I find that it is  
> responsible for breaking things on me.
>
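On the mangler concern: the reason for the change is that characters
the mangler does not consider acceptable get rewritten, which makes
ObjC method names unreadable.  The sketch below is a toy model, not
llvm::Mangler.  ToyMangler is a hypothetical class, and both the
default acceptable set (letters, digits, '_', '$', '.') and the _XX_
hex escape are assumptions made for the illustration; only
markCharAcceptable() and the seven characters are taken from the
patch.

```cpp
#include <cctype>
#include <set>
#include <string>

// Toy model of a name mangler with an extensible acceptable-character set.
class ToyMangler {
public:
    // Allow `c` to pass through unescaped from now on.
    void markCharAcceptable(char c) { extra_.insert(c); }

    // Copy acceptable characters; escape the rest as _XX_ (hex, assumed).
    std::string mangle(const std::string& name) const {
        static const char hex[] = "0123456789ABCDEF";
        std::string out;
        for (std::string::size_type i = 0; i < name.size(); ++i) {
            unsigned char c = static_cast<unsigned char>(name[i]);
            if (isAcceptable(c)) {
                out += static_cast<char>(c);
            } else {
                out += '_';
                out += hex[c >> 4];
                out += hex[c & 15];
                out += '_';
            }
        }
        return out;
    }

private:
    bool isAcceptable(unsigned char c) const {
        return std::isalnum(c) != 0 || c == '_' || c == '$' || c == '.' ||
               extra_.count(static_cast<char>(c)) != 0;
    }

    std::set<char> extra_;
};
```

Under those assumptions "-[Foo bar]" comes out as
"_2D__5B_Foo_20_bar_5D_" by default and as "-[Foo bar]" once the seven
characters are marked acceptable.  Plain C identifiers are unaffected
either way.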
> Also, thanks for taking the time to write up the rationale!
>
> Nick
>
>
> -Nick
>
>
>
>>
>>
>>
>>
>> Modified:
>>    llvm/trunk/tools/lto/LTOModule.cpp
>>    llvm/trunk/tools/lto/LTOModule.h
>>
>> Modified: llvm/trunk/tools/lto/LTOModule.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.cpp?rev=72700&r1=72699&r2=72700&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/tools/lto/LTOModule.cpp (original)
>> +++ llvm/trunk/tools/lto/LTOModule.cpp Mon Jun  1 15:33:09 2009
>> @@ -14,6 +14,7 @@
>>
>>  #include "LTOModule.h"
>>
>> +#include "llvm/Constants.h"
>>  #include "llvm/Module.h"
>>  #include "llvm/ModuleProvider.h"
>>  #include "llvm/ADT/OwningPtr.h"
>> @@ -176,11 +177,123 @@
>>     }
>>  }
>>
>> -void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler &mangler)
>> +// get string that data pointer points to
>> +bool LTOModule::objcClassNameFromExpression(Constant* c, std::string& name)
>> +{
>> +    if (ConstantExpr* ce = dyn_cast<ConstantExpr>(c)) {
>>
>> Is there any reason not to just assert on this? Will it ever really  
>> be called on constants that aren't known to be objcClassNames?
>>
>>
>> +        Constant* op = ce->getOperand(0);
>> +        if (GlobalVariable* gvn = dyn_cast<GlobalVariable>(op)) {
>> +            Constant* cn = gvn->getInitializer();
>> +            if (ConstantArray* ca = dyn_cast<ConstantArray>(cn)) {
>> +                if ( ca->isCString() ) {
>> +                    name = ".objc_class_name_" + ca->getAsString();
>> +                    return true;
>> +                }
>> +            }
>> +        }
>> +    }
>> +    return false;
>> +}
>> +
>> +// parse i386/ppc ObjC class data structure
>> +void LTOModule::addObjCClass(GlobalVariable* clgv)
>> +{
>> +    if (ConstantStruct* c = dyn_cast<ConstantStruct>(clgv->getInitializer())) {
>>
>> Here again, perhaps it should just assert that this is true?
>>
>>
>> +        // second slot in __OBJC,__class is pointer to superclass name
>> +        std::string superclassName;
>> +        if ( objcClassNameFromExpression(c->getOperand(1), superclassName) ) {
>> +            NameAndAttributes info;
>> +            if ( _undefines.find(superclassName.c_str()) == _undefines.end() ) {
>> +                const char* symbolName = ::strdup(superclassName.c_str());
>> +                info.name = ::strdup(symbolName);
>> +                info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> +                // string is owned by _undefines
>> +                _undefines[info.name] = info;
>>
>> Perhaps the non-libLTO side of the Apple linker should ignore  
>> symbols it knows are objC specific? Or perhaps you could run a  
>> domain-specific LLVM pass which would delete these globals as  
>> needed? If somehow you need them to exist but still be marked  
>> undefined, perhaps libLTO should expose an interface for marking  
>> symbols undefined instead of an interface specific to Apple's  
>> implementation of the objC language?
>>
>> It just seems like this is breaking encapsulation, the linker has  
>> linking rules that don't much care what the source language was.
>>
>> Nick
>>
>>
>> +            }
>> +        }
>> +        // third slot in __OBJC,__class is pointer to class name
>> +        std::string className;
>> +         if ( objcClassNameFromExpression(c->getOperand(2), className) ) {
>> +            const char* symbolName = ::strdup(className.c_str());
>> +            NameAndAttributes info;
>> +            info.name = symbolName;
>> +            info.attributes = (lto_symbol_attributes)
>> +                (LTO_SYMBOL_PERMISSIONS_DATA |
>> +                 LTO_SYMBOL_DEFINITION_REGULAR |
>> +                 LTO_SYMBOL_SCOPE_DEFAULT);
>> +            _symbols.push_back(info);
>> +            _defines[info.name] = 1;
>> +         }
>> +    }
>> +}
>> +
>> +
>> +// parse i386/ppc ObjC category data structure
>> +void LTOModule::addObjCCategory(GlobalVariable* clgv)
>> +{
>> +    if (ConstantStruct* c = dyn_cast<ConstantStruct>(clgv->getInitializer())) {
>> +        // second slot in __OBJC,__category is pointer to target class name
>> +        std::string targetclassName;
>> +        if ( objcClassNameFromExpression(c->getOperand(1), targetclassName) ) {
>> +            NameAndAttributes info;
>> +            if ( _undefines.find(targetclassName.c_str()) == _undefines.end() ){
>> +                const char* symbolName = ::strdup(targetclassName.c_str());
>> +                info.name = ::strdup(symbolName);
>> +                info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> +                // string is owned by _undefines
>> +               _undefines[info.name] = info;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +
>> +// parse i386/ppc ObjC class list data structure
>> +void LTOModule::addObjCClassRef(GlobalVariable* clgv)
>> +{
>> +    std::string targetclassName;
>> +    if ( objcClassNameFromExpression(clgv->getInitializer(), targetclassName) ){
>> +        NameAndAttributes info;
>> +        if ( _undefines.find(targetclassName.c_str()) == _undefines.end() ) {
>> +            const char* symbolName = ::strdup(targetclassName.c_str());
>> +            info.name = ::strdup(symbolName);
>> +            info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> +            // string is owned by _undefines
>> +            _undefines[info.name] = info;
>> +        }
>> +    }
>> +}
>> +
>> +
>> +void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler& mangler)
>>  {
>>     // add to list of defined symbols
>>     addDefinedSymbol(v, mangler, false);
>>
>> +    // special case i386/ppc ObjC data structures in magic sections
>> +    if ( v->hasSection() ) {
>> +        // special case if this data blob is an ObjC class definition
>> +        if ( v->getSection().compare(0, 15, "__OBJC,__class,") == 0 ) {
>> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> +                addObjCClass(gv);
>> +            }
>> +        }
>> +
>> +        // special case if this data blob is an ObjC category definition
>> +        else if ( v->getSection().compare(0, 18, "__OBJC,__category,") == 0 ) {
>> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> +                addObjCCategory(gv);
>> +            }
>> +        }
>> +
>> +        // special case if this data blob is the list of referenced classes
>> +        else if ( v->getSection().compare(0, 18, "__OBJC,__cls_refs,") == 0 ) {
>> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> +                addObjCClassRef(gv);
>> +            }
>> +        }
>> +    }
>> +
>>     // add external symbols referenced by this data.
>>     for (unsigned count = 0, total = v->getNumOperands();
>>                                                 count != total; ++count) {
>> @@ -192,9 +305,13 @@
>>  void LTOModule::addDefinedSymbol(GlobalValue* def, Mangler &mangler,
>>                                 bool isFunction)
>>  {
>> +    // ignore all llvm.* symbols
>> +    if ( strncmp(def->getNameStart(), "llvm.", 5) == 0 )
>> +        return;
>> +
>>     // string is owned by _defines
>>     const char* symbolName = ::strdup(mangler.getValueName(def).c_str());
>> -
>> +
>>     // set alignment part log2() can have rounding errors
>>     uint32_t align = def->getAlignment();
>>     uint32_t attr = align ? CountTrailingZeros_32(def->getAlignment()) : 0;
>> @@ -241,25 +358,28 @@
>>  }
>>
>>  void LTOModule::addAsmGlobalSymbol(const char *name) {
>> -  // string is owned by _defines
>> -  const char *symbolName = ::strdup(name);
>> -  uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
>> -  attr |= LTO_SYMBOL_SCOPE_DEFAULT;
>> -
>> -  // add to table of symbols
>> -  NameAndAttributes info;
>> -  info.name = symbolName;
>> -  info.attributes = (lto_symbol_attributes)attr;
>> -  _symbols.push_back(info);
>> -  _defines[info.name] = 1;
>> +    // only add new define if not already defined
>> +    if ( _defines.count(name, &name[strlen(name)+1]) == 0 )
>> +        return;
>> +
>> +    // string is owned by _defines
>> +    const char *symbolName = ::strdup(name);
>> +    uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
>> +    attr |= LTO_SYMBOL_SCOPE_DEFAULT;
>> +    NameAndAttributes info;
>> +    info.name = symbolName;
>> +    info.attributes = (lto_symbol_attributes)attr;
>> +    _symbols.push_back(info);
>> +    _defines[info.name] = 1;
>>  }
>>
>>  void LTOModule::addPotentialUndefinedSymbol(GlobalValue* decl, Mangler &mangler)
>>  {
>> -   const char* name = mangler.getValueName(decl).c_str();
>>     // ignore all llvm.* symbols
>> -    if ( strncmp(name, "llvm.", 5) == 0 )
>> -      return;
>> +    if ( strncmp(decl->getNameStart(), "llvm.", 5) == 0 )
>> +        return;
>> +
>> +    const char* name = mangler.getValueName(decl).c_str();
>>
>>     // we already have the symbol
>>     if (_undefines.find(name) != _undefines.end())
>> @@ -306,6 +426,14 @@
>>
>>         // Use mangler to add GlobalPrefix to names to match linker names.
>>         Mangler mangler(*_module, _target->getTargetAsmInfo()->getGlobalPrefix());
>> +        // add chars used in ObjC method names so method names aren't mangled
>> +        mangler.markCharAcceptable('[');
>> +        mangler.markCharAcceptable(']');
>> +        mangler.markCharAcceptable('(');
>> +        mangler.markCharAcceptable(')');
>> +        mangler.markCharAcceptable('-');
>> +        mangler.markCharAcceptable('+');
>> +        mangler.markCharAcceptable(' ');
>>
>>         // add functions
>>         for (Module::iterator f = _module->begin(); f != _module->end(); ++f) {
>>
>> Modified: llvm/trunk/tools/lto/LTOModule.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.h?rev=72700&r1=72699&r2=72700&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/tools/lto/LTOModule.h (original)
>> +++ llvm/trunk/tools/lto/LTOModule.h Mon Jun  1 15:33:09 2009
>> @@ -77,13 +77,19 @@
>>     void                    addDefinedDataSymbol(llvm::GlobalValue* v,
>>                                                         llvm::Mangler &mangler);
>>     void                    addAsmGlobalSymbol(const char *);
>> +    void                    addObjCClass(llvm::GlobalVariable* clgv);
>> +    void                    addObjCCategory(llvm::GlobalVariable* clgv);
>> +    void                    addObjCClassRef(llvm::GlobalVariable* clgv);
>> +    bool                    objcClassNameFromExpression(llvm::Constant* c,
>> +                                                    std::string& name);
>> +
>>     static bool             isTargetMatch(llvm::MemoryBuffer* memBuffer,
>>                                                     const char* triplePrefix);
>> -
>> +
>>     static LTOModule*       makeLTOModule(llvm::MemoryBuffer* buffer,
>>                                                         std::string& errMsg);
>>     static llvm::MemoryBuffer* makeBuffer(const void* mem, size_t length);
>> -
>> +
>>     typedef llvm::StringMap<uint8_t> StringSet;
>>
>>     struct NameAndAttributes {
>>
>>
>>
>
>
