[llvm-commits] [llvm] r72700 - in /llvm/trunk/tools/lto: LTOModule.cpp LTOModule.h

Nick Lewycky nlewycky at google.com
Mon Jun 1 16:20:20 PDT 2009


2009/6/1 Nick Kledzik <kledzik at apple.com>

>
> On Jun 1, 2009, at 2:11 PM, Nick Lewycky wrote:
>
> 2009/6/1 Nick Kledzik <kledzik at apple.com>
>
>> Author: kledzik
>> Date: Mon Jun  1 15:33:09 2009
>> New Revision: 72700
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=72700&view=rev
>> Log:
>> <rdar://problem/6927148> libLTO needs to handle i386 magic objc class
>> symbols
>> Parse __OBJC data structures and synthesize magic .objc_ symbols.
>> Also, alter mangler so that objc method names are readable.
>
>
> Hi Nick, could you please elaborate on why exactly libLTO needs to change?
> While I don't doubt that this fixes a bug, I can't see how this
> language-specific logic could possibly belong in libLTO.
>
> It is not just language specific.  It is only for Objective-C for ppc or
> i386 on Darwin.  My thinking was that because the extra logic is only
> executed if the GlobalVariable has a custom section with a specific name,
> it would not impact other languages or architectures.
>

That's a good policy.


> The issue is that the old ObjC object format did some
> strange contortions to avoid real linker symbols.  For instance, the ObjC
> class data structure is allocated statically in the executable that defines
> it.  That data structure contains a pointer to its superclass.  But instead
> of just initializing that part of the struct to the address of its
> superclass, and letting the static and dynamic linkers do the rest, the
> runtime works by having that field point to a C-string that is the name of
> the superclass on disk.  At runtime the objc initialization swaps that
> pointer out to point to the actual superclass.  As far as the linkers know,
> it is just a pointer to a string.  But someone wanted the linker to issue
> errors at build time if the superclass was not found.  So they figured out a
> way in mach-o object format to use an absolute symbol (.objc_class_name_Foo
> = 0) and a floating reference (.reference .objc_class_name_Bar) to trick
> the linker into erroring when a class was missing.  This patch emulates
> that same behavior.
>

Tricky!
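
For anyone following along, the trick described above can be modeled in a few lines. This is only a sketch under stated assumptions, not the Darwin linker's code: `ObjectSymbols`, `classSymbol` and `missingSymbols` are hypothetical names, and each input is reduced to the set of `.objc_class_name_*` symbols it defines (the absolute symbols) and the set it references (the floating references).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one input's symbol table; not a libLTO or ld type.
struct ObjectSymbols {
    std::set<std::string> defined;     // e.g. ".objc_class_name_Foo" (absolute, value 0)
    std::set<std::string> referenced;  // e.g. ".objc_class_name_Bar" (floating reference)
};

// Builds the magic symbol name the old i386/ppc ObjC format uses for a class.
inline std::string classSymbol(const std::string& className) {
    assert(!className.empty() && "class name must not be empty");
    return ".objc_class_name_" + className;
}

// Returns every referenced symbol that no input defines.  A linker would
// report each of these as an undefined symbol at build time.
std::vector<std::string> missingSymbols(const std::vector<ObjectSymbols>& inputs) {
    std::set<std::string> allDefined;
    for (const ObjectSymbols& obj : inputs)
        allDefined.insert(obj.defined.begin(), obj.defined.end());

    std::set<std::string> missing;
    for (const ObjectSymbols& obj : inputs)
        for (const std::string& ref : obj.referenced)
            if (allDefined.count(ref) == 0)
                missing.insert(ref);

    return std::vector<std::string>(missing.begin(), missing.end());
}
```

With a single input that defines Foo and references Bar, `missingSymbols` reports `.objc_class_name_Bar`; adding an input that defines Bar makes the list empty, which is the build-time check the symbols exist to provide.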

I haven't thought it through fully but it sounds like you may be able to
simulate some of this with the private linkage type which is described in
LangRef as "This doesn't show up in any symbol table in the object file."

At the very least, I'd appreciate a comment block in libLTO explaining why
this stuff is there. Pretty much the same thing you wrote in the paragraph
above.

> In addition, when processing mach-o files, the current Darwin linker ignores
> those absolute and floating reference symbols, and instead parses the
> __OBJC,__class data structures and infers those symbols.  The libLTO.dylib
> interface does not allow the linker to view the contents of a bitcode file,
> just its symbols.  So having libLTO synthesize those symbols fits well with
> the Darwin linker.
>
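
The synthesis step described in the quoted paragraph above might look roughly like the following. It is a simplified sketch, not the patch itself: `ClassRecord`, `SymbolKind` and `synthesizeClassSymbols` are hypothetical, the record stands in for the LLVM constants the real code walks, the section string is only an example value, and one map replaces the separate `_defines` and `_undefines` tables.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the fields read out of an __OBJC,__class
// initializer; the real patch inspects LLVM Constant objects instead.
struct ClassRecord {
    std::string section;         // e.g. "__OBJC,__class,regular,no_dead_strip"
    std::string superclassName;  // second slot of the struct (empty for a root class)
    std::string className;       // third slot of the struct
};

enum class SymbolKind { Defined, Undefined };

// True if the section string starts with the old-format class section name.
bool isObjCClassSection(const std::string& section) {
    const std::string prefix = "__OBJC,__class,";
    return section.compare(0, prefix.size(), prefix) == 0;
}

// Adds the synthesized symbols for one class record to `table`.
void synthesizeClassSymbols(const ClassRecord& rec,
                            std::map<std::string, SymbolKind>& table) {
    if (!isObjCClassSection(rec.section))
        return;  // other data is left alone
    assert(!rec.className.empty() && "class record needs a name");

    // The class itself is defined by this module; this overrides any
    // earlier undefined entry for the same name.
    table[".objc_class_name_" + rec.className] = SymbolKind::Defined;

    // The superclass must come from somewhere; emplace() keeps an existing
    // entry, so a class already known as defined is not downgraded.
    if (!rec.superclassName.empty())
        table.emplace(".objc_class_name_" + rec.superclassName,
                      SymbolKind::Undefined);
}
```

Because only the section prefix triggers any of this, data in other sections never produces extra symbols, which matches the intent stated earlier in the thread.
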
> Would you prefer if the changes were wrapped in an isTargetDarwin() test?
>

It doesn't much matter because nobody else will call these new add* methods.
(Right?) The mangler change has me a little concerned but I can wrap that in
an isTargetDarwin() myself if I find that it is responsible for breaking
things on me.
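
As a rough illustration of what wrapping the mangler change in a Darwin check could look like, here is a sketch that decides which extra characters to accept from the target triple alone. `isDarwinTriple` and `extraAcceptableChars` are hypothetical helpers, not existing LLVM functions, and the triple test is a plain substring match chosen for brevity.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if a target triple such as
// "i386-apple-darwin9" names a Darwin target.
bool isDarwinTriple(const std::string& triple) {
    return triple.find("-darwin") != std::string::npos;
}

// Returns the extra characters the patch marks acceptable so that ObjC
// method names such as "-[Foo bar]" are not mangled.  For non-Darwin
// targets nothing extra is accepted.
std::string extraAcceptableChars(const std::string& triple) {
    assert(!triple.empty() && "target triple must be known");
    return isDarwinTriple(triple) ? "[]()-+ " : "";
}
```

A caller would loop over the returned string and call `mangler.markCharAcceptable(c)` for each character, leaving the mangler untouched on other platforms.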

Also, thanks for taking the time to write up the rationale!

Nick


>
> -Nick
>
>
>
>
>
>
>>
>>
>> Modified:
>>    llvm/trunk/tools/lto/LTOModule.cpp
>>    llvm/trunk/tools/lto/LTOModule.h
>>
>> Modified: llvm/trunk/tools/lto/LTOModule.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.cpp?rev=72700&r1=72699&r2=72700&view=diff
>>
>>
>> ==============================================================================
>> --- llvm/trunk/tools/lto/LTOModule.cpp (original)
>> +++ llvm/trunk/tools/lto/LTOModule.cpp Mon Jun  1 15:33:09 2009
>> @@ -14,6 +14,7 @@
>>
>>  #include "LTOModule.h"
>>
>> +#include "llvm/Constants.h"
>>  #include "llvm/Module.h"
>>  #include "llvm/ModuleProvider.h"
>>  #include "llvm/ADT/OwningPtr.h"
>> @@ -176,11 +177,123 @@
>>     }
>>  }
>>
>> -void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler &mangler)
>> +// get string that data pointer points to
>> +bool LTOModule::objcClassNameFromExpression(Constant* c, std::string&
>> name)
>> +{
>> +    if (ConstantExpr* ce = dyn_cast<ConstantExpr>(c)) {
>
>
> Is there any reason not to just assert on this? Will it ever really be
> called on constants that aren't known to be objcClassNames?
>
>
>>
>> +        Constant* op = ce->getOperand(0);
>> +        if (GlobalVariable* gvn = dyn_cast<GlobalVariable>(op)) {
>> +            Constant* cn = gvn->getInitializer();
>> +            if (ConstantArray* ca = dyn_cast<ConstantArray>(cn)) {
>> +                if ( ca->isCString() ) {
>> +                    name = ".objc_class_name_" + ca->getAsString();
>> +                    return true;
>> +                }
>> +            }
>> +        }
>> +    }
>> +    return false;
>> +}
>> +
>> +// parse i386/ppc ObjC class data structure
>> +void LTOModule::addObjCClass(GlobalVariable* clgv)
>> +{
>> +    if (ConstantStruct* c =
>> dyn_cast<ConstantStruct>(clgv->getInitializer())) {
>
>
> Here again, perhaps it should just assert that this is true?
>
>
>>
>> +        // second slot in __OBJC,__class is pointer to superclass name
>> +        std::string superclassName;
>> +        if ( objcClassNameFromExpression(c->getOperand(1),
>> superclassName) ) {
>> +            NameAndAttributes info;
>> +            if ( _undefines.find(superclassName.c_str()) ==
>> _undefines.end() ) {
>> +                const char* symbolName =
>> ::strdup(superclassName.c_str());
>> +                info.name = ::strdup(symbolName);
>> +                info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> +                // string is owned by _undefines
>> +                _undefines[info.name] = info;
>
>
> Perhaps the non-libLTO side of the Apple linker should ignore symbols it
> knows are objC specific? Or perhaps you could run a domain-specific LLVM
> pass which would delete these globals as needed? If somehow you need them to
> exist but still be marked undefined, perhaps libLTO should expose an
> interface for marking symbols undefined instead of an interface specific to
> Apple's implementation of the objC language?
>
> It just seems like this is breaking encapsulation; the linker has linking
> rules that don't much care what the source language was.
>
> Nick
>
>
>>
>> +            }
>> +        }
>> +        // third slot in __OBJC,__class is pointer to class name
>> +        std::string className;
>> +         if ( objcClassNameFromExpression(c->getOperand(2), className) )
>> {
>> +            const char* symbolName = ::strdup(className.c_str());
>> +            NameAndAttributes info;
>> +            info.name = symbolName;
>> +            info.attributes = (lto_symbol_attributes)
>> +                (LTO_SYMBOL_PERMISSIONS_DATA |
>> +                 LTO_SYMBOL_DEFINITION_REGULAR |
>> +                 LTO_SYMBOL_SCOPE_DEFAULT);
>> +            _symbols.push_back(info);
>> +            _defines[info.name] = 1;
>> +         }
>> +    }
>> +}
>> +
>> +
>> +// parse i386/ppc ObjC category data structure
>> +void LTOModule::addObjCCategory(GlobalVariable* clgv)
>> +{
>> +    if (ConstantStruct* c =
>> dyn_cast<ConstantStruct>(clgv->getInitializer())) {
>> +        // second slot in __OBJC,__category is pointer to target class
>> name
>> +        std::string targetclassName;
>> +        if ( objcClassNameFromExpression(c->getOperand(1),
>> targetclassName) ) {
>> +            NameAndAttributes info;
>> +            if ( _undefines.find(targetclassName.c_str()) ==
>> _undefines.end() ){
>> +                const char* symbolName =
>> ::strdup(targetclassName.c_str());
>> +                info.name = ::strdup(symbolName);
>> +                info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> +                // string is owned by _undefines
>> +               _undefines[info.name] = info;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +
>> +// parse i386/ppc ObjC class list data structure
>> +void LTOModule::addObjCClassRef(GlobalVariable* clgv)
>> +{
>> +    std::string targetclassName;
>> +    if ( objcClassNameFromExpression(clgv->getInitializer(),
>> targetclassName) ){
>> +        NameAndAttributes info;
>> +        if ( _undefines.find(targetclassName.c_str()) == _undefines.end()
>> ) {
>> +            const char* symbolName = ::strdup(targetclassName.c_str());
>> +            info.name = ::strdup(symbolName);
>> +            info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> +            // string is owned by _undefines
>> +            _undefines[info.name] = info;
>> +        }
>> +    }
>> +}
>> +
>> +
>> +void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler& mangler)
>>  {
>>     // add to list of defined symbols
>>     addDefinedSymbol(v, mangler, false);
>>
>> +    // special case i386/ppc ObjC data structures in magic sections
>> +    if ( v->hasSection() ) {
>> +        // special case if this data blob is an ObjC class definition
>> +        if ( v->getSection().compare(0, 15, "__OBJC,__class,") == 0 ) {
>> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> +                addObjCClass(gv);
>> +            }
>> +        }
>> +
>> +        // special case if this data blob is an ObjC category definition
>> +        else if ( v->getSection().compare(0, 18, "__OBJC,__category,") ==
>> 0 ) {
>> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> +                addObjCCategory(gv);
>> +            }
>> +        }
>> +
>> +        // special case if this data blob is the list of referenced
>> classes
>> +        else if ( v->getSection().compare(0, 18, "__OBJC,__cls_refs,") ==
>> 0 ) {
>> +            if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> +                addObjCClassRef(gv);
>> +            }
>> +        }
>> +    }
>> +
>>     // add external symbols referenced by this data.
>>     for (unsigned count = 0, total = v->getNumOperands();
>>                                                 count != total; ++count) {
>> @@ -192,9 +305,13 @@
>>  void LTOModule::addDefinedSymbol(GlobalValue* def, Mangler &mangler,
>>                                 bool isFunction)
>>  {
>> +    // ignore all llvm.* symbols
>> +    if ( strncmp(def->getNameStart(), "llvm.", 5) == 0 )
>> +        return;
>> +
>>     // string is owned by _defines
>>     const char* symbolName = ::strdup(mangler.getValueName(def).c_str());
>> -
>> +
>>     // set alignment part log2() can have rounding errors
>>     uint32_t align = def->getAlignment();
>>     uint32_t attr = align ? CountTrailingZeros_32(def->getAlignment()) :
>> 0;
>> @@ -241,25 +358,28 @@
>>  }
>>
>>  void LTOModule::addAsmGlobalSymbol(const char *name) {
>> -  // string is owned by _defines
>> -  const char *symbolName = ::strdup(name);
>> -  uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
>> -  attr |= LTO_SYMBOL_SCOPE_DEFAULT;
>> -
>> -  // add to table of symbols
>> -  NameAndAttributes info;
>> -  info.name = symbolName;
>> -  info.attributes = (lto_symbol_attributes)attr;
>> -  _symbols.push_back(info);
>> -  _defines[info.name] = 1;
>> +    // only add new define if not already defined
>> +    if ( _defines.count(name, &name[strlen(name)+1]) == 0 )
>> +        return;
>> +
>> +    // string is owned by _defines
>> +    const char *symbolName = ::strdup(name);
>> +    uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
>> +    attr |= LTO_SYMBOL_SCOPE_DEFAULT;
>> +    NameAndAttributes info;
>> +    info.name = symbolName;
>> +    info.attributes = (lto_symbol_attributes)attr;
>> +    _symbols.push_back(info);
>> +    _defines[info.name] = 1;
>>  }
>>
>>  void LTOModule::addPotentialUndefinedSymbol(GlobalValue* decl, Mangler
>> &mangler)
>>  {
>> -   const char* name = mangler.getValueName(decl).c_str();
>>     // ignore all llvm.* symbols
>> -    if ( strncmp(name, "llvm.", 5) == 0 )
>> -      return;
>> +    if ( strncmp(decl->getNameStart(), "llvm.", 5) == 0 )
>> +        return;
>> +
>> +    const char* name = mangler.getValueName(decl).c_str();
>>
>>     // we already have the symbol
>>     if (_undefines.find(name) != _undefines.end())
>> @@ -306,6 +426,14 @@
>>
>>         // Use mangler to add GlobalPrefix to names to match linker names.
>>         Mangler mangler(*_module,
>> _target->getTargetAsmInfo()->getGlobalPrefix());
>> +        // add chars used in ObjC method names so method names aren't
>> mangled
>> +        mangler.markCharAcceptable('[');
>> +        mangler.markCharAcceptable(']');
>> +        mangler.markCharAcceptable('(');
>> +        mangler.markCharAcceptable(')');
>> +        mangler.markCharAcceptable('-');
>> +        mangler.markCharAcceptable('+');
>> +        mangler.markCharAcceptable(' ');
>>
>>         // add functions
>>         for (Module::iterator f = _module->begin(); f != _module->end();
>> ++f) {
>>
>> Modified: llvm/trunk/tools/lto/LTOModule.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.h?rev=72700&r1=72699&r2=72700&view=diff
>>
>>
>> ==============================================================================
>> --- llvm/trunk/tools/lto/LTOModule.h (original)
>> +++ llvm/trunk/tools/lto/LTOModule.h Mon Jun  1 15:33:09 2009
>> @@ -77,13 +77,19 @@
>>     void                    addDefinedDataSymbol(llvm::GlobalValue* v,
>>                                                         llvm::Mangler
>> &mangler);
>>     void                    addAsmGlobalSymbol(const char *);
>> +    void                    addObjCClass(llvm::GlobalVariable* clgv);
>> +    void                    addObjCCategory(llvm::GlobalVariable* clgv);
>> +    void                    addObjCClassRef(llvm::GlobalVariable* clgv);
>> +    bool                    objcClassNameFromExpression(llvm::Constant*
>> c,
>> +                                                    std::string& name);
>> +
>>     static bool             isTargetMatch(llvm::MemoryBuffer* memBuffer,
>>                                                     const char*
>> triplePrefix);
>> -
>> +
>>     static LTOModule*       makeLTOModule(llvm::MemoryBuffer* buffer,
>>                                                         std::string&
>> errMsg);
>>     static llvm::MemoryBuffer* makeBuffer(const void* mem, size_t length);
>> -
>> +
>>     typedef llvm::StringMap<uint8_t> StringSet;
>>
>>     struct NameAndAttributes {
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
>