[llvm-commits] [llvm] r72700 - in /llvm/trunk/tools/lto: LTOModule.cpp LTOModule.h
Nick Kledzik
kledzik at apple.com
Mon Jun 1 16:46:15 PDT 2009
On Jun 1, 2009, at 4:20 PM, Nick Lewycky wrote:
> 2009/6/1 Nick Kledzik <kledzik at apple.com>
>
> On Jun 1, 2009, at 2:11 PM, Nick Lewycky wrote:
>
>> 2009/6/1 Nick Kledzik <kledzik at apple.com>
>> Author: kledzik
>> Date: Mon Jun 1 15:33:09 2009
>> New Revision: 72700
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=72700&view=rev
>> Log:
>> <rdar://problem/6927148> libLTO needs to handle i386 magic objc
>> class symbols
>> Parse __OBJC data structures and synthesize magic .objc_ symbols.
>> Also, alter mangler so that objc method names are readable.
>>
>> Hi Nick, could you please elaborate on why exactly libLTO needs to
>> change? While I don't doubt that this fixes a bug, I can't see how
>> this language-specific logic could possibly belong in libLTO.
>
> It is not just language specific. It is only for Objective-C for
> ppc or i386 on Darwin. My thinking was that because the extra logic
> is only executed if the GlobalVariable has a custom section with a
> specific name, it would not impact other languages or
> architectures.
>
> That's a good policy.
>
> The issue is that the old ObjC object format did some strange
> contortions to avoid real linker symbols. For instance the ObjC
> class data structure is allocated statically in the executable that
> defines it. That data structure contains a pointer to its
> superclass. But instead of just initializing that part of the
> struct to the address of its superclass, and letting the static and
> dynamic linkers do the rest, the runtime works by having that field
> point to a C-string that is the name of the superclass on disk. At
> runtime the objc initialization swaps that pointer out to point to
> the actual superclass. As far as the linkers know it is just a
> pointer to a string. But someone wanted the linker to issue errors
> at build time if the superclass was not found. So they figured out
> a way in mach-o object format to use an absolute symbol
> (.objc_class_name_Foo = 0) and a floating reference
> ( .reference .objc_class_name_Bar) to trick the linker into erroring
> when a class was missing. This patch emulates that same behavior.
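
To make the scheme above concrete, here is a minimal sketch of the symbol synthesis, independent of LLVM. The ClassRecord and SynthesizedSymbols types and the synthesizeClassSymbols function are hypothetical names invented for illustration; only the ".objc_class_name_" prefix comes from the description above. Treating an empty superclass name as "root class, no reference" is an assumption of the sketch, not something the patch states.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one old-format __OBJC,__class record.
// As far as the linker can tell, both fields are just pointers to C-strings.
struct ClassRecord {
    std::string superclassName;  // assumed empty for a root class
    std::string className;
};

// Hypothetical result type: names to report as defined and as undefined.
struct SynthesizedSymbols {
    std::vector<std::string> defined;
    std::vector<std::string> undefined;
};

// Derive the "magic" symbols for one class record: the class itself becomes
// a definition, and its superclass becomes an undefined reference that some
// other object file or library must satisfy.
SynthesizedSymbols synthesizeClassSymbols(const ClassRecord& rec) {
    static const std::string prefix = ".objc_class_name_";
    SynthesizedSymbols out;
    if (!rec.superclassName.empty())
        out.undefined.push_back(prefix + rec.superclassName);
    out.defined.push_back(prefix + rec.className);
    return out;
}
```

For a class Foo whose superclass is Bar, this yields a definition of .objc_class_name_Foo and an undefined reference to .objc_class_name_Bar, which is the pair the absolute symbol and floating reference would otherwise provide.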
>
> Tricky!
>
> I haven't thought it through fully but it sounds like you may be
> able to simulate some of this with the private linkage type which is
> described in LangRef as "This doesn't show up in any symbol table in
> the object file."
We might be able to have the front end name the class data structure
with a .objc_class_name name, but the real problem is the contents of
the various data structures that need to both point to a string and
(for error checking and archive loading) reference a .objc_class_name
symbol.
>
>
> At the very least, I'd appreciate a comment block in libLTO
> explaining why this stuff is there. Pretty much the same thing you
> wrote in the paragraph above.
I just committed that as a comment. Thanks for nudging me to do that.
-Nick
>
>
> In addition, when processing mach-o files, the current Darwin linker
> ignores those absolute and floating reference symbols, and instead
> parses the __OBJC,__class data structures and infers those symbols.
> The libLTO.dylib interface does not allow the linker to view the
> contents of a bitcode file, just its symbols. So having libLTO
> synthesize those symbols fits well with the Darwin linker.
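
As a rough illustration of why symbols alone are enough for the linker, the sketch below models a linker that sees only each input's defined and undefined names and reports whatever stays unresolved. ModuleSymbols and unresolvedSymbols are hypothetical names made up for this example; they are not part of libLTO or of the Darwin linker.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical view of one input file through a symbols-only interface:
// the names it defines and the names it references.
struct ModuleSymbols {
    std::vector<std::string> defined;
    std::vector<std::string> undefined;
};

// Return every referenced name that no input defines, in sorted order.
// A real linker would also search archives and dylibs before giving up.
std::vector<std::string> unresolvedSymbols(
        const std::vector<ModuleSymbols>& modules) {
    std::set<std::string> defs;
    for (const ModuleSymbols& m : modules)
        defs.insert(m.defined.begin(), m.defined.end());

    std::set<std::string> missing;
    for (const ModuleSymbols& m : modules)
        for (const std::string& u : m.undefined)
            if (defs.count(u) == 0)
                missing.insert(u);

    return std::vector<std::string>(missing.begin(), missing.end());
}
```

If a module defining .objc_class_name_Foo and referencing .objc_class_name_Bar is linked alone, the result contains .objc_class_name_Bar; adding a module that defines Bar empties it. That is the build-time missing-class error the synthesized symbols are meant to preserve.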
>
> Would you prefer if the changes were wrapped in an isTargetDarwin()
> test?
>
> It doesn't much matter because nobody else will call these new add*
> methods. (Right?) The mangler change has me a little concerned but I
> can wrap that in an isTargetDarwin() myself if I find that it is
> responsible for breaking things on me.
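
For readers unfamiliar with what the mangler change does, here is a toy model of it. ToyMangler is a hypothetical class written for this note, not the llvm::Mangler implementation; the default acceptable set and the _XX_ hex escape are assumptions of the sketch. Only the method name markCharAcceptable and the seven characters come from the patch.

```cpp
#include <set>
#include <string>

// Toy mangler: characters on the acceptable list pass through unchanged,
// everything else is escaped as _XX_ (two uppercase hex digits).
class ToyMangler {
public:
    ToyMangler() {
        // Assumed default: letters, digits, and a few punctuation marks.
        for (char c = 'a'; c <= 'z'; ++c) acceptable_.insert(c);
        for (char c = 'A'; c <= 'Z'; ++c) acceptable_.insert(c);
        for (char c = '0'; c <= '9'; ++c) acceptable_.insert(c);
        acceptable_.insert('_');
        acceptable_.insert('$');
        acceptable_.insert('.');
    }

    // Allow one more character to appear verbatim in mangled names.
    void markCharAcceptable(char c) { acceptable_.insert(c); }

    std::string mangle(const std::string& name) const {
        static const char hex[] = "0123456789ABCDEF";
        std::string out;
        for (unsigned char c : name) {
            if (acceptable_.count(static_cast<char>(c)) != 0) {
                out += static_cast<char>(c);
            } else {
                out += '_';
                out += hex[c >> 4];
                out += hex[c & 15];
                out += '_';
            }
        }
        return out;
    }

private:
    std::set<char> acceptable_;
};
```

Under these assumptions a method name such as "-[Foo bar]" is escaped character by character until '[', ']', '(', ')', '-', '+' and ' ' are marked acceptable, after which it passes through unchanged. The concern raised above is the flip side: on targets whose assemblers reject those characters, leaving them unescaped could break symbol names.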
>
> Also, thanks for taking the time to write up the rationale!
>
> Nick
>
>
> -Nick
>
>
>
>>
>>
>>
>>
>> Modified:
>> llvm/trunk/tools/lto/LTOModule.cpp
>> llvm/trunk/tools/lto/LTOModule.h
>>
>> Modified: llvm/trunk/tools/lto/LTOModule.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.cpp?rev=72700&r1=72699&r2=72700&view=diff
>>
>> ===================================================================
>> --- llvm/trunk/tools/lto/LTOModule.cpp (original)
>> +++ llvm/trunk/tools/lto/LTOModule.cpp Mon Jun 1 15:33:09 2009
>> @@ -14,6 +14,7 @@
>>
>> #include "LTOModule.h"
>>
>> +#include "llvm/Constants.h"
>> #include "llvm/Module.h"
>> #include "llvm/ModuleProvider.h"
>> #include "llvm/ADT/OwningPtr.h"
>> @@ -176,11 +177,123 @@
>> }
>> }
>>
>> -void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler
>> &mangler)
>> +// get string that data pointer points to
>> +bool LTOModule::objcClassNameFromExpression(Constant* c,
>> std::string& name)
>> +{
>> + if (ConstantExpr* ce = dyn_cast<ConstantExpr>(c)) {
>>
>> Is there any reason not to just assert on this? Will it ever really
>> be called on constants that aren't known to be objcClassNames?
>>
>>
>> + Constant* op = ce->getOperand(0);
>> + if (GlobalVariable* gvn = dyn_cast<GlobalVariable>(op)) {
>> + Constant* cn = gvn->getInitializer();
>> + if (ConstantArray* ca = dyn_cast<ConstantArray>(cn)) {
>> + if ( ca->isCString() ) {
>> + name = ".objc_class_name_" + ca->getAsString();
>> + return true;
>> + }
>> + }
>> + }
>> + }
>> + return false;
>> +}
>> +
>> +// parse i386/ppc ObjC class data structure
>> +void LTOModule::addObjCClass(GlobalVariable* clgv)
>> +{
>> + if (ConstantStruct* c = dyn_cast<ConstantStruct>(clgv-
>> >getInitializer())) {
>>
>> Here again, perhaps it should just assert that this is true?
>>
>>
>> + // second slot in __OBJC,__class is pointer to superclass
>> name
>> + std::string superclassName;
>> + if ( objcClassNameFromExpression(c->getOperand(1),
>> superclassName) ) {
>> + NameAndAttributes info;
>> + if ( _undefines.find(superclassName.c_str()) ==
>> _undefines.end() ) {
>> + const char* symbolName
>> = ::strdup(superclassName.c_str());
>> + info.name = ::strdup(symbolName);
>> + info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> + // string is owned by _undefines
>> + _undefines[info.name] = info;
>>
>> Perhaps the non-libLTO side of the Apple linker should ignore
>> symbols it knows are objC specific? Or perhaps you could run a
>> domain-specific LLVM pass which would delete these globals as
>> needed? If somehow you need them to exist but still be marked
>> undefined, perhaps libLTO should expose an interface for marking
>> symbols undefined instead of an interface specific to Apple's
>> implementation of the objC language?
>>
>> It just seems like this is breaking encapsulation, the linker has
>> linking rules that don't much care what the source language was.
>>
>> Nick
>>
>>
>> + }
>> + }
>> + // third slot in __OBJC,__class is pointer to class name
>> + std::string className;
>> + if ( objcClassNameFromExpression(c->getOperand(2),
>> className) ) {
>> + const char* symbolName = ::strdup(className.c_str());
>> + NameAndAttributes info;
>> + info.name = symbolName;
>> + info.attributes = (lto_symbol_attributes)
>> + (LTO_SYMBOL_PERMISSIONS_DATA |
>> + LTO_SYMBOL_DEFINITION_REGULAR |
>> + LTO_SYMBOL_SCOPE_DEFAULT);
>> + _symbols.push_back(info);
>> + _defines[info.name] = 1;
>> + }
>> + }
>> +}
>> +
>> +
>> +// parse i386/ppc ObjC category data structure
>> +void LTOModule::addObjCCategory(GlobalVariable* clgv)
>> +{
>> + if (ConstantStruct* c = dyn_cast<ConstantStruct>(clgv-
>> >getInitializer())) {
>> + // second slot in __OBJC,__category is pointer to target
>> class name
>> + std::string targetclassName;
>> + if ( objcClassNameFromExpression(c->getOperand(1),
>> targetclassName) ) {
>> + NameAndAttributes info;
>> + if ( _undefines.find(targetclassName.c_str()) ==
>> _undefines.end() ){
>> + const char* symbolName
>> = ::strdup(targetclassName.c_str());
>> + info.name = ::strdup(symbolName);
>> + info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> + // string is owned by _undefines
>> + _undefines[info.name] = info;
>> + }
>> + }
>> + }
>> +}
>> +
>> +
>> +// parse i386/ppc ObjC class list data structure
>> +void LTOModule::addObjCClassRef(GlobalVariable* clgv)
>> +{
>> + std::string targetclassName;
>> + if ( objcClassNameFromExpression(clgv->getInitializer(),
>> targetclassName) ){
>> + NameAndAttributes info;
>> + if ( _undefines.find(targetclassName.c_str()) ==
>> _undefines.end() ) {
>> + const char* symbolName
>> = ::strdup(targetclassName.c_str());
>> + info.name = ::strdup(symbolName);
>> + info.attributes = LTO_SYMBOL_DEFINITION_UNDEFINED;
>> + // string is owned by _undefines
>> + _undefines[info.name] = info;
>> + }
>> + }
>> +}
>> +
>> +
>> +void LTOModule::addDefinedDataSymbol(GlobalValue* v, Mangler&
>> mangler)
>> {
>> // add to list of defined symbols
>> addDefinedSymbol(v, mangler, false);
>>
>> + // special case i386/ppc ObjC data structures in magic sections
>> + if ( v->hasSection() ) {
>> + // special case if this data blob is an ObjC class
>> definition
>> + if ( v->getSection().compare(0, 15, "__OBJC,__class,") ==
>> 0 ) {
>> + if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> + addObjCClass(gv);
>> + }
>> + }
>> +
>> + // special case if this data blob is an ObjC category
>> definition
>> + else if ( v->getSection().compare(0, 18,
>> "__OBJC,__category,") == 0 ) {
>> + if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> + addObjCCategory(gv);
>> + }
>> + }
>> +
>> + // special case if this data blob is the list of
>> referenced classes
>> + else if ( v->getSection().compare(0, 18,
>> "__OBJC,__cls_refs,") == 0 ) {
>> + if (GlobalVariable* gv = dyn_cast<GlobalVariable>(v)) {
>> + addObjCClassRef(gv);
>> + }
>> + }
>> + }
>> +
>> // add external symbols referenced by this data.
>> for (unsigned count = 0, total = v->getNumOperands();
>> count != total; ++count) {
>> @@ -192,9 +305,13 @@
>> void LTOModule::addDefinedSymbol(GlobalValue* def, Mangler &mangler,
>> bool isFunction)
>> {
>> + // ignore all llvm.* symbols
>> + if ( strncmp(def->getNameStart(), "llvm.", 5) == 0 )
>> + return;
>> +
>> // string is owned by _defines
>> const char* symbolName
>> = ::strdup(mangler.getValueName(def).c_str());
>> -
>> +
>> // set alignment part log2() can have rounding errors
>> uint32_t align = def->getAlignment();
>> uint32_t attr = align ? CountTrailingZeros_32(def-
>> >getAlignment()) : 0;
>> @@ -241,25 +358,28 @@
>> }
>>
>> void LTOModule::addAsmGlobalSymbol(const char *name) {
>> - // string is owned by _defines
>> - const char *symbolName = ::strdup(name);
>> - uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
>> - attr |= LTO_SYMBOL_SCOPE_DEFAULT;
>> -
>> - // add to table of symbols
>> - NameAndAttributes info;
>> - info.name = symbolName;
>> - info.attributes = (lto_symbol_attributes)attr;
>> - _symbols.push_back(info);
>> - _defines[info.name] = 1;
>> + // only add new define if not already defined
>> + if ( _defines.count(name, &name[strlen(name)+1]) == 0 )
>> + return;
>> +
>> + // string is owned by _defines
>> + const char *symbolName = ::strdup(name);
>> + uint32_t attr = LTO_SYMBOL_DEFINITION_REGULAR;
>> + attr |= LTO_SYMBOL_SCOPE_DEFAULT;
>> + NameAndAttributes info;
>> + info.name = symbolName;
>> + info.attributes = (lto_symbol_attributes)attr;
>> + _symbols.push_back(info);
>> + _defines[info.name] = 1;
>> }
>>
>> void LTOModule::addPotentialUndefinedSymbol(GlobalValue* decl,
>> Mangler &mangler)
>> {
>> - const char* name = mangler.getValueName(decl).c_str();
>> // ignore all llvm.* symbols
>> - if ( strncmp(name, "llvm.", 5) == 0 )
>> - return;
>> + if ( strncmp(decl->getNameStart(), "llvm.", 5) == 0 )
>> + return;
>> +
>> + const char* name = mangler.getValueName(decl).c_str();
>>
>> // we already have the symbol
>> if (_undefines.find(name) != _undefines.end())
>> @@ -306,6 +426,14 @@
>>
>> // Use mangler to add GlobalPrefix to names to match linker
>> names.
>> Mangler mangler(*_module, _target->getTargetAsmInfo()-
>> >getGlobalPrefix());
>> + // add chars used in ObjC method names so method names
>> aren't mangled
>> + mangler.markCharAcceptable('[');
>> + mangler.markCharAcceptable(']');
>> + mangler.markCharAcceptable('(');
>> + mangler.markCharAcceptable(')');
>> + mangler.markCharAcceptable('-');
>> + mangler.markCharAcceptable('+');
>> + mangler.markCharAcceptable(' ');
>>
>> // add functions
>> for (Module::iterator f = _module->begin(); f != _module-
>> >end(); ++f) {
>>
>> Modified: llvm/trunk/tools/lto/LTOModule.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.h?rev=72700&r1=72699&r2=72700&view=diff
>>
>> ===================================================================
>> --- llvm/trunk/tools/lto/LTOModule.h (original)
>> +++ llvm/trunk/tools/lto/LTOModule.h Mon Jun 1 15:33:09 2009
>> @@ -77,13 +77,19 @@
>> void addDefinedDataSymbol(llvm::GlobalValue*
>> v,
>>
>> llvm::Mangler &mangler);
>> void addAsmGlobalSymbol(const char *);
>> + void addObjCClass(llvm::GlobalVariable*
>> clgv);
>> + void addObjCCategory(llvm::GlobalVariable*
>> clgv);
>> + void addObjCClassRef(llvm::GlobalVariable*
>> clgv);
>> + bool
>> objcClassNameFromExpression(llvm::Constant* c,
>> + std::string&
>> name);
>> +
>> static bool isTargetMatch(llvm::MemoryBuffer*
>> memBuffer,
>> const char*
>> triplePrefix);
>> -
>> +
>> static LTOModule* makeLTOModule(llvm::MemoryBuffer* buffer,
>>
>> std::string& errMsg);
>> static llvm::MemoryBuffer* makeBuffer(const void* mem, size_t
>> length);
>> -
>> +
>> typedef llvm::StringMap<uint8_t> StringSet;
>>
>> struct NameAndAttributes {
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>