[llvm] r199698 - Tweak the MCExternalSymbolizer to not use the SymbolLookUp() call back

Kevin Enderby enderby at apple.com
Fri Jan 24 10:58:15 PST 2014


Hi Rafael,

I would love to do that.  But it highly depends on how the call backs are used for the llvm disassembler API.  So to generate a test case it would involve writing and building programs to use the API in specific ways.  While this would be nice it seems overkill for a half line of code I added.

I did try to provide detailed comments in the code and a very detailed commit log with examples explaining what I’m doing and why.  As I knew I would not have a test case to offer.  I do make a point to try to create test cases for all my fixes to llvm.  But for some tweaks to the llvm disassembler API like this one it would take a great deal of time and effort to produce a test case.

My thoughts,
Kev

On Jan 23, 2014, at 9:19 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> We should probably try to figure a way to have tests for this in llvm itself.
> 
> On 20 January 2014 19:23, Kevin Enderby <enderby at apple.com> wrote:
>> Author: enderby
>> Date: Mon Jan 20 18:23:17 2014
>> New Revision: 199698
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=199698&view=rev
>> Log:
>> Tweak the MCExternalSymbolizer to not use the SymbolLookUp() call back
>> to not guess at a symbol name in some cases.
>> 
>> The problem is that in object files assembled starting at address 0, when
>> trying to symbolicate something that starts like this:
>> 
>> % cat x.s
>> _t1:
>>        vpshufd $0x0, %xmm1, %xmm0
>> 
>> the symbolic disassembly can end up like this:
>> 
>> % otool -tV x.o
>> x.o:
>> (__TEXT,__text) section
>> _t1:
>> 0000000000000000        vpshufd $_t1, %xmm1, %xmm0
>> 
>> Which is in this case produced incorrect symbolication.
>> 
>> But it is useful in some cases to use the SymbolLookUp() call back
>> to guess at some immediate values.  For example one like this
>> that does not have an external relocation entry:
>> 
>> % cat y.s
>> _t1:
>>        movl    $_d1, %eax
>> .data
>> _d1:    .long   0
>> 
>> % clang -c -arch i386 y.s
>> 
>> % otool -tV y.o
>> y.o:
>> (__TEXT,__text) section
>> _t1:
>> 0000000000000000        movl    $_d1, %eax
>> 
>> % otool -rv y.o
>> y.o:
>> Relocation information (__TEXT,__text) 1 entries
>> address  pcrel length extern type    scattered symbolnum/value
>> 00000001 False long   False  VANILLA False     2 (__DATA,__data)
>> 
>> So the change is based on it is not likely that an immediate Value
>> coming from an instruction field of a width of 1 byte, other than branches
>> and items with relocation, are not likely symbol addresses.
>> 
>> With the change the first case above simply becomes:
>> 
>> % otool -tV x.o
>> x.o:
>> (__TEXT,__text) section
>> _t1:
>> 0000000000000000        vpshufd $0x0, %xmm1, %xmm0
>> 
>> and the second case continues to work as expected.
>> 
>> rdar://14863405
>> 
>> Modified:
>>    llvm/trunk/lib/MC/MCExternalSymbolizer.cpp
>> 
>> Modified: llvm/trunk/lib/MC/MCExternalSymbolizer.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCExternalSymbolizer.cpp?rev=199698&r1=199697&r2=199698&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/MC/MCExternalSymbolizer.cpp (original)
>> +++ llvm/trunk/lib/MC/MCExternalSymbolizer.cpp Mon Jan 20 18:23:17 2014
>> @@ -43,8 +43,19 @@ bool MCExternalSymbolizer::tryAddingSymb
>>       !GetOpInfo(DisInfo, Address, Offset, InstSize, 1, &SymbolicOp)) {
>>     // Clear SymbolicOp.Value from above and also all other fields.
>>     std::memset(&SymbolicOp, '\0', sizeof(struct LLVMOpInfo1));
>> -    if (!SymbolLookUp)
>> +
>> +    // At this point, GetOpInfo() did not find any relocation information about
>> +    // this operand and we are left to use the SymbolLookUp() call back to guess
>> +    // if the Value is the address of a symbol.  In the case this is a branch
>> +    // that always makes sense to guess.  But in the case of an immediate it is
>> +    // a bit more questionable if it is an address of a symbol or some other
>> +    // reference.  So if the immediate Value comes from a width of 1 byte,
>> +    // InstSize, we will not guess it is an address of a symbol.  Because in
>> +    // object files assembled starting at address 0 this usually leads to
>> +    // incorrect symbolication.
>> +    if (!SymbolLookUp || (InstSize == 1 && !IsBranch))
>>       return false;
>> +
>>     uint64_t ReferenceType;
>>     if (IsBranch)
>>        ReferenceType = LLVMDisassembler_ReferenceType_In_Branch;
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list