[llvm] r199698 - Tweak the MCExternalSymbolizer to not use the SymbolLookUp() call back
Kevin Enderby
enderby at apple.com
Fri Jan 24 10:58:15 PST 2014
Hi Rafael,
I would love to do that. But it highly depends on how the call backs are used for the llvm disassembler API. So to generate a test case it would involve writing and building programs to use the API in specific ways. While this would be nice it seems overkill for a half line of code I added.
I did try to provide detailed comments in the code and a very detailed commit log with examples explaining what I’m doing and why. As I knew I would not have a test case to offer. I do make a point to try to create test cases for all my fixes to llvm. But for some tweaks to the llvm disassembler API like this one it would take a great deal of time and effort to produce a test case.
My thoughts,
Kev
On Jan 23, 2014, at 9:19 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> We should probably try to figure a way to have tests for this in llvm itself.
>
> On 20 January 2014 19:23, Kevin Enderby <enderby at apple.com> wrote:
>> Author: enderby
>> Date: Mon Jan 20 18:23:17 2014
>> New Revision: 199698
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=199698&view=rev
>> Log:
>> Tweak the MCExternalSymbolizer to not use the SymbolLookUp() call back
>> to not guess at a symbol name in some cases.
>>
>> The problem is that in object files assembled starting at address 0, when
>> trying to symbolicate something that starts like this:
>>
>> % cat x.s
>> _t1:
>> vpshufd $0x0, %xmm1, %xmm0
>>
>> the symbolic disassembly can end up like this:
>>
>> % otool -tV x.o
>> x.o:
>> (__TEXT,__text) section
>> _t1:
>> 0000000000000000 vpshufd $_t1, %xmm1, %xmm0
>>
>> Which is in this case produced incorrect symbolication.
>>
>> But it is useful in some cases to use the SymbolLookUp() call back
>> to guess at some immediate values. For example one like this
>> that does not have an external relocation entry:
>>
>> % cat y.s
>> _t1:
>> movl $_d1, %eax
>> .data
>> _d1: .long 0
>>
>> % clang -c -arch i386 y.s
>>
>> % otool -tV y.o
>> y.o:
>> (__TEXT,__text) section
>> _t1:
>> 0000000000000000 movl $_d1, %eax
>>
>> % otool -rv y.o
>> y.o:
>> Relocation information (__TEXT,__text) 1 entries
>> address pcrel length extern type scattered symbolnum/value
>> 00000001 False long False VANILLA False 2 (__DATA,__data)
>>
>> So the change is based on it is not likely that an immediate Value
>> coming from an instruction field of a width of 1 byte, other than branches
>> and items with relocation, are not likely symbol addresses.
>>
>> With the change the first case above simply becomes:
>>
>> % otool -tV x.o
>> x.o:
>> (__TEXT,__text) section
>> _t1:
>> 0000000000000000 vpshufd $0x0, %xmm1, %xmm0
>>
>> and the second case continues to work as expected.
>>
>> rdar://14863405
>>
>> Modified:
>> llvm/trunk/lib/MC/MCExternalSymbolizer.cpp
>>
>> Modified: llvm/trunk/lib/MC/MCExternalSymbolizer.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCExternalSymbolizer.cpp?rev=199698&r1=199697&r2=199698&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/MC/MCExternalSymbolizer.cpp (original)
>> +++ llvm/trunk/lib/MC/MCExternalSymbolizer.cpp Mon Jan 20 18:23:17 2014
>> @@ -43,8 +43,19 @@ bool MCExternalSymbolizer::tryAddingSymb
>> !GetOpInfo(DisInfo, Address, Offset, InstSize, 1, &SymbolicOp)) {
>> // Clear SymbolicOp.Value from above and also all other fields.
>> std::memset(&SymbolicOp, '\0', sizeof(struct LLVMOpInfo1));
>> - if (!SymbolLookUp)
>> +
>> + // At this point, GetOpInfo() did not find any relocation information about
>> + // this operand and we are left to use the SymbolLookUp() call back to guess
>> + // if the Value is the address of a symbol. In the case this is a branch
>> + // that always makes sense to guess. But in the case of an immediate it is
>> + // a bit more questionable if it is an address of a symbol or some other
>> + // reference. So if the immediate Value comes from a width of 1 byte,
>> + // InstSize, we will not guess it is an address of a symbol. Because in
>> + // object files assembled starting at address 0 this usually leads to
>> + // incorrect symbolication.
>> + if (!SymbolLookUp || (InstSize == 1 && !IsBranch))
>> return false;
>> +
>> uint64_t ReferenceType;
>> if (IsBranch)
>> ReferenceType = LLVMDisassembler_ReferenceType_In_Branch;
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list