[llvm-dev] Relocation design of different architecture

Siddharth Shankar Swain via llvm-dev llvm-dev at lists.llvm.org
Thu Apr 27 00:35:17 PDT 2017

Hi all,
I have been implementing dynamic linking and relocation for the Hexagon
target. After implementing runtime dynamic linking and relocation, I wrote
some test cases. When I run them, they fail with "Program used external
function 'printf' which could not be resolved!". Can anyone explain why this
error occurs and how to resolve it?
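
For context, here is a toy model of how an error like this can arise. It is an
assumption about the general mechanism, not a diagnosis of the Hexagon port:
when loaded code refers to an external symbol, the loader asks some resolver
for its address, and the lookup fails if nothing has registered that name.
ToyResolver, unresolvedMessage and selfCheck are hypothetical names used only
for illustration; none of this is LLVM code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the table a JIT consults for external symbols.
// This is not an LLVM class.
class ToyResolver {
public:
    // Registers `name` as living at `address`.
    void define(const std::string &name, std::uintptr_t address) {
        symbols_[name] = address;
    }

    // Stores the address of `name` in `address` and returns true,
    // or returns false when nothing supplies the symbol.
    bool resolve(const std::string &name, std::uintptr_t &address) const {
        auto it = symbols_.find(name);
        if (it == symbols_.end())
            return false;
        address = it->second;
        return true;
    }

private:
    std::map<std::string, std::uintptr_t> symbols_;
};

// Builds the kind of diagnostic a loader could report for a failed lookup.
std::string unresolvedMessage(const std::string &name) {
    return "Program used external function '" + name +
           "' which could not be resolved!";
}

// Walks through the failing lookup and then the succeeding one.
void selfCheck() {
    ToyResolver resolver;
    std::uintptr_t address = 0;
    assert(!resolver.resolve("printf", address));  // nothing registered yet
    resolver.define("printf", 0x1000);             // made-up address
    assert(resolver.resolve("printf", address));
    assert(address == 0x1000);
}
```

In this model the fix is to make sure something defines the symbol before the
loaded code needs it. Whether that is the actual cause here depends on how the
test harness supplies host symbols to the JIT.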

On Fri, Apr 21, 2017 at 11:16 PM, Siddharth Shankar Swain <
h2015096 at pilani.bits-pilani.ac.in> wrote:

> Thanks. I am trying to settle on a relocation and linking design for the
> Hexagon architecture: whether to follow the MIPS style of relocation or
> that of another architecture. That is my question, and it is why I was
> asking about the functions and their differences. Please guide.
> Thanks,
> Siddharth
> On Fri, Apr 21, 2017 at 8:37 PM, mats petersson <mats at planetcatfish.com>
> wrote:
>> If you look at the actual code, it's fairly obvious that the approach is
>> different: the COFF versions have a single architecture per class, while
>> the ELF version supports many different architectures in the same source file.
>> I'm not going to go through hundreds of lines of code and explain exactly
>> how they are different (mostly because I'm lazy, but partly because I
>> don't actually KNOW this code - I'm just reading it with a moderate
>> understanding of the overall goal and general understanding of how the
>> process of linking and loading works in other software systems).
>> It is not clear to me why you are asking these questions. Are you
>> planning to change/extend some of this code, or doing something else?
>> Explaining what you want to achieve, rather than asking very open-ended
>> questions would probably be a better way to reach your own goal. I may not
>> be able to give you an answer, but there are people on this mailing list
>> who have written this code and/or are currently maintaining it. They are
>> perhaps busy and may not necessarily engage with generic questions about the
>> overall code, but specific questions will get more attention.
>> --
>> Mats
>> On 21 April 2017 at 14:54, Siddharth Shankar Swain <
>> h2015096 at pilani.bits-pilani.ac.in> wrote:
>>> Thanks for the reply, it was really helpful. Can you be more specific and
>>> explain processRelocationRef() and resolveRelocation() in
>>> Targets/RuntimeDyld(objectfile format)(arch).h and also in
>>> RuntimeDyldELF.cpp, and how the same function is implemented in different
>>> ways in both files?
>>> Thanks,
>>> Siddharth
>>> On Thu, Apr 20, 2017 at 8:16 PM, mats petersson <mats at planetcatfish.com>
>>> wrote:
>>>> (Again: Please always REPLY to all recipients, including llvm-dev,
>>>> unless there are VERY specific reasons not to)
>>>> The ELF support for relocation is all baked into a single, large
>>>> function for all different processor architectures. In my humble opinion,
>>>> it would make the code simpler and more readable to implement this code as
>>>> multiple derived classes based on architecture (there are several "if(Arch
>>>> == ...)" or similar, then a large section of code for that architecture).
>>>> But I've not worked on this code personally, and this is just from a basic
>>>> "look at the code for a few minutes to understand it". It's probably one of
>>>> those things that has evolved over time - originally only one or two
>>>> processor architectures were supported, then someone added one or two
>>>> more, and eventually you have a function that is ~600 lines of code and a
>>>> file that is over 1800 lines, compared to the COFF_x86_64 class that is
>>>> just over 200 lines for the entire file. There are positive and negative
>>>> things about having large or small functions, but my personal choice would
>>>> be a split - that's not to say that such a split ends up "near the top" of
>>>> the priority list of "things to do to make LLVM better" - presumably the
>>>> code works as it is, so changing it MAY break things.
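
The "one class, branch on architecture" shape described above can be sketched
roughly as follows. This is a minimal illustration under stated assumptions,
not the code in RuntimeDyldELF.cpp: Arch, MonolithicLoader, resolveX86,
resolveX86_64 and bytesPatched are hypothetical names, and the only
"relocation" modelled is writing an absolute address in the target's pointer
width.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical architecture tags; not llvm::Triple values.
enum class Arch { X86, X86_64 };

// One loader class for every architecture: each new target adds a case to
// the switch and another helper, so the file grows with the target count.
class MonolithicLoader {
public:
    explicit MonolithicLoader(Arch arch) : arch_(arch) {}

    // Writes the absolute address `value` at `fixup`, using the pointer
    // width of the selected architecture.
    void resolveRelocation(std::uint8_t *fixup, std::uint64_t value) const {
        assert(fixup != nullptr);
        switch (arch_) {
        case Arch::X86:
            resolveX86(fixup, value);
            break;
        case Arch::X86_64:
            resolveX86_64(fixup, value);
            break;
        }
    }

private:
    // 32-bit target: store the low 4 bytes of the address.
    static void resolveX86(std::uint8_t *fixup, std::uint64_t value) {
        const std::uint32_t v = static_cast<std::uint32_t>(value);
        std::memcpy(fixup, &v, sizeof v);
    }

    // 64-bit target: store all 8 bytes of the address.
    static void resolveX86_64(std::uint8_t *fixup, std::uint64_t value) {
        std::memcpy(fixup, &value, sizeof value);
    }

    Arch arch_;
};

// Returns how many leading bytes of an 8-byte, 0xFF-filled buffer are
// overwritten when the address 0 is written for `arch`.
int bytesPatched(Arch arch) {
    std::uint8_t buf[8];
    std::memset(buf, 0xFF, sizeof buf);
    MonolithicLoader loader(arch);
    loader.resolveRelocation(buf, 0);
    int n = 0;
    while (n < 8 && buf[n] == 0)
        ++n;
    return n;
}
```

Splitting this into one derived class per architecture would move each helper
into its own small file, at the cost of touching code that already works.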
>>>> --
>>>> Mats
>>>> On 20 April 2017 at 15:14, Siddharth Shankar Swain <
>>>> h2015096 at pilani.bits-pilani.ac.in> wrote:
>>>>> Thanks for the reply. It was really helpful. To be more specific,
>>>>> there is a processRelocationRef() and a resolveRelocation() in
>>>>> Targets/RuntimeDyld(objectfile format)(arch).h and also in
>>>>> RuntimeDyldELF.cpp. What is the difference between these two, and for
>>>>> what different purposes are they used?
>>>>> Thanks,
>>>>> Siddharth
>>>>> On Thu, Apr 20, 2017 at 7:10 PM, mats petersson <
>>>>> mats at planetcatfish.com> wrote:
>>>>>> Note: Please include the mailing list when replying to discussions,
>>>>>> as someone else may well want to see the discussion, and may be better
>>>>>> placed to answer.
>>>>>> Like I've tried to explain, there is a generic piece of code that
>>>>>> understands how to load code in general (the class RuntimeDyld and related
>>>>>> bits), and then specific implementations that derive from a base class to
>>>>>> do the specific relocation and exception handling for that particular
>>>>>> hardware and file-format - for each supported processor architecture and
>>>>>> file-format, there needs to be a specific class that implements some
>>>>>> functions (processRelocationRef is one of those). Technically, it looks
>>>>>> like it's using a "pImpl" pattern, but the basic principle is the same
>>>>>> either way - generic code handles the generic case, a derived class that
>>>>>> understands how to deal with the specifics is used to actually perform
>>>>>> relocations in that particular case.
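
The arrangement described above (generic code in front, a target-specific
class behind it) can be sketched as below. This is a simplified illustration,
not LLVM's classes: LoaderImpl, CoffI386Impl, CoffX86_64Impl and Loader are
hypothetical names, and processRelocationRef() here only returns a label,
whereas the real function takes different parameters and does real work.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Base class: the generic code only knows that relocations must be
// processed, not how a given target does it.
class LoaderImpl {
public:
    virtual ~LoaderImpl() = default;
    // Returns a label naming the relocation family handled (illustration only).
    virtual std::string processRelocationRef() const = 0;
};

// Target-specific implementation for 32-bit x86 COFF objects.
class CoffI386Impl : public LoaderImpl {
public:
    std::string processRelocationRef() const override {
        return "IMAGE_REL_I386_*";
    }
};

// Target-specific implementation for 64-bit x86 COFF objects.
class CoffX86_64Impl : public LoaderImpl {
public:
    std::string processRelocationRef() const override {
        return "IMAGE_REL_AMD64_*";
    }
};

// Facade in the pImpl style: callers hold a Loader and never see the
// derived classes; the implementation is chosen once, at construction.
class Loader {
public:
    explicit Loader(const std::string &arch) {
        if (arch == "i386")
            impl_.reset(new CoffI386Impl());
        else if (arch == "x86_64")
            impl_.reset(new CoffX86_64Impl());
    }

    // True when an implementation exists for the requested architecture.
    bool supported() const { return impl_ != nullptr; }

    // Generic entry point: forwards the target-specific step to the impl.
    std::string loadObject() const {
        assert(impl_ && "unsupported architecture");
        return impl_->processRelocationRef();
    }

private:
    std::unique_ptr<LoaderImpl> impl_;
};
```

In this sketch, supporting a new target means adding one derived class and one
branch in the constructor; the generic loadObject() path stays unchanged.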
>>>>>> Exception handling is also target-specific, so in x86-64 and i386,
>>>>>> how exception information is stored and used is different (I don't know the
>>>>>> exact details in this case, as COFF is the file-format used on Windows and
>>>>>> it's been at least 8 or 10 years since I did any programming at all on a
>>>>>> Windows machine; I know that i386 on Linux uses an exception table, and
>>>>>> x86-64 on Linux essentially has debug information [DWARF tables]). The
>>>>>> exception information is used to determine how to unwind the stack and
>>>>>> destroy objects on the way back to the "catch" for that particular
>>>>>> exception. There is code required both to load the exception tables into
>>>>>> memory, and to interpret/use those tables - but I'm not overly familiar
>>>>>> with how that works for JIT'd code. [Actually, looking at the code for
>>>>>> x86-64, it looks like it's mainly SEH (Structured Exception Handling) that
>>>>>> is dealt with - the overall concept still applies, but SEH is a Windows
>>>>>> concept for handling exceptions, which includes hardware exceptions such as
>>>>>> integer division by zero and memory access exceptions - regular C++
>>>>>> exceptions are dealt with separately, and that is what uses what I
>>>>>> described for Linux earlier in this paragraph].
>>>>>> As to WHY different architectures use different relocations and
>>>>>> exception handling tables, that's an ABI design issue - a convention that
>>>>>> is based on the needs and requirements for each architecture, and a bunch
>>>>>> of compromises between simplicity (a very simple table is easy to
>>>>>> construct), space (simple table takes up more space than a more complex
>>>>>> table construction - like a zip file or a text file - the zip file is more
>>>>>> complicated to read, but takes up a lot less space) and code complexity
>>>>>> (save space in table, more complex code most likely). Either way, for a
>>>>>> given platform (OS, Processor, file format), there is a given ABI for
>>>>>> handling exceptions. The loader needs to load the table in the correct way
>>>>>> into the correct part of memory, and when an exception is thrown, the
>>>>>> table(s) need to be understood and acted upon to find the way back to the
>>>>>> relevant place where the exception is caught.
>>>>>> The fact that the classes are declared in different files is similar
>>>>>> to my simple animal example, where you'd have an animal.h for the base
>>>>>> class, and a cat.h, dog.h and fish.h for the actual implementations. Obviously,
>>>>>> the specific implementations for the RuntimeDyld belong in "Targets"
>>>>>> because they are dependent on the actual target (which is the combination
>>>>>> of file-format, OS and processor architecture).
>>>>>> --
>>>>>> Mats
>>>>>> On 20 April 2017 at 14:04, Siddharth Shankar Swain <
>>>>>> h2015096 at pilani.bits-pilani.ac.in> wrote:
>>>>>>> So RuntimeDyldELF.cpp, RuntimeDyldCOFF.cpp and
>>>>>>> RuntimeDyldMachO.cpp do relocation and linking for a specific object
>>>>>>> file format and for all architectures using that object file format. Am I
>>>>>>> correct? If so, these .cpp files are not using any header file
>>>>>>> in Targets/, so what are the header files in Targets/ for? Also,
>>>>>>> why do these header files in Targets/ handle exceptions, and
>>>>>>> with different concepts such as exception frames and exception
>>>>>>> tables? Please guide.
>>>>>>> Thanks,
>>>>>>> Siddharth
>>>>>>> On Thu, Apr 20, 2017 at 6:06 PM, mats petersson <
>>>>>>> mats at planetcatfish.com> wrote:
>>>>>>>> Basic Object Oriented design uses a derived class to implement a
>>>>>>>> functionality of the generic case. It's the same basic principle as:
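>>>>>>>> #include <iostream>
>>>>>>>> // Base class: declares what every animal can do, not how.
>>>>>>>> class Animal
>>>>>>>> {
>>>>>>>> public:
>>>>>>>>     virtual ~Animal() = default;
>>>>>>>>     virtual void Say() = 0;
>>>>>>>> };
>>>>>>>> // Each derived class supplies its own version of Say().
>>>>>>>> class Cat: public Animal
>>>>>>>> {
>>>>>>>> public:
>>>>>>>>     void Say() override { std::cout << "Meow!" << std::endl; }
>>>>>>>> };
>>>>>>>> class Dog: public Animal
>>>>>>>> {
>>>>>>>> public:
>>>>>>>>     void Say() override { std::cout << "Woof!" << std::endl; }
>>>>>>>> };
>>>>>>>> class Fish: public Animal
>>>>>>>> {
>>>>>>>> public:
>>>>>>>>     void Say() override { std::cout << "Blub!" << std::endl; }
>>>>>>>> };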
>>>>>>>> In this case, different types of COFF-architectures use different
>>>>>>>> relocation entries, and based on the architecture, a specific
>>>>>>>> implementation of the RuntimeDyldCOFF class is created to perform the
>>>>>>>> relocation.
>>>>>>>> See http://llvm.org/docs/doxygen/html/classllvm_1_1RuntimeDyldCOFF.html
>>>>>>>> for a class diagram of how this is done.
>>>>>>>> The generic code in RuntimeDyld*.cpp only knows that relocations
>>>>>>>> exist, and that they need to be dealt with, not HOW to actually perform
>>>>>>>> the relocation - just like "Animal" doesn't know what a cat or a dog
>>>>>>>> "says". The processRelocationRef() is called here:
>>>>>>>> http://llvm.org/docs/doxygen/html/RuntimeDyld_8cpp_source.html#l00251
>>>>>>>> Again, it's not clear exactly what you are asking for, so I'm not
>>>>>>>> sure whether my explanation is helpful or not...
>>>>>>>> --
>>>>>>>> Mats
>>>>>>>> On 20 April 2017 at 12:05, Siddharth Shankar Swain <
>>>>>>>> h2015096 at pilani.bits-pilani.ac.in> wrote:
>>>>>>>>> Thanks for the reply. I was asking, in general, about the
>>>>>>>>> header files in Targets/ for the different architectures: they do not
>>>>>>>>> provide any function except processRelocationRef() to be used in
>>>>>>>>> RuntimeDyldELF.cpp, RuntimeDyldCOFF.cpp or RuntimeDyldMachO.cpp, and I
>>>>>>>>> think those files are the ones actually doing the relocation and
>>>>>>>>> linking work. So what purpose do the header files inside Targets/
>>>>>>>>> actually serve? They also include exception handling in the form of
>>>>>>>>> exception frames. Can you guide me on this?
>>>>>>>>> Thanks,
>>>>>>>>> Siddharth
>>>>>>>>> On Thu, Apr 20, 2017 at 4:02 PM, mats petersson <
>>>>>>>>> mats at planetcatfish.com> wrote:
>>>>>>>>>> The x86_64 and i386 architectures have different actual
>>>>>>>>>> relocation records. So if you build code for i386, you need one
>>>>>>>>>> processRelocationRef() function (handling the relevant relocations in that
>>>>>>>>>> model), and when producing code for x86_64, there are different relocation
>>>>>>>>>> records. The two files contain the derived form of the class that processes
>>>>>>>>>> the relocation records when dynamically loading JITed code in LLVM - mainly
>>>>>>>>>> implementing the two different forms of symbol entries that refer to the
>>>>>>>>>> relocations - i386 uses COFF::IMAGE_REL_I386_*, in x86_64 the relocation
>>>>>>>>>> types are COFF::IMAGE_REL_AMD64_*.
>>>>>>>>>> Conceptually, they do the same thing; it's the details of exactly
>>>>>>>>>> how and where the relocation ends up and how it's recorded by the linker
>>>>>>>>>> that differ.
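
To make "how and where the relocation ends up" concrete, here is a minimal
sketch of two common relocation flavours: an absolute 32-bit address and a
32-bit PC-relative displacement. It is an illustration under stated
assumptions, not LLVM's implementation: RelocKind, relocValue and applyReloc
are hypothetical names standing in for the real COFF::IMAGE_REL_I386_* and
COFF::IMAGE_REL_AMD64_* handling, and the PC-relative case assumes the
displacement is measured from the end of the 4-byte field.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical relocation kinds; real COFF objects use numeric
// IMAGE_REL_I386_* / IMAGE_REL_AMD64_* type codes instead.
enum class RelocKind { Absolute32, PcRelative32 };

// Computes the 32-bit value to store in a fixup located at address
// `fixupAddr`, given the final address of the symbol it refers to.
std::uint32_t relocValue(RelocKind kind, std::uint64_t target,
                         std::uint64_t fixupAddr, std::int64_t addend) {
    switch (kind) {
    case RelocKind::Absolute32:
        // The field holds the target address itself.
        return static_cast<std::uint32_t>(target + addend);
    case RelocKind::PcRelative32:
        // The field holds the distance from the end of the 4-byte field
        // to the target (assumed convention, see lead-in).
        return static_cast<std::uint32_t>(target + addend - (fixupAddr + 4));
    }
    return 0;
}

// Patches `section`, which is loaded at `loadAddr`, at byte `offset`.
void applyReloc(std::uint8_t *section, std::uint64_t loadAddr,
                std::uint64_t offset, RelocKind kind, std::uint64_t target,
                std::int64_t addend) {
    assert(section != nullptr);
    const std::uint32_t v = relocValue(kind, target, loadAddr + offset, addend);
    std::memcpy(section + offset, &v, sizeof v);
}
```

Both cases patch the same four bytes; only the arithmetic differs, which is
the kind of per-target detail a processRelocationRef()/resolveRelocation()
pair has to encode.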
>>>>>>>>>> Theoretically, one could probably construct a loadable file that
>>>>>>>>>> doesn't care what architecture it is for, but it would end up with a lot of
>>>>>>>>>> redundant & overlapping functionality, and the code to handle every
>>>>>>>>>> different architecture in one huge switch-statement would be rather complex
>>>>>>>>>> (and long!). So splitting the functionality per architecture helps make the
>>>>>>>>>> code clear.
>>>>>>>>>> If you need further help to understand the code, you'll probably
>>>>>>>>>> need to ask a more concrete question, as it is probably not possible to
>>>>>>>>>> describe all the relevant information on this subject in less than 200
>>>>>>>>>> pages, never mind a simple email-thread.
>>>>>>>>>> --
>>>>>>>>>> Mats
>>>>>>>>>> On 20 April 2017 at 11:13, Siddharth Shankar Swain via llvm-dev <
>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> Can anyone explain the header files in
>>>>>>>>>>> lib/ExecutionEngine/RuntimeDyld/Targets/ for the different
>>>>>>>>>>> architectures, such as RuntimeDyldCOFFX86_64.h or
>>>>>>>>>>> RuntimeDyldCOFFI386.h? What is the connection of these files to
>>>>>>>>>>> relocation and linking, given that linking and relocation for the
>>>>>>>>>>> different architectures is done in RuntimeDyldELF.cpp and
>>>>>>>>>>> RuntimeDyldCOFF.cpp, which do not use any function from these header
>>>>>>>>>>> files except processRelocationRef()? The header files in Targets/ also
>>>>>>>>>>> handle exceptions, so why is that needed in the relocation and linking
>>>>>>>>>>> process? Also, what does processRelocationRef() actually do? Please
>>>>>>>>>>> guide.
>>>>>>>>>>> sincerely,
>>>>>>>>>>> Siddharth