[llvm-dev] Relocation design of different architecture

Siddharth Shankar Swain via llvm-dev llvm-dev at lists.llvm.org
Fri Apr 21 06:54:00 PDT 2017


Thanks for the reply, it was really helpful. Can you be more specific about
processRelocationRef() and resolveRelocation() in
Targets/RuntimeDyld(objectfile format)(arch).h and in RuntimeDyldELF.cpp,
and explain how the same functions are implemented differently in the two
places?
Thanks,
Siddharth

On Thu, Apr 20, 2017 at 8:16 PM, mats petersson <mats at planetcatfish.com>
wrote:

> (Again: Please always REPLY to all recipients, including llvm-dev, unless
> there are VERY specific reasons not to)
>
> The ELF support for relocation is all baked into a single, large function
> for all different processor architectures. In my humble opinion, it would
> make the code simpler and more readable to implement this code as multiple
> derived classes based on architecture (there are several "if(Arch == ...)"
> or similar, then a large section of code for that architecture). But I've
> not worked on this code personally, and this is just from a basic "look at
> the code for a few minutes to understand it". It's probably one of those
> things that has evolved over time - originally only one or two processor
> architectures were supported, then someone added one or two more, and
> eventually you have a function that is ~600 lines of code and a file that
> is over 1800 lines, compared to the COFF_x86_64 class that is just over 200
> lines for the entire file. There are positive and negative things about
> having large or small functions, but my personal choice would be a split -
> that's not to say that such a split ends up "near the top" of the priority
> list of "things to do to make LLVM better" - presumably the code works as
> it is, so changing it MAY break things.
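
The split suggested above can be sketched as follows. This is only an illustration of the two styles, not LLVM's code: Arch, describeMonolithic, ElfRelocator, X86_64Relocator, AArch64Relocator and makeRelocator are all hypothetical names, and the "relocation" work is reduced to building a string.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical names throughout; this is not LLVM's actual class layout.
enum class Arch { X86_64, AArch64 };

// Style 1: one function with a branch per architecture. Every new
// architecture makes this function longer.
std::string describeMonolithic(Arch A, uint32_t Type) {
  if (A == Arch::X86_64) {
    return "x86_64 relocation type " + std::to_string(Type);
  } else if (A == Arch::AArch64) {
    return "aarch64 relocation type " + std::to_string(Type);
  }
  return "unsupported";
}

// Style 2: one small class per architecture behind a common interface.
class ElfRelocator {
public:
  virtual ~ElfRelocator() = default;
  virtual std::string describe(uint32_t Type) const = 0;
};

class X86_64Relocator : public ElfRelocator {
public:
  std::string describe(uint32_t Type) const override {
    return "x86_64 relocation type " + std::to_string(Type);
  }
};

class AArch64Relocator : public ElfRelocator {
public:
  std::string describe(uint32_t Type) const override {
    return "aarch64 relocation type " + std::to_string(Type);
  }
};

// The architecture check happens once, when the object is created.
std::unique_ptr<ElfRelocator> makeRelocator(Arch A) {
  switch (A) {
  case Arch::X86_64:
    return std::unique_ptr<ElfRelocator>(new X86_64Relocator());
  case Arch::AArch64:
    return std::unique_ptr<ElfRelocator>(new AArch64Relocator());
  }
  assert(false && "unknown architecture");
  return nullptr;
}
```

Both styles produce the same result for the same input; the difference is only where the per-architecture code lives and how much of it a reader has to scan at once.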
>
> --
> Mats
>
> On 20 April 2017 at 15:14, Siddharth Shankar Swain <
> h2015096 at pilani.bits-pilani.ac.in> wrote:
>
>> Thanks for the reply. It was really helpful. So to be more specific, there
>> are processRelocationRef() and resolveRelocation() functions in
>> Targets/RuntimeDyld(objectfile format)(arch).h and also in
>> RuntimeDyldELF.cpp. What is the difference between these two, and what
>> different purposes do they serve?
>>
>> Thanks,
>> Siddharth
>>
>> On Thu, Apr 20, 2017 at 7:10 PM, mats petersson <mats at planetcatfish.com>
>> wrote:
>>
>>> Note: Please include the mailing list when replying to discussions, as
>>> someone else may well want to see the discussion, and may be better placed
>>> to answer.
>>>
>>> Like I've tried to explain, there is a generic piece of code that
>>> understands how to load code in general (the class RuntimeDyld and related
>>> bits), and then specific implementations that derive from a base class to
>>> do the specific relocation and exception handling for that particular
>>> hardware and file-format - for each supported processor architecture and
>>> file-format, there needs to be a specific class that implements some
>>> functions (processRelocationRef is one of those). Technically, it looks
>>> like it's using a "pImpl" pattern, but the basic principle is the same
>>> either way - generic code handles the generic case, a derived class that
>>> understands how to deal with the specifics is used to actually perform
>>> relocations in that particular case.
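
A minimal sketch of the arrangement described above: a public facade that forwards to a hidden implementation object ("pImpl"), where the implementation base class owns the generic loop and each derived class supplies the target-specific hook. Every name here (Reloc, LoaderImpl, TargetAImpl, TargetBImpl, Loader, Target) and the relocation type numbers are assumptions made for illustration; LLVM's real classes carry far more state.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, simplified names; not LLVM's actual declarations.
struct Reloc {
  uint64_t Offset; // where in the section the fix-up applies
  uint32_t Type;   // target-specific relocation type (made-up values here)
};

// Implementation base: generic walking logic plus one target-specific hook.
class LoaderImpl {
public:
  virtual ~LoaderImpl() = default;

  // Generic part: visit every relocation and let the target handle it.
  int loadObject(const std::vector<Reloc> &Relocs) {
    int Handled = 0;
    for (const Reloc &R : Relocs)
      if (processRelocation(R))
        ++Handled;
    return Handled;
  }

protected:
  // Target-specific part: returns true if the relocation type is understood.
  virtual bool processRelocation(const Reloc &R) = 0;
};

class TargetAImpl : public LoaderImpl {
  bool processRelocation(const Reloc &R) override { return R.Type == 1; }
};

class TargetBImpl : public LoaderImpl {
  bool processRelocation(const Reloc &R) override {
    return R.Type == 1 || R.Type == 2;
  }
};

enum class Target { A, B };

// Public facade: callers never see which implementation sits behind it.
class Loader {
public:
  explicit Loader(Target T) : Impl(create(T)) {
    assert(Impl && "no implementation for this target");
  }
  int loadObject(const std::vector<Reloc> &Relocs) {
    return Impl->loadObject(Relocs);
  }

private:
  static std::unique_ptr<LoaderImpl> create(Target T) {
    if (T == Target::A)
      return std::unique_ptr<LoaderImpl>(new TargetAImpl());
    return std::unique_ptr<LoaderImpl>(new TargetBImpl());
  }
  std::unique_ptr<LoaderImpl> Impl;
};
```

Given the same list of relocations, a Loader built for Target::A and one built for Target::B run the same generic loop but accept different relocation types, which is the division of labour the paragraph above describes.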
>>>
>>> Exception handling is also target-specific, so in x86-64 and i386, how
>>> exception information is stored and used is different (I don't know the
>>> exact details in this case as COFF is the file-format used on Windows, and
>>> it's been at least 8 or 10 years since I did any programming at all on a
>>> Windows machine; I know that i386 on Linux uses an exception table, and
>>> x86-64 on Linux essentially has debug information [DWARF tables]). The
>>> exception information is used to determine how to unwind the stack and
>>> destroy objects on the way back to the "catch" for that particular
>>> exception. There is code required both to load the exception tables into
>>> memory, and to interpret/use those tables - but I'm not overly familiar
>>> with how that works for JIT'd code. [Actually, looking at the code for
>>> x86-64, it looks like it's mainly SEH (Structured Exception Handling) that
>>> is dealt with - the overall concept still applies, but SEH is a Windows
>>> concept for handling exceptions, which includes hardware exceptions such as
>>> integer division by zero and memory access exceptions - regular C++
>>> exceptions are dealt with separately, and that is what uses what I
>>> described for Linux earlier in this paragraph].
>>>
>>> As to WHY different architectures use different relocations and
>>> exception handling tables, that's an ABI design issue - a convention that
>>> is based on the needs and requirements for each architecture, and a bunch
>>> of compromises between simplicity (a very simple table is easy to
>>> construct), space (simple table takes up more space than a more complex
>>> table construction - like a zip file or a text file - the zip file is more
>>> complicated to read, but takes up a lot less space) and code complexity
>>> (save space in table, more complex code most likely). Either way, for a
>>> given platform (OS, Processor, file format), there is a given ABI for
>>> handling exceptions. The loader needs to load the table in the correct way
>>> into the correct part of memory, and when an exception is thrown, the
>>> table(s) need to be understood and acted upon to find the way back to the
>>> relevant place where the exception is caught.
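
To make the "table needs to be understood and acted upon" step concrete, here is a deliberately simple sketch: a sorted table of address ranges that is searched for the entry covering a given program counter. The UnwindEntry layout, the HandlerId field and findEntry are hypothetical; real formats such as DWARF .eh_frame or the Windows exception tables encode much more and are laid out differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified table entry; not any real ABI's layout.
struct UnwindEntry {
  uint64_t Begin; // first address covered (inclusive)
  uint64_t End;   // one past the last address covered
  int HandlerId;  // stand-in for "how to unwind from here"
};

// Returns the entry covering PC, or nullptr if no entry covers it.
// Assumes Table is sorted by Begin and its ranges do not overlap.
const UnwindEntry *findEntry(const std::vector<UnwindEntry> &Table,
                             uint64_t PC) {
  assert(std::is_sorted(Table.begin(), Table.end(),
                        [](const UnwindEntry &L, const UnwindEntry &R) {
                          return L.Begin < R.Begin;
                        }));
  // Find the first entry that starts after PC; the candidate is the one
  // just before it.
  auto It = std::upper_bound(
      Table.begin(), Table.end(), PC,
      [](uint64_t Addr, const UnwindEntry &E) { return Addr < E.Begin; });
  if (It == Table.begin())
    return nullptr;
  --It;
  return PC < It->End ? &*It : nullptr;
}
```

A flat table like this is easy to build and search but stores every range explicitly, which is the simplicity-versus-space trade-off mentioned above; a more compact encoding would need more decoding code before the same lookup could run.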
>>>
>>> The fact that the classes are declared in different files is similar to
>>> my simple animal example, where you'd have an animal.h for the base class, a
>>> cat.h, dog.h and fish.h for the actual implementations. Obviously, the
>>> specific implementations for the RuntimeDyld belong in "Target" because
>>> they are dependent on the actual target (which is the combination of
>>> fileformat, OS and processor architecture).
>>>
>>> --
>>> Mats
>>>
>>>
>>> On 20 April 2017 at 14:04, Siddharth Shankar Swain <
>>> h2015096 at pilani.bits-pilani.ac.in> wrote:
>>>
>>>> So RuntimeDyldELF.cpp, RuntimeDyldCOFF.cpp and RuntimeDyldMachO.cpp
>>>> do the relocation and linking for a specific object file format and for
>>>> all architectures that use that format. Am I correct? If so, these .cpp
>>>> files do not use any header file in Targets/, so what are the header
>>>> files in Targets/ for? Also, why do these header files handle
>>>> exceptions, and with a different concept of exception frames and
>>>> exception tables? Please guide.
>>>> Thanks,
>>>> Siddharth
>>>>
>>>> On Thu, Apr 20, 2017 at 6:06 PM, mats petersson <mats at planetcatfish.com>
>>>> wrote:
>>>>
>>>>> Basic object-oriented design uses a derived class to implement the
>>>>> specific behaviour behind a generic interface. It's the same basic
>>>>> principle as:
>>>>>
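>>>>> #include <iostream>
>>>>> using namespace std;
>>>>>
>>>>> class Animal
>>>>> {
>>>>> public:
>>>>>     virtual ~Animal() = default;
>>>>>     virtual void Say() = 0;  // each derived class supplies its own sound
>>>>> };
>>>>>
>>>>> class Cat: public Animal
>>>>> {
>>>>> public:
>>>>>     void Say() override { cout << "Meow!" << endl; }
>>>>> };
>>>>>
>>>>> class Dog: public Animal
>>>>> {
>>>>> public:
>>>>>     void Say() override { cout << "Woof!" << endl; }
>>>>> };
>>>>>
>>>>> class Fish: public Animal
>>>>> {
>>>>> public:
>>>>>     void Say() override { cout << "Blub!" << endl; }
>>>>> };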
>>>>>
>>>>> In this case, different COFF architectures use different
>>>>> relocation entries, and based on the architecture, a specific
>>>>> implementation of the RuntimeDyldCOFF class is created to perform the
>>>>> relocation.
>>>>>
>>>>> See http://llvm.org/docs/doxygen/html/classllvm_1_1RuntimeDyldCOFF.html
>>>>> for a class diagram of how this is done.
>>>>>
>>>>> The generic code in RuntimeDyld*.cpp only knows that relocations
>>>>> exist, and that they need to be dealt with, not HOW to actually perform
>>>>> the relocation - just like "Animal" doesn't know what a cat or a dog
>>>>> "says". The processRelocationRef() is called here:
>>>>> http://llvm.org/docs/doxygen/html/RuntimeDyld_8cpp_source.html#l00251
>>>>>
>>>>> Again, it's not clear exactly what you are asking for, so I'm not sure
>>>>> whether my explanation is helpful or not...
>>>>>
>>>>> --
>>>>> Mats
>>>>>
>>>>>
>>>>> On 20 April 2017 at 12:05, Siddharth Shankar Swain <
>>>>> h2015096 at pilani.bits-pilani.ac.in> wrote:
>>>>>
>>>>>> Thanks for the reply. I was asking in general: the header files in
>>>>>> Targets/ for the different architectures do not provide any function
>>>>>> other than processRelocationRef() that is used in RuntimeDyldELF.cpp,
>>>>>> RuntimeDyldCOFF.cpp or RuntimeDyldMachO.cpp, and I think those .cpp
>>>>>> files are the ones actually doing the relocation and linking work. So
>>>>>> what purpose do the header files inside Targets/ actually serve? They
>>>>>> also include exception handling in the form of exception frames. Can
>>>>>> you guide me on this?
>>>>>>
>>>>>> Thanks,
>>>>>> Siddharth
>>>>>>
>>>>>> On Thu, Apr 20, 2017 at 4:02 PM, mats petersson <
>>>>>> mats at planetcatfish.com> wrote:
>>>>>>
>>>>>>> The x86_64 and i386 architectures have different actual relocation
>>>>>>> records. So if you build code for i386, you need one processRelocationRef()
>>>>>>> function (handling the relevant relocations in that model), and when
>>>>>>> producing code for x86_64, there are different relocation records. The two
>>>>>>> files contain the derived form of the class that processes the relocation
>>>>>>> records when dynamically loading JITed code in LLVM - mainly implementing
>>>>>>> the two different forms of symbol entries that refer to the relocations -
>>>>>>> i386 uses COFF::IMAGE_REL_I386_*, in x86_64 the relocation types are
>>>>>>> COFF::IMAGE_REL_AMD64_*.
>>>>>>>
>>>>>>> Conceptually, they do the same thing; it's the details of exactly
>>>>>>> how and where the relocation ends up and how it's recorded by the linker
>>>>>>> that differ.
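
As a rough illustration of "same concept, different details", the sketch below patches an absolute address into a buffer for one relocation type from each family. The two type values are the ones listed for these names in the PE/COFF specification, redefined locally so the example does not depend on LLVM headers. The helper functions applyI386 and applyX86_64 are hypothetical, handle only a single type each, and ignore any addend already stored at the target location.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Values as given in the PE/COFF specification, redefined locally.
constexpr uint16_t IMAGE_REL_I386_DIR32 = 0x0006;   // 32-bit absolute address
constexpr uint16_t IMAGE_REL_AMD64_ADDR64 = 0x0001; // 64-bit absolute address

// Hypothetical helper: apply an i386 relocation by writing 4 bytes.
// memcpy stores in host byte order, so this matches the target layout only
// on a little-endian host.
bool applyI386(uint8_t *Target, uint16_t Type, uint64_t SymbolAddr) {
  if (Type != IMAGE_REL_I386_DIR32)
    return false; // type not handled by this sketch
  assert(SymbolAddr <= UINT32_MAX && "address must fit in 32 bits");
  uint32_t Value = static_cast<uint32_t>(SymbolAddr);
  std::memcpy(Target, &Value, sizeof Value);
  return true;
}

// Hypothetical helper: apply an x86_64 relocation by writing 8 bytes.
bool applyX86_64(uint8_t *Target, uint16_t Type, uint64_t SymbolAddr) {
  if (Type != IMAGE_REL_AMD64_ADDR64)
    return false; // type not handled by this sketch
  std::memcpy(Target, &SymbolAddr, sizeof SymbolAddr);
  return true;
}
```

Both helpers do the same job (put the symbol's final address where the code expects it), but they accept different type codes and write a different number of bytes, which is why each architecture gets its own implementation.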
>>>>>>>
>>>>>>> Theoretically, one could probably construct a loadable file that
>>>>>>> doesn't care what architecture it is for, but it would end up with a lot of
>>>>>>> redundant & overlapping functionality, and the code to handle every
>>>>>>> different architecture in one huge switch-statement would be rather complex
>>>>>>> (and long!). So splitting the functionality per architecture helps make the
>>>>>>> code clear.
>>>>>>>
>>>>>>> If you need further help to understand the code, you'll probably
>>>>>>> need to ask a more concrete question, as it is probably not possible to
>>>>>>> describe all the relevant information on this subject in less than 200
>>>>>>> pages, never mind a simple email-thread.
>>>>>>>
>>>>>>> --
>>>>>>> Mats
>>>>>>>
>>>>>>> On 20 April 2017 at 11:13, Siddharth Shankar Swain via llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Can anyone explain the header files in
>>>>>>>> lib/ExecutionEngine/RuntimeDyld/Targets/ for the different
>>>>>>>> architectures, such as RuntimeDyldCOFFX86_64.h or
>>>>>>>> RuntimeDyldCOFFI386.h? How are these files connected to relocation
>>>>>>>> and linking, given that linking and relocation for the different
>>>>>>>> architectures is done in RuntimeDyldELF.cpp and RuntimeDyldCOFF.cpp,
>>>>>>>> which use no function from these header files except
>>>>>>>> processRelocationRef()? The header files in Targets/ also handle
>>>>>>>> exceptions, so why is that needed in the relocation and linking
>>>>>>>> process? Also, what does processRelocationRef() actually do?
>>>>>>>> Please guide.
>>>>>>>>
>>>>>>>> sincerely,
>>>>>>>> Siddharth
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev