[LLVMdev] Adding support to LLVM for data & code layout (needed by GHC)

Tue Jun 15 15:08:57 PDT 2010

Subsections are a very good idea. You can even do without
post-processing by using carefully crafted section names, e.g.

__attribute__((section(".text,\"ax\",@progbits\n\t.subsection 1 #")))
void foo()
{
}

(Note that you need ".subsection n" commands on ELF targets and
".section name, n" commands on COFF targets; it seems that the latter was
supported on all targets in old versions of gas, but no longer is.)
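
To make it clearer why the crafted name works, here is a rough sketch in
Python that models what ends up in the assembly. It assumes the compiler
writes the section name verbatim into a directive of the form
'.section <name>,"ax",@progbits'; that shape and both function names are
assumptions made for illustration, not something taken from a compiler's
source. It also relies on '#' starting a comment, which holds for gas on
x86 but may not hold on every target.

```python
def crafted_section_name(subsection: int) -> str:
    """Section name for the attribute trick on ELF targets (hypothetical helper)."""
    # The embedded newline ends the compiler's own .section line early;
    # the trailing '#' comments out whatever the compiler appends afterwards.
    return f'.text,"ax",@progbits\n\t.subsection {subsection} #'


def emitted_directive(section_name: str) -> str:
    """Assumed shape of the directive a compiler emits for a named section."""
    return f'\t.section\t{section_name},"ax",@progbits'


lines = emitted_directive(crafted_section_name(1)).split("\n")
for line in lines:
    print(line)
```

Under those assumptions the first emitted line is an ordinary .section
directive for .text, and the second is the .subsection directive followed
by a comment that swallows the flags the compiler appended.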

Eugene

On Tue, Jun 15, 2010 at 2:18 PM, David Terei <davidterei at gmail.com> wrote:
> Hi all,
>
> Just wanted to report that I've found a second way to achieve
> data/code layout (the first being the linker script that Eugene
> mentioned).
>
> The key is that gnu as supports a feature called subsections.
>
> http://sourceware.org/binutils/docs-2.20/as/Sub_002dSections.html#Sub_002dSections
>
> The way this works is that you can put stuff into a section like
> '.text 2', where 2 is a subsection of .text. When run, 'as' orders the
> subsections. So all you need to do is arrange for the sidetable to be
> in section '.text n' and the code in section '.text n+1'. Each
> sidetable and its code goes in its own subsection. The nice thing is,
> this is purely a gnu as feature. When it compiles the assembly to
> object code, the subsections aren't present in the object code, so you
> don't get hundreds of sections that take up space and slow down linking.
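
As a toy model of the ordering described above, the following Python
sketch groups lines by subsection number and lays them out in ascending
order, keeping source order within each subsection. The function name
and the labels are made up for illustration; this only mimics the
documented ordering rule and is not how gas is implemented.

```python
from collections import defaultdict


def assemble_text(chunks):
    """Lay out (subsection, line) pairs the way gas orders subsections."""
    by_subsection = defaultdict(list)
    for number, line in chunks:
        by_subsection[number].append(line)
    # Lowest-numbered subsection first; source order kept inside each one.
    return [line for number in sorted(by_subsection)
            for line in by_subsection[number]]


# Side tables go in subsection n, the matching code in subsection n+1,
# regardless of the order in which they appear in the source.
source = [
    (2, "f_entry:"),
    (4, "g_entry:"),
    (1, "f_table:"),
    (3, "g_table:"),
]
layout = assemble_text(source)
print(layout)
```

Each table ends up directly in front of its own code, which is the
layout GHC needs.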
>
> There is one complication though. LLVM (and GCC as well) don't support
> subsections. While you can define what section globals and functions
> are in, this doesn't extend to specifying the subsection. If you tell
> LLVM to put function f in section "text 12", it produces assembly like:
>
> .section text 12,"rw" @progbits
> f:
>  [..]
>
> This causes gas to report a syntax error. Gas only allows subsections
> to be used through a very specific syntax, so it needs to be:
>
> .text 12
> f:
>  [...]
>
> We can convert between them though with just a simple regex.
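
A minimal sketch of such a rewrite in Python, for illustration only. The
pattern is an assumption based on the single '.section text 12,...'
line quoted above (it also accepts an optional leading dot on the
section name); it is not the regex GHC actually uses, and the function
name is made up.

```python
import re

# Assumed input shape: ".section text N,<flags>" on a line of its own.
SECTION_RE = re.compile(
    r'^([ \t]*)\.section[ \t]+\.?text[ \t]+(\d+)\b.*$',
    re.MULTILINE,
)


def fix_subsections(asm: str) -> str:
    """Rewrite '.section text N,...' lines into the '.text N' form gas accepts."""
    # Keep the original indentation (group 1) and the subsection number (group 2).
    return SECTION_RE.sub(r'\1.text \2', asm)


sample = '.section text 12,"rw" @progbits\nf:\n'
print(fix_subsections(sample))
```

Lines that do not match the pattern, such as other .section directives,
pass through unchanged.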
>
> We are going to use this approach for the moment in GHC; we've tested
> it and it's working great so far. I prefer this method over the linker
> script as implementing the linker script approach would affect all the
> backends GHC supports while this approach is contained to the LLVM
> backend.
>
> I'm still planning on adding support to LLVM for side tables in some
> manner so we can just depend on pure LLVM.
>
> Cheers,
> David
>
> On 10 June 2010 18:08, Andrew Lenharth <andrewl at lenharth.org> wrote:
>> On Thu, Jun 10, 2010 at 11:34 AM, David Terei <davidterei at gmail.com> wrote:
>>> It's good to see that a feature of this nature would be useful to a
>>> whole range of people, I wasn't aware of that.
>>>
>>> On 9 June 2010 22:40, Andrew Lenharth <andrewl at lenharth.org> wrote:
>>>> My argument amounts to expressing side tables as side tables in the IR
>>>> rather than as an ordering on globals.  I think that would simplify
>>>> the backend (a side table is something you discover from the function
>>>> rather than having to check another global).  Also, if well specified,
>>>> I think you could allow basic block labels into structures which makes
>>>> them more interesting for other uses.
>>>
>>> Sure. I wasn't set on the third approach I suggested, which is to have
>>> them expressed as side tables in the IR, as I didn't realise other
>>> users would be interested in them, so I didn't think it would be
>>> appropriate to add new language constructs for one user. I don't think
>>> it would be simpler to implement in the backend though, and this
>>> approach would need changes to the frontend, so a lot more work.
>>
>> The backend already can sort of do this with the GCMetadataPrinter.
>> Generalizing that to arbitrary side tables might be easier than adding
>> a new construct (granted sidetables might not replace the ability to
>> output assembly by that class, but they might do a lot of the heavy
>> lifting).  Since GC lowering happens on the IR level (from the docs I
>> looked at, I haven't personally dealt with GC yet), it may be possible
>> to do a lot of lowering to generalized tables rather than complex
>> GCMetadataPrinter implementations.  This is just speculation on my
>> part though.  This is one of the reasons I thought labels in the
>> constant structs could be handy.  Perhaps a general side table
>> representation in the backend could be used by EH too?
>>
>> Andrew
>>
>>> What I am hoping someone can answer, though, is what issues there
>>> may be if the second approach were taken (using the special glob
>>> var). For example, would the optimiser be tempted at some point to
>>> replace a load from an unknown address, created by a negative offset
>>> from a function, with unreachable, as Eugene suggested may be
>>> possible?
>>>
>>> Also, what are you gaining by going with the third approach? I guess
>>> the optimiser could do things like constant propagation using the
>>> third approach but not the second, although I think that's unlikely to
>>> give much benefit in the kind of code GHC produces. But there is
>>> everyone else to think of :).
>>>
>>> Thanks for all the responses though, I'm going to start playing around
>>> with some code and see what happens.
>>>
>>
>