[LLVMdev] [RFC] CodeGen Context

Bill Wendling isanbard at gmail.com
Sat Oct 12 01:55:00 PDT 2013


Hi all,

This is my proposal for how to solve the problem we have with function attributes that affect code generation changing between functions. (This is mostly a problem for LTO.)

Please take a look at this proposal, and let me know if you have any questions or comments.

Cheers!

-bw


                           CodeGen Context
                           ===============

The back-end's objects are currently generated once with a set of options handed
to it by the front-end. These options are not expected to change throughout the
lifetime of the back-end. With the advent of extended function attributes, this
is no longer a correct assumption. During LTO for instance, a function's
attributes may change how the back-end should generate code for that function.
For example, in this code `@foo' keeps its frame pointer, while `@bar' allows
the frame pointer to be eliminated:

 define void @foo() "no-frame-pointer-elim"="true"  { ret void }
 define void @bar() "no-frame-pointer-elim"="false" { ret void }

Of course, this is a very simple example. Other options affect the construction
of the back-end objects themselves (e.g., `use-soft-float').
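
To make the per-function lookup concrete, here is a minimal standalone sketch.
`FunctionModel' and `disablesFramePointerElim' are hypothetical names invented
for illustration, not LLVM APIs, and treating a missing attribute as "false" is
an assumption of the sketch:

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a function and its string attributes
// (not the real LLVM Function/Attribute classes).
struct FunctionModel {
  std::string Name;
  std::map<std::string, std::string> Attrs;
};

// Returns true if the function asks to keep its frame pointer, i.e. if
// "no-frame-pointer-elim" is present and set to "true". A missing attribute
// is treated as "false" here; that default is an assumption of this sketch.
bool disablesFramePointerElim(const FunctionModel &F) {
  auto It = F.Attrs.find("no-frame-pointer-elim");
  return It != F.Attrs.end() && It->second == "true";
}
```

The point is only that the answer comes from the function being compiled rather
than from a value fixed when the back-end was created.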

--------------------------------------------------------------------------------

Before we get further, here are a few definitions used in this document:

Back-end Objects ::

Objects that affect code generation --- e.g., TargetInstrInfo,
TargetFrameLowering, DataLayout, etc.

CGContext ::

A central repository for back-end objects. The back-end objects may change, so
they should not be "cached" by individual passes. This is analogous to the
current TargetMachine object. The term "CGContext" is used because it
separates the current implementation from the "ideal" implementation.

Important Options ::

Those options which affect back-end object construction.

--------------------------------------------------------------------------------

So, the back-end has to be prepared for "important options" to change. The ideal
solution would be for the back-end to query the CGContext any time it needs
information on how to generate code.  Unfortunately, this isn't currently
feasible, because of how back-end objects are constructed, though it is
something worth striving for. As such, there are four goals we want to achieve:

1. As many options as possible should be queried via the back-end directly
  rather than relying upon objects holding onto these options,

2. Those which affect how objects are generated require those objects to be
  regenerated when the important options change,

3. There is no more dependence upon IR-level code. I.e., the back-end would
  still function if the IR code were deleted, and

4. The design does not prevent the back-end from being parallelized.

Some things to note:

* Recreating the back-end for each changing set of important options is
 expensive. A simple test showed that there is a measurable slowdown in the
 worst-case scenario where the back-end is recreated for every function.

* Object creation in the back-end has a high degree of coupling. I.e., one
 object creates another object, which uses the original object, and may
 create other objects dependent upon previous objects, etc.

* Most functions should have the same set of important options, thus reducing
 the need to regenerate the back-end objects for each function.

* Some objects are created on demand, and may change during code generation.

This is a simple model of how command line options and function attributes will
be passed through the compiler from the front-end to the middle-end and finally
to the back-end:

The front-end generates the functions with appropriate function attributes taken
from command line options. Because the front-end may be dealing with IR files
and the command line options that are currently used may be different from those
the function was generated with, the front-end will create an "OptionContext"
object. Options specified by function attributes may be overridden by options
specified in the OptionContext. These are used as IR options by the middle
end. A suitable API will be set up to make this transparent to the middle end
*waves hands wildly*.
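
One possible reading of the override rule, as a standalone sketch. The
`OptionContextModel' type and `resolveOptions' function are hypothetical and
only model the precedence described above (OptionContext entries win over
function attributes); the real API is still to be designed:

```cpp
#include <map>
#include <string>

using OptionMap = std::map<std::string, std::string>;

// Hypothetical model of the front-end's OptionContext: options given on the
// current command line, which take precedence over function attributes.
struct OptionContextModel {
  OptionMap Overrides;
};

// Computes the effective options for one function: start from the function's
// own attributes, then let any OptionContext entry replace the matching one.
OptionMap resolveOptions(const OptionMap &FnAttrs,
                         const OptionContextModel &Ctx) {
  OptionMap Result = FnAttrs;
  for (const auto &KV : Ctx.Overrides)
    Result[KV.first] = KV.second;
  return Result;
}
```

Attributes that the OptionContext does not mention pass through unchanged.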

The function attributes and options context are used to generate the CGContext.
All IR passes that need to know about target data, as well as the
code-generation passes, will query the CGContext for all information needed to
construct the back-end.
When important options change (based on a new function's attributes), the
context can transparently reconstruct the objects that are affected. To minimize
time spent recreating the back-end objects, they can be cached.
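
The caching idea can be sketched as a map keyed on the important options. All
names below (`ImportantOptions', `BackendObjects', `CGContextModel') are
hypothetical, and the two option fields are chosen only for illustration; this
is a sketch under those assumptions, not the proposed implementation:

```cpp
#include <map>
#include <memory>
#include <string>
#include <tuple>

// Hypothetical subset of the "important options": those that affect how
// back-end objects are constructed. The fields are illustrative only.
struct ImportantOptions {
  std::string CPU;
  bool UseSoftFloat;
  bool operator<(const ImportantOptions &O) const {
    return std::tie(CPU, UseSoftFloat) < std::tie(O.CPU, O.UseSoftFloat);
  }
};

// Stand-in for an expensive bundle of back-end objects.
struct BackendObjects {
  ImportantOptions Opts;
};

// Hypothetical context that builds back-end objects on demand and reuses
// them for every function with the same important options.
class CGContextModel {
  std::map<ImportantOptions, std::unique_ptr<BackendObjects>> Cache;
  unsigned NumBuilt = 0;

public:
  // Callers re-query per function and do not hold on to the result.
  const BackendObjects &getObjectsFor(const ImportantOptions &Opts) {
    std::unique_ptr<BackendObjects> &Slot = Cache[Opts];
    if (!Slot) {
      Slot.reset(new BackendObjects{Opts});
      ++NumBuilt;
    }
    return *Slot;
  }
  unsigned numBuilt() const { return NumBuilt; }
};
```

If most functions share one set of important options, the objects are built
once and every later lookup is a map hit. A parallel back-end would need to
guard or partition such a cache; the sketch ignores that.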

Have some ASCII art:

              ,---------------.
          ::  | OptionContext | --.
          |   `---------------'   |   ,------------.
Front End |                       |-->| IR Options |   :: Middle End
          |  ,----------------.   |   `------------'
          :: | Function Attrs |---+-.
             `----------------'     |    ,-----------.
                                    `--> | CGContext | :: Back End
                                         `-----------'

The CGContext will transparently recreate any objects it needs to. This means
that back-end code won't be able to cache any of the objects the CGContext
creates (this has already been addressed).

The CGContext can be reached through the MachineFunction object:

 // `context' is a reference, so its members are accessed with `.'.
 CGContext &context = MF->getContext();
 const TargetFrameLowering *TFL = context.getFrameLowering();

 if (TFL->getStackGrowthDirection() == TargetFrameLowering::StackGrowsUp) {
   // ...
 }
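
To illustrate why passes should re-query rather than cache, here is a small
standalone model. `FrameLoweringModel', `ContextModel', and `FramePass' are
hypothetical stand-ins (not the LLVM classes), and `setKeepFramePointer' is an
invented hook representing "a new function's attributes were processed":

```cpp
#include <memory>

// Hypothetical stand-ins; these are not the real LLVM classes.
struct FrameLoweringModel {
  bool KeepFramePointer;
};

class ContextModel {
  std::unique_ptr<FrameLoweringModel> TFL;

public:
  // Called when a new function's attributes are processed; replaces the
  // frame-lowering object if the relevant option changed.
  void setKeepFramePointer(bool Keep) {
    if (!TFL || TFL->KeepFramePointer != Keep)
      TFL.reset(new FrameLoweringModel{Keep});
  }
  const FrameLoweringModel *getFrameLowering() const { return TFL.get(); }
};

// A pass that asks the context every time it runs, so it never observes an
// object left over from a previous function.
struct FramePass {
  bool runOnFunction(const ContextModel &Ctx) const {
    const FrameLoweringModel *TFL = Ctx.getFrameLowering();
    return TFL->KeepFramePointer;
  }
};
```

Had `FramePass' stored the pointer in a member on its first run, that pointer
would dangle as soon as the context replaced the object.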

Currently, the best place to process the function attributes is towards the
beginning of the `SelectionDAGISel::runOnMachineFunction()' method. This has one
side-effect --- the CGContext may not be available to IR passes which use
it. This will need to be addressed on a case-by-case basis. One option is to
have the pass manager populate the CGContext at the point in the pipeline where
we begin lowering.


