[llvm-dev] RFC: Trace-based layout.
Kyle Butt via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 14 18:53:07 PDT 2017
I plan on rewriting the block placement algorithm to proceed by traces.
A trace is a chain of blocks where each block in the chain may fall through
the successor in the chain.
The overall algorithm would be to first produce traces for a function, and
order those traces to try and get cache locality.
Currently block placement uses a greedy single step approach to layout. It
produces chains working from inner to outer loops. Unlike a trace, a chain
contain non-fallthrough edges. This causes problems with loop layout. The
problems with loop layout are: loop rotation and cold blocks in a loop.
Overview of proposed solution:
Greedily produce a set of traces through the function. A trace is a list of
blocks with each block in the list falling through (possibly conditionally)
the next block in the list. Loop rotation will occur naturally in this
the triangle replacement algorithm below. Handling single trace loops
tweak, see the detailed design.
After producing what we believe are the best traces, they need to be
They will be ordered topologically, except that traces that are cold enough
measured by their warmest block) will be floated later, This may push them
of a loop or to the end of the function.
Note whenever an edge is used as a number, I am referring to the edge
Phase 1: Producing traces
Traces are produced according to the following algorithm:
* Sort the edges according to weight, stable-sorting them according the
block and edge ordering.
* Place each block in a trace of length 1.
* For each edge in order:
* If the source is at the end of a trace, and the target is at the
of a trace, glue those 2 traces into 1 longer trace.
* If an edge has a target or source in the middle of another trace,
tail duplication. The benefit calculation is the same as the existing
* If an edge has a source or target in the middle, check them to see if
can be replaced as a triangle. (Triangle replacement described below)
* Compare the benefit of choosing the edge, along with any triangles
found, with the cost of breaking the existing edges.
* If it is a net benefit, perform the switch.
* Triangle checking:
Consider a trace in 2 parts: A1->A2, and the current edge under
is A1->B (the case for C->A2 is mirror, and both may need to be done)
* First find the best alternative C->B
* Check for an alternative for A2: D->A2
* Find D's best Alternative: D->E
* Compare the frequencies: A1->A2 + C->B + D->E vs A1->B + D->A2
* If the 2nd sum is bigger, do the switch.
* Loop Rotation Tweak:
If A contains a backedge A2->A1, then when considering A1->B or C->A2,
can include that backedge in the gain:
A1->A2 + C->D + E->B vs A1->B + C->A2 + A2->A
Phase 2: Order traces.
First we compute the frequency of a trace by finding the max frequency of
Then we attempt to place the traces topologically. When a trace cannot be
topologically, we prefer warmer traces first.
Questions and comments welcome.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-dev