[llvm-commits] [llvm] r117174 - in /llvm/trunk: include/llvm/CodeGen/LinkAllCodegenComponents.h include/llvm/CodeGen/Passes.h lib/CodeGen/CMakeLists.txt lib/CodeGen/LiveIntervalUnion.cpp lib/CodeGen/LiveIntervalUnion.h lib/CodeGen/RegAllocBase.h lib/CodeGen/RegAllocBasic.cpp lib/CodeGen/SplitKit.h

Tue Oct 26 13:44:19 PDT 2010

On Oct 26, 2010, at 1:03 PM, Evan Cheng wrote:

> Thanks Andy. It looks like a great step. Some questions and comments below:
> 
> 1. How to write a new allocator based on RegAllocBase? Is it as simple as providing a allocation queue priority function and selectOrSplit? How about routines to undo an allocation?

RegAllocBasic is a minimal implementation of RegAllocBase. Maybe it should be called RegAllocMinimal? I could split the .cpp file to make this more clear.

The only thing a new allocator must do is override selectOrSplit(). You can see from RABasic::selectOrSplit, that not much needs to be implemented to have a working allocator.

We do need a way to undo an allocation for functional completeness. (RABasic currently passes most but not all benchmarks.) My next step will be to implement LiveIntervalUnion::extract and call it in the uncommon case that we can't find a register for spills.

> 2. Does a allocator need to provide its own allocatePhysRegs? Should RegAllocBase provide a base implementation?

No. allocatePhysRegs never changes. Any allocator can be plugged in as along it it can solve the global allocation by adhering to two constraints:

1. It must be incremental vs. iterative (Think linear scan vs. graph coloring.) 

The underlying assumption is that we allocate or split one virtual register at a time until the problem is solved. We do not partially solve the problem, mutate the IR, throw away the solution and retry.

2. It must be able to express the desired order of allocation as a static priority.

Currently we assume spill weight can capture the priority. A new allocator can override priority by changing spill weight calculation without otherwise notifying RegAllocBase.

I imagine linear scan could be reimplemented with this framework. However silly and pointless that would be.

> 3. It would be nice if the new allocators do not have to update LiveStacks but instead compute it after the allocation is done. Do you think it's possible?

I'm not planning to update LiveStacks in regalloc because Jakob plans to add it to the spiller.

In general, an allocator implementation will have the choice of updating or recomputing it. Beyond that, I haven't thought much about the tradeoffs.

> 4. What's your plan for more efficient LiveSegments implementation?

I'll let Jakob answer that. I haven't thought of anything better than a B-btree. 

> 5. What's next? :-)

- RABasic handling trivial register reassignment
- RAGreedy implementation (adding simple reassignment and splitting heuristics)
- Evaluation of compile-time vs. code quality tradeoffs (what knobs should we give the user?)

-Andy

> On Oct 26, 2010, at 11:37 AM, Andrew Trick wrote:
> 
>> On Oct 25, 2010, at 2:58 PM, Jakob Stoklund Olesen wrote:
>>> Please try to limit the header file dependencies.
>>> 
>>> - When passing a SmallVector by reference, use SmallVectorImpl, see Spiller.h:
>>> 
>>> virtual unsigned selectOrSplit(LiveInterval &lvr,
>>>                              SmallVectorImpl<LiveInterval*> &splitLVRs) = 0;
>>> 
>>> - The rest of the unnecessary headers are pulled in by the premature abstraction of making allocatePhysRegs templated on the comparator.
>>> 
>>> Please drop the template, it will clean up the code significantly when you move the implementation into the .cpp file.
>>> 
>>> When/if we need the abstraction, we can use virtual functions instead:
>>> 
>>> struct PriQueue {
>>>   virtual void seed(LiveIntervals&) =0;
>>>   virtual LiveInterval *get() =0;
>>> };
>>> 
>>> Since we will be doing significant work on each register, avoiding a virtual call is not worth the template ick.
>> 
>> Agreed, and done.
>> 
>>>> +#include "llvm/CodeGen/LiveInterval.h"
>>> 
>>> Do you need this header? It looks like SlotIndexes.h is enough.
>> 
>> We could do that by moving certain inline functions, like overlap() into LiveIntervalUnion.cpp. But it only ever makes sense for LiveIntervalUnion users to include LiveInterval, so is it worth scattering the methods?
>> 
>>>> +struct LiveSegment {
>>>> +  bool operator<(const LiveSegment &ls) const {
>>>> +    return start < ls.start || (start == ls.start && end < ls.end);
>>>> +  }
>>> 
>>> I know that struct LiveRange does it too, but why the lexicographical compare? We are only dealing with disjoint segments, right?
>>> 
>>> This operator will be quite performance sensitive.
>> 
>> I originally implemented without making the disjoint assumption unless we needed to. But in hindsight I totally agree.
>> 
>>>> +/// Compare a live virtual register segment to a LiveIntervalUnion segment.
>>>> +inline bool overlap(const LiveRange &lvrSeg, const LiveSegment &liuSeg) {
>>>> +  return lvrSeg.start < liuSeg.end && liuSeg.start < lvrSeg.end;
>>>> +}
>>> 
>>> Please add a note that LiveSegment represents a half-open interval. I am actually planning on switching to closed intervals - it is more natural for callers since the low bits of SlotIndex are symbolic.
>> 
>> OK
>> 
>>>> +class LiveIntervalUnion {
>>>> +public:
>>>> +  // default ctor avoids placement new
>>>> +  LiveIntervalUnion() : repReg_(0) {}
>>>> +  
>>>> +  void init(unsigned repReg) { repReg_ = repReg; }
>>>> +
>>>> +  SegmentIter begin() { return segments_.begin(); }
>>>> +  SegmentIter end() { return segments_.end(); }
>>>> +
>>>> +  /// FIXME: !!!!!!!!!!! Keeps a non-const ref
>>>> +  void unify(LiveInterval &lvr);
>>>> +
>>>> +  // FIXME: needed by RegAllocGreedy
>>>> +  //void extract(const LiveInterval &li);
>>> 
>>> Please add comments to public functions.
>>> 
>>> I feel your pain on storing non-const references, but just add a comment promising not to change anything.
>> 
>> Oops. Meant to do that.
>> 
>>>> Added: llvm/trunk/lib/CodeGen/LiveIntervalUnion.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/LiveIntervalUnion.cpp?rev=117174&view=auto
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/CodeGen/LiveIntervalUnion.cpp (added)
>>>> +++ llvm/trunk/lib/CodeGen/LiveIntervalUnion.cpp Fri Oct 22 18:09:15 2010
>>>> @@ -0,0 +1,167 @@
>>>> +// Merge a LiveInterval's segments. Guarantee no overlaps.
>>>> +void LiveIntervalUnion::unify(LiveInterval &lvr) {
>>>> +  // Add this live virtual register to the union
>>>> +  LiveVirtRegs::iterator pos = std::upper_bound(lvrs_.begin(), lvrs_.end(),
>>>> +                                                &lvr, less_ptr<LiveInterval>());
>>>> +  assert(pos == lvrs_.end() || *pos != &lvr && "duplicate LVR insertion");
>>> 
>>> You should be even stricter here, and assert no overlaps.
>> 
>> Yes, meant to do it. Please rereview that code now.
>> 
>>>> +  lvrs_.insert(pos, &lvr);
>>>> +  // Insert each of the virtual register's live segments into the map
>>>> +  SegmentIter segPos = segments_.begin();
>>>> +  for (LiveInterval::iterator lvrI = lvr.begin(), lvrEnd = lvr.end();
>>>> +       lvrI != lvrEnd; ++lvrI ) {
>>>> +    LiveSegment segment(lvrI->start, lvrI->end, lvr);
>>>> +    segPos = segments_.insert(segPos, segment);
>>>> +    assert(*segPos == segment && "need equal val for equal key");
>>>> +  }
>>> 
>>> You could save some space by coalescing touching segments here, but that can wait until we get a 'real' data structure.
>> 
>> Comment added. Of course, LVR extraction has to handle it too then.
>> 
>>>> +typedef LiveIntervalUnion::SegmentIter SegmentIter;
>>>> +SegmentIter upperBound(SegmentIter segBegin,
>>>> +                       SegmentIter segEnd,
>>>> +                       const LiveRange &lvrSeg)
>>> 
>>> Please use anonymous namespaces only for types. Functions should just be declared static.
>> 
>> Yes. Forgot that rule.
>> 
>>>> Added: llvm/trunk/lib/CodeGen/RegAllocBasic.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/RegAllocBasic.cpp?rev=117174&view=auto
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/CodeGen/RegAllocBasic.cpp (added)
>>>> +++ llvm/trunk/lib/CodeGen/RegAllocBasic.cpp Fri Oct 22 18:09:15 2010
>>>> @@ -0,0 +1,259 @@
>>> 
>>>> +class RABasic : public MachineFunctionPass, public RegAllocBase
>>> 
>>> Instead of using multiple inheritance, would it make sense for RegAllocBase to be a MachineFunctionPass? It seems that any register allocator using RegAllocBase would need the same pass initialization code and so on.
>> 
>> I was also tempted to make RegAlloc an abstract MachineFunctionPass. In fact I originally did that, and it was a bit easier write the code. But I want to give you some time to reconsider, because I don't think that design follows what you originally asked for: an implementation of the basic allocation pass that would never change and serve as a clean reference point and boilerplate code for extensions. It will be easier for others to understand the design and how to extend it if we keep RegAllocBase pure so that it only handles logic fundamental to the register allocation data structures and algorithms. Each regalloc subclass is a different type of pass, so it is responsible for all pass-related implementation. If RegAllocBase is derived from MachineFunctionPass then it will implement some of the pass setup and dependencies and the subclasses will each have to figure out what's left over and add any new dependencies--I think that's bad design and leads to scattered implement!
>> ation.
>> 
>> I would not violate good design principles just to avoid multiple inheritance. I think that's a separate issue. Multiple inheritance was just an easy shortcut to avoid defining another class (e.g. RegAllocBasicImpl + RegAllocBasicPass), but I would be happy to do that if you think it's worthwhile.
>> 
>>>> +void RABasic::getAnalysisUsage(AnalysisUsage &au) const {
>>>> +  au.setPreservesCFG();
>>>> +  au.addRequired<LiveIntervals>();
>>>> +  au.addPreserved<SlotIndexes>();
>>>> +  if (StrongPHIElim)
>>>> +    au.addRequiredID(StrongPHIEliminationID);
>>>> +  au.addRequiredTransitive<RegisterCoalescer>();
>>>> +  au.addRequired<CalculateSpillWeights>();
>>>> +  au.addRequired<LiveStacks>();
>>>> +  au.addPreserved<LiveStacks>();
>>>> +  au.addRequired<MachineLoopInfo>();
>>>> +  au.addPreserved<MachineLoopInfo>();
>>>> +  au.addRequired<VirtRegMap>();
>>>> +  au.addPreserved<VirtRegMap>();
>>>> +  DEBUG(au.addRequired<RenderMachineFunction>());
>>>> +  MachineFunctionPass::getAnalysisUsage(au);
>>>> +}
>>> 
>>> This would probably be the same for all register allocators using RegAllocBase.
>> 
>> I think the highest priority is to list all dependencies for a pass in one place for readability. I'm assuming that the common dependencies will hardly change, and if they do it's trivial to correct all passes, but that each pass (subclass) will add dependencies. This may be a rare case in which it's actually desirable to copy the common code, rather than scatter the implementation. If it's really important to factor pass setup, then the same reasoning should apply to all register allocators.
>> 
>>>> +void RABasic::releaseMemory() {
>>>> +  spiller_.reset(0);
>>>> +  RegAllocBase::releaseMemory();
>>>> +}
>>> 
>>> Since register allocation is not an analysis used by other passes, you might as well release all memory at the bottom of runOnMachineFunction().
>> 
>> Yes.
>> 
>> -Andy
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20101026/0d595b6f/attachment.html>