[llvm-commits] Speeding up RegAllocLinearScan on big test-cases

Roman Levenstein romix.llvm at googlemail.com
Tue May 6 06:13:57 PDT 2008


Hi Evan,

2008/5/5 Evan Cheng <evan.cheng at apple.com>:
> Hi Roman,
>
>  Thanks for working on this. I think the idea is workable.
>  But I worry about your fixation on std::set. It can't be healthy?! :-)

Believe me, I really tried to resist ;-) But the std::set-based solution is
just faster and also looks cleaner. In the sorted-vector solution, I had to
add code to handle the new live intervals created by spilling: I had to
resize the corresponding vectors, which are sorted by interval end, and
insert the new elements there. This made the code slower and less
understandable.

But one thing about std::set that could eventually be interesting in many
places is the following:
 - In many situations we know the maximal size of the set in advance.
   For example, in this patch, the set can contain at most all live
   intervals. In the scheduler, the availableQueue can contain at most all
   SUnits. This means that if we were able to allocate the memory for the
   maximum possible number of elements in advance, there would be no need
   for any additional memory allocation.

 - I think a custom STL allocator could be written that does exactly
   this. It would reserve memory for the maximum number of elements (all of
   equal size?) and maintain a free list of cells. Then we would have very
   efficient allocation, and sets that do not produce too much malloc/free
   pressure. The same idea could also be used for some other STL containers.
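
To make the allocator idea concrete, here is a minimal sketch of what it
might look like. CellPool, PoolAllocator and poolFallbacks are hypothetical
names used only for illustration; nothing here is part of the patch or of
LLVM. The sketch assumes that the container requests one node at a time, that
the first request determines the cell size, and that nodes need no more than
the default alignment. Anything that does not fit falls back to the heap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <new>
#include <set>
#include <utility>

// Hypothetical pool: one up-front buffer of equal-sized cells, with a free
// list threaded through the unused cells.
class CellPool {
public:
  explicit CellPool(std::size_t maxCells) : MaxCells(maxCells) {}
  CellPool(const CellPool &) = delete;
  CellPool &operator=(const CellPool &) = delete;
  ~CellPool() { ::operator delete(Buffer); }

  void *allocate(std::size_t bytes) {
    if (!Buffer)
      init(bytes); // assumption: the first request fixes the cell size
    if (bytes > CellSize || !FreeList) {
      ++HeapFallbacks; // request too large or pool exhausted: use the heap
      return ::operator new(bytes);
    }
    void *cell = FreeList;
    FreeList = *static_cast<void **>(cell); // pop a cell off the free list
    return cell;
  }

  void deallocate(void *p) {
    if (!owns(p)) { // came from the heap fallback
      ::operator delete(p);
      return;
    }
    *static_cast<void **>(p) = FreeList; // push the cell back
    FreeList = p;
  }

  std::size_t heapFallbacks() const { return HeapFallbacks; }

private:
  void init(std::size_t bytes) {
    const std::size_t A = alignof(std::max_align_t);
    CellSize = (std::max(bytes, sizeof(void *)) + A - 1) / A * A;
    Buffer = static_cast<unsigned char *>(::operator new(CellSize * MaxCells));
    for (std::size_t i = 0; i != MaxCells; ++i) {
      void *cell = Buffer + i * CellSize;
      *static_cast<void **>(cell) = FreeList;
      FreeList = cell;
    }
  }

  bool owns(void *p) const {
    auto *c = static_cast<unsigned char *>(p);
    std::less<unsigned char *> lt;
    return Buffer && !lt(c, Buffer) && lt(c, Buffer + CellSize * MaxCells);
  }

  std::size_t MaxCells;
  std::size_t CellSize = 0;
  unsigned char *Buffer = nullptr;
  void *FreeList = nullptr;
  std::size_t HeapFallbacks = 0;
};

// Minimal allocator that forwards to a shared CellPool.
template <typename T> struct PoolAllocator {
  using value_type = T;
  std::shared_ptr<CellPool> Pool;

  explicit PoolAllocator(std::shared_ptr<CellPool> P) : Pool(std::move(P)) {}
  template <typename U>
  PoolAllocator(const PoolAllocator<U> &O) : Pool(O.Pool) {}

  T *allocate(std::size_t n) {
    return static_cast<T *>(Pool->allocate(n * sizeof(T)));
  }
  void deallocate(T *p, std::size_t) { Pool->deallocate(p); }
};

template <typename T, typename U>
bool operator==(const PoolAllocator<T> &A, const PoolAllocator<U> &B) {
  return A.Pool == B.Pool;
}
template <typename T, typename U>
bool operator!=(const PoolAllocator<T> &A, const PoolAllocator<U> &B) {
  return !(A == B);
}

// Inserts `count` keys, erases them all, inserts them again, and reports
// how many allocations missed the pool.
std::size_t poolFallbacks(std::size_t maxCells, int count) {
  auto pool = std::make_shared<CellPool>(maxCells);
  {
    using IntSet = std::set<int, std::less<int>, PoolAllocator<int>>;
    std::less<int> cmp;
    PoolAllocator<int> alloc(pool);
    IntSet s(cmp, alloc);
    for (int i = 0; i != count; ++i)
      s.insert(i);
    s.clear(); // the cells go back onto the free list
    for (int i = 0; i != count; ++i)
      s.insert(i);
    assert(s.size() == static_cast<std::size_t>(count));
  }
  return pool->heapFallbacks();
}
```

With a pool sized for the known maximum, the erase and re-insert cycle reuses
the same cells and should not touch the heap after the initial reservation.
Some standard library implementations allocate an extra sentinel node, so in
practice the pool may need one cell more than the element count.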

  What do you think?


>
>  +    /// IntrvalEndsComparator - Weak Ordering operator for ordering
>  +    /// intervals in increasing order according to their endNuber value.
>  +    struct IntrvalEndsComparator {
>  +        bool operator() (const LiveInterval *L, const LiveInterval *R) const {
>  +            if (L->endNumber() < R->endNumber())
>  +              return true;
>  +            if (L->endNumber() == R->endNumber())
>  +              return L->beginNumber() < R->beginNumber();
>  +
>  +            return false;
>  +        }
>  +    };
>
>  Please fix the inconsistent indentation.

Done.

>  +  iBeginNumber.addRange(LiveRange(cur->beginNumber()-1, cur->beginNumber(), NULL));
>
>  Please avoid 80 col. violation.

Done.

>  Have you run through at least MultiSource?

Yes. No regressions, as far as I can see.

> Can you tell what impact it has on compile time of medium sized apps like kimwitu++?

There is virtually no impact on such apps. I tried with kimwitu++ and burg.
My measurements show a deviation of about 0.5%-1% in both directions.
I think this small impact on typical apps is quite understandable. We
only add elements to the new set when we add elements to the 'handled'
vector. Therefore, we add each existing live interval at most once,
unless we do some backtracking. And backtracking does not happen that
often. When it does happen, my approach seems to win, especially when the
number of 'handled' live intervals is big.
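
To illustrate why backtracking gets cheaper, here is a small standalone
sketch of the lookup on a set ordered by interval end. Interval,
EndsComparator, HandledSet and toReactivate are hypothetical stand-ins, not
the code from the patch, and the sketch assumes that expiredAt(index) means
index >= end. Under that assumption the original loop selects intervals with
earliestStart < end <= cur->beginNumber(), and an ordered set can jump to the
start of that range with lower_bound instead of scanning everything.

```cpp
#include <cassert>
#include <limits>
#include <set>
#include <vector>

// Hypothetical stand-in for LiveInterval: only the begin/end numbers.
struct Interval {
  unsigned Begin, End;
  // Assumed semantics: the interval is over once index reaches End.
  bool expiredAt(unsigned index) const { return index >= End; }
};

// Orders intervals by increasing end, then by increasing begin.
struct EndsComparator {
  bool operator()(const Interval *L, const Interval *R) const {
    if (L->End != R->End)
      return L->End < R->End;
    return L->Begin < R->Begin;
  }
};

using HandledSet = std::set<const Interval *, EndsComparator>;

// Returns the handled intervals with earliestStart < End <= curBegin,
// visiting only the elements inside that range.
std::vector<const Interval *> toReactivate(const HandledSet &handled,
                                           unsigned earliestStart,
                                           unsigned curBegin) {
  std::vector<const Interval *> result;
  if (earliestStart == std::numeric_limits<unsigned>::max())
    return result; // no End can be larger than the maximum value
  // Smallest possible key whose End is greater than earliestStart.
  Interval probe{0, earliestStart + 1};
  for (auto it = handled.lower_bound(&probe);
       it != handled.end() && (*it)->End <= curBegin; ++it)
    result.push_back(*it);
  return result;
}

// The original linear formulation, kept only for comparison.
std::vector<const Interval *> linearScan(const HandledSet &handled,
                                         unsigned earliestStart,
                                         unsigned curBegin) {
  std::vector<const Interval *> result;
  for (const Interval *HI : handled)
    if (!HI->expiredAt(earliestStart) && HI->expiredAt(curBegin))
      result.push_back(HI);
  return result;
}
```

The lookup costs O(log n) plus the number of intervals actually undone,
instead of a pass over all handled intervals. One caveat of this kind of
comparator: two distinct intervals with identical begin and end numbers
compare as equivalent, so a std::set would keep only one of them unless the
comparator breaks the tie further.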

An updated version of the patch is attached.

-Roman

>  On May 5, 2008, at 7:52 AM, Roman Levenstein wrote:
>
>  > Hi,
>  >
>  > I have found out that RegAllocLinearScan is very inefficient on very
>  > big test-cases, where there are thousands of live intervals.
>  > Even in the release build (which is optimized), the function
>  > assignRegOrStackSlotAtInterval may consume up to 60-90% of the total
>  > compilation time according to the profiler.
>  >
>  > The reason for this bad performance is the following snippet of code
>  > in that function:
>  >
>  >  for (unsigned i = 0, e = handled_.size(); i != e; ++i) {
>  >    LiveInterval *HI = handled_[i];
>  >    if (!HI->expiredAt(earliestStart) &&
>  >        HI->expiredAt(cur->beginNumber())) {
>  >      DOUT << "\t\t\tundo changes for: " << *HI << '\n';
>  >      active_.push_back(std::make_pair(HI, HI->begin()));
>  >      assert(!TargetRegisterInfo::isPhysicalRegister(HI->reg));
>  >      prt_->addRegUse(vrm_->getPhys(HI->reg));
>  >    }
>  >  }
>  >
>  > It looks quite innocent, but imagine that handled_ vector contains
>  > thousands of elements and that we do backtracking rather often, which
>  > is the case if you have a lot of overlapping live intervals.
>  >
>  > Therefore, this loop should be implemented more efficiently.
>  > Basically, we need to find intervals whose endNumber() is between
>  > earliestStart and cur->beginNumber().
>  > The easiest way to do that is to have a dedicated data structure where
>  > all intervals are sorted in increasing order of their endNumber().
>  >
>  > This is exactly what I do with this patch. Initially, I tried with the
>  > std::set approach and it worked. But since I know how much you both
>  > like the std::set :-),
>  > I decided to implement something similar that does not produce so many
>  > dynamic memory allocations. So, I used vectors and used std::sort to
>  > keep them sorted. But that non std::set based approach was
>  > significantly slower than std::set-based one. Therefore, I decided to
>  > go with the std::set-based approach.
>  >
>  > With this patch, the bottleneck is completely removed and my profiler
>  > does not even show this function as time-consuming (i.e. it is more
>  > than an order of magnitude speed-up). On small use-cases the impact is
>  > virtually invisible, since the number of elements in the handled_ set
>  > is rather small.
>  >
>  > All dejagnu tests pass without problems.
>  >
>  > Please review it and tell if it is OK for committing.
>  >
>  > -Roman
>  > <RegAllocLinearScan.cpp.patch>
>  > _______________________________________________
>  > llvm-commits mailing list
>  > llvm-commits at cs.uiuc.edu
>  > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RegAllocLinearScan.cpp.patch1
Type: application/octet-stream
Size: 4222 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20080506/5e9d7171/attachment.obj>

