[LLVMugs][Bug 20122] Patch ready

Tue Jul 8 08:51:12 PDT 2014

Andy, Nadav, Tom, could you please look at this. With this patch (which I think is functionally correct), the register pressure sets for X86 and AMDGPU change. I want to make sure these changes look reasonable...

For X86, the change is that the GR8+GR64_TC pressure set becomes the GR64_NOREX+GR8+GR64_TC pressure set.

For AMDGPU, the following pressure sets are added:
R600_Reg64+R600_TReg32_Y+R600_Reg64Vertical
R600_Reg64+R600_TReg32_X+R600_Reg64Vertical
R600_TReg32_Y+R600_TReg32_X+R600_Reg64
R600_Reg64+R600_TReg32_W+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Z+R600_Reg64Vertical
R600_TReg32_Y+R600_TReg32_W+R600_Reg64Vertical
R600_TReg32_Z+R600_TReg32_W+R600_Reg64Vertical
R600_TReg32_Z+R600_TReg32_Y+R600_Reg64Vertical
R600_TReg32_W+R600_TReg32_X+R600_Reg64Vertical
R600_TReg32_Y+R600_TReg32_X+R600_Reg64Vertical
R600_TReg32_Z+R600_TReg32_X+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Y+R600_TReg32_W+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Z+R600_TReg32_Y+R600_Reg64Vertical
R600_Reg64+R600_TReg32_W+R600_TReg32_X+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Z+R600_TReg32_X+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Z+R600_TReg32_W+R600_Reg64Vertical
R600_TReg32_Z+R600_TReg32_Y+R600_TReg32_W+R600_Reg64Vertical
R600_TReg32_Y+R600_TReg32_W+R600_TReg32_X+R600_Reg64Vertical
R600_TReg32_Z+R600_TReg32_W+R600_TReg32_X+R600_Reg64Vertical
R600_TReg32_Z+R600_TReg32_Y+R600_TReg32_X+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Z+R600_TReg32_Y+R600_TReg32_W+R600_Reg64Vertical
R600_Reg64+R600_TReg32_Z+R600_TReg32_W+R600_TReg32_X+R600_Reg64Vertical

If these don't seem bad, I'm inclined to okay this patch ;)

Thanks again,
Hal

----- Original Message -----
> From: "wuhui1973" <wuhui1973 at 163.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Friday, July 4, 2014 3:51:28 AM
> Subject: Re:Re: [LLVMugs][Bug 20122] Patch ready
> 
> 
> 
> Thanks Hal:
> 
> 
> I have updated the patch. Loadbuild & test had been run and passed.
> 
> 
> Regards Hui Wu
> 
> 
> 
> 
> At 2014-07-03 10:57:28, "Hal Finkel" <hfinkel at anl.gov> wrote:
> >----- Original Message -----
> >> From: "wuhui1973" <wuhui1973 at 163.com>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: llvm-commits at cs.uiuc.edu
> >> Sent: Thursday, July 3, 2014 4:42:05 AM
> >> Subject: Re: [LLVMugs][Bug 20122] Patch ready
> >> 
> >> 
> >> 
> >> Hello Hal:
> >> 
> >> 
> >> 
> >> I have caught the debug log, please see attachment. Below is the
> >> finding:
> >> 
> >> 
> >> r600 has a large register file, so the generated Unitset is large
> >> too.
> >> > Before union, there are 26 sets
> >> > After union, there are 59 sets
> >> > After prune, the number goes from 59 to 55 (means the sets are
> >> > quite diversity)
> >> 
> >> 
> >> The original alogrithm is effective, I enhence the debug log a
> >> little
> >> to output the sets filtered out, can see lots of them had been
> >> filtered.
> >> 
> >> Because the operation is find the transtive closure union of any
> >> two
> >> overlapping sets, the original operating sets are roughly 26*26
> >> for
> >> this example.
> >> So I think we should set the limit out of this number. For
> >> example,
> >> we may except in these combinations only around 10% will be kept,
> >> so
> >> using 26*26/10
> >> instead of 2*26. Please refer the patch for the solution.
> >
> >I'll look at it, thanks.
> >
> >One thing you'll need to do is add some kind of (void)limit;
> >statement so that limit is not unused in a release build (which
> >will produce a warning).
> >
> > -Hal
> >
> >> 
> >> 
> >> Regards & Thanks
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> >> At 2014-07-01 07:21:45, "Hal Finkel" < hfinkel at anl.gov > wrote:
> >> >----- Original Message -----
> >> >> From: "wuhui1973" < wuhui1973 at 163.com >
> >> >> To: "Hal Finkel" < hfinkel at anl.gov >
> >> >> Cc: llvm-commits at cs.uiuc.edu >> Sent: Tuesday, July 1, 2014
> >> >> 4:57:06 AM
> >> >> Subject: Re:Re: [LLVMugs][Bug 20122] Patch ready
> >> >> 
> >> >> 
> >> >> 
> >> >> 
> >> >> Hello Hal:
> >> >> 
> >> >> 
> >> >> I found this comment about the assertion:
> >> >> 
> >> >> 
> >> >> 
> >> >> // In theory, this is combinatorial. In practice, it needs to
> >> >> be
> >> >> bounded
> >> >> // by a small number of sets for regpressure to be efficient.
> >> >> // If the assert is hit, we need to implement pruning.
> >> >> assert(Idx < (2*NumRegUnitSubSets) && "runaway unit set
> >> >> inference");
> >> >> 
> >> >> Seems the author had forseen this problem, and he (she)
> >> >> suggested
> >> >> to
> >> >> prune the unitset.
> >> >> I try to prune the unitset by pruneUnitSets if it exceeds the
> >> >> limit,
> >> >> but not succeed - the set is not reduced.
> >> >> 
> >> >> 
> >> >> I am now trying to open the debug output available in the code
> >> >> and
> >> >> see what's happening.
> >> >> Just a question: How to enable DEBUG macro in this source,
> >> >> which
> >> >> is
> >> >> run by tablegen?
> >> >
> >> >If you run with the -debug command-line argument you should get
> >> >the
> >> >debugging output.
> >> >
> >> >Thanks again,
> >> >Hal
> >> >
> >> >> 
> >> >> 
> >> >> However, this error can be avoid, just using assert(Idx <
> >> >> (3*NumRegUnitSubSets) && "runaway unit set inference");
> >> >> 
> >> >> 
> >> >> Regards & Thanks At 2014-07-01 05:42:20, "Hal Finkel" <
> >> >> hfinkel at anl.gov > wrote:
> >> >> >Hi,
> >> >> >
> >> >> >With this patch applied, building LLVM's AMDGPU target fails:
> >> >> >
> >> >> >/src/llvm-trunk/utils/TableGen/CodeGenRegisters.cpp:1588: void
> >> >> >llvm::CodeGenRegBank::computeRegUnitSets(): Assertion `Idx <
> >> >> >(2*NumRegUnitSubSets) && "runaway unit set inference"' failed.
> >> >> >
> >> >> >Maybe the threshold in the assert just needs to be loosened,
> >> >> >but
> >> >> >we'd need to understand what is going on here.
> >> >> >
> >> >> > -Hal
> >> >> >
> >> >> >----- Original Message -----
> >> >> >> From: "wuhui1973" < wuhui1973 at 163.com >
> >> >> >> To: llvm-commits at cs.uiuc.edu >> Sent: Sunday, June 29, 2014
> >> >> >> 11:08:40 PM
> >> >> >> Subject: [LLVMugs][Bug 20122] Patch ready
> >> >> >> 
> >> >> >> 
> >> >> >> 
> >> >> >> 
> >> >> >> 
> >> >> >> [ reply ] [-] Description hui wu 2014-06-24 21:35:52 CDT In
> >> >> >> function
> >> >> >> CodeGenRegBank::computeRegUnitSets, below code fragment is
> >> >> >> used
> >> >> >> to
> >> >> >> calculate the transitive closure of set union of
> >> >> >> intersecting
> >> >> >> sets.
> >> >> >> 
> >> >> >>   // Iterate over all unit sets, including new ones added by
> >> >> >>   this
> >> >> >>   loop.
> >> >> >>   unsigned NumRegUnitSubSets = RegUnitSets.size();
> >> >> >>   for (unsigned Idx = 0, EndIdx = RegUnitSets.size(); Idx !=
> >> >> >>   EndIdx;
> >> >> >>   ++Idx) {
> >> >> >>     // In theory, this is combinatorial. In practice, it
> >> >> >>     needs
> >> >> >>     to
> >> >> >>     be
> >> >> >>     bounded
> >> >> >>     // by a small number of sets for regpressure to be
> >> >> >>     efficient.
> >> >> >>     // If the assert is hit, we need to implement pruning.
> >> >> >>     assert(Idx < (2*NumRegUnitSubSets) && "runaway unit set
> >> >> >>     inference");
> >> >> >> 
> >> >> >>     // Compare new sets with all original classes.
> >> >> >>     for (unsigned SearchIdx = (Idx >= NumRegUnitSubSets) ? 0
> >> >> >>     :
> >> >> >>     Idx+1;
> >> >> >>          SearchIdx != EndIdx; ++SearchIdx) {
> >> >> >>       std::set<unsigned> Intersection;
> >> >> >>       std::set_intersection(RegUnitSets[Idx].Units.begin(),
> >> >> >>                             RegUnitSets[Idx].Units.end(),
> >> >> >>                             RegUnitSets[SearchIdx].Units.begin(),
> >> >> >>                             RegUnitSets[SearchIdx].Units.end(),
> >> >> >>                             std::inserter(Intersection,
> >> >> >>                             Intersection.begin()));
> >> >> >>       if (Intersection.empty())
> >> >> >>         continue;
> >> >> >> 
> >> >> >>       // Speculatively grow the RegUnitSets to hold the new
> >> >> >>       set.
> >> >> >>       RegUnitSets.resize(RegUnitSets.size() + 1);
> >> >> >>       RegUnitSets.back().Name =
> >> >> >>         RegUnitSets[Idx].Name + "+" +
> >> >> >>         RegUnitSets[SearchIdx].Name;
> >> >> >> 
> >> >> >>       std::set_union(RegUnitSets[Idx].Units.begin(),
> >> >> >>                      RegUnitSets[Idx].Units.end(),
> >> >> >>                      RegUnitSets[SearchIdx].Units.begin(),
> >> >> >>                      RegUnitSets[SearchIdx].Units.end(),
> >> >> >>                      std::inserter(RegUnitSets.back().Units,
> >> >> >>                                    RegUnitSets.back().Units.begin()));
> >> >> >> 
> >> >> >>       // Find an existing RegUnitSet, or add the union to
> >> >> >>       the
> >> >> >>       unique
> >> >> >>       sets.
> >> >> >>       std::vector<RegUnitSet>::const_iterator SetI =
> >> >> >>         findRegUnitSet(RegUnitSets, RegUnitSets.back());
> >> >> >>       if (SetI != llvm::prior(RegUnitSets.end()))
> >> >> >>         RegUnitSets.pop_back();
> >> >> >>     }
> >> >> >>   }
> >> >> >> 
> >> >> >> However, the use of EndIdx like that is not correct, it
> >> >> >> makes
> >> >> >> the
> >> >> >> for
> >> >> >> loop exiting too early without handling the new added sets.
> >> >> >> The
> >> >> >> correct way should be:
> >> >> >> 
> >> >> >>   // Iterate over all unit sets, including new ones added by
> >> >> >>   this
> >> >> >>   loop.
> >> >> >>   unsigned NumRegUnitSubSets = RegUnitSets.size();
> >> >> >>   for (unsigned Idx = 0; Idx != RegUnitSets.size(); ++Idx) {
> >> >> >>     // In theory, this is combinatorial. In practice, it
> >> >> >>     needs
> >> >> >>     to
> >> >> >>     be
> >> >> >>     bounded
> >> >> >>     // by a small number of sets for regpressure to be
> >> >> >>     efficient.
> >> >> >>     // If the assert is hit, we need to implement pruning.
> >> >> >>     assert(Idx < (2*NumRegUnitSubSets) && "runaway unit set
> >> >> >>     inference");
> >> >> >> 
> >> >> >>     // Compare new sets with all original classes.
> >> >> >>     for (unsigned SearchIdx = (Idx >= NumRegUnitSubSets) ? 0
> >> >> >>     :
> >> >> >>     Idx+1;
> >> >> >>          SearchIdx != RegUnitSets.size(); ++SearchIdx) {
> >> >> >>       std::set<unsigned> Intersection;
> >> >> >>       std::set_intersection(RegUnitSets[Idx].Units.begin(),
> >> >> >>                             RegUnitSets[Idx].Units.end(),
> >> >> >>                             RegUnitSets[SearchIdx].Units.begin(),
> >> >> >>                             RegUnitSets[SearchIdx].Units.end(),
> >> >> >>                             std::inserter(Intersection,
> >> >> >>                             Intersection.begin()));
> >> >> >>       if (Intersection.empty())
> >> >> >>         continue;
> >> >> >> 
> >> >> >>       // Speculatively grow the RegUnitSets to hold the new
> >> >> >>       set.
> >> >> >>       RegUnitSets.resize(RegUnitSets.size() + 1);
> >> >> >>       RegUnitSets.back().Name =
> >> >> >>         RegUnitSets[Idx].Name + "+" +
> >> >> >>         RegUnitSets[SearchIdx].Name;
> >> >> >> 
> >> >> >>       std::set_union(RegUnitSets[Idx].Units.begin(),
> >> >> >>                      RegUnitSets[Idx].Units.end(),
> >> >> >>                      RegUnitSets[SearchIdx].Units.begin(),
> >> >> >>                      RegUnitSets[SearchIdx].Units.end(),
> >> >> >>                      std::inserter(RegUnitSets.back().Units,
> >> >> >>                                    RegUnitSets.back().Units.begin()));
> >> >> >> 
> >> >> >>       // Find an existing RegUnitSet, or add the union to
> >> >> >>       the
> >> >> >>       unique
> >> >> >>       sets.
> >> >> >>       std::vector<RegUnitSet>::const_iterator SetI =
> >> >> >>         findRegUnitSet(RegUnitSets, RegUnitSets.back());
> >> >> >>       if (SetI != llvm::prior(RegUnitSets.end()))
> >> >> >>         RegUnitSets.pop_ back();
> >> >> >>     }
> >> >> >>   }
> >> >> >> 
> >> >> >> [ reply ] [-] Comment 1 Hal Finkel 2014-06-25 05:02:27 CDT
> >> >> >> Thanks
> >> >> >> for
> >> >> >> filing this bug; do you know if this is still a problem in
> >> >> >> llvm
> >> >> >> trunk? If so, the best way to provide this fix is to send a
> >> >> >> patch
> >> >> >> to
> >> >> >> the llvm-commits mailing list. That is how we normally
> >> >> >> review
> >> >> >> proposed patches.
> >> >> >> http://llvm.org/docs/DeveloperPolicy.html#making-and-submitting-a-patch
> >> >> >> Also, if you know how to generate a test that triggers the
> >> >> >> incorrect
> >> >> >> behavior, that would be great!
> >> >> >> 
> >> >> >> 
> >> >> >> _______________________________________________
> >> >> >> llvm-commits mailing list
> >> >> >> llvm-commits at cs.uiuc.edu >>
> >> >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> >> >> 
> >> >> >
> >> >> >--
> >> >> >Hal Finkel
> >> >> >Assistant Computational Scientist
> >> >> >Leadership Computing Facility
> >> >> >Argonne National Laboratory
> >> >> 
> >> >> 
> >> >
> >> >--
> >> >Hal Finkel
> >> >Assistant Computational Scientist
> >> >Leadership Computing Facility
> >> >Argonne National Laboratory
> >> 
> >> 
> >> 
> >> 
> >
> >--
> >Hal Finkel
> >Assistant Computational Scientist
> >Leadership Computing Facility
> >Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory