[PATCH] D31124: AMDGPU/SI: Add lane tracking to SI Scheduler

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 22 11:03:28 PDT 2017


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIMachineScheduler.cpp:1653
+  for (const auto RegPair : Regs) {
+    for (const auto RegPairRes : getPairsForReg(RegPair.RegUnit,
+                                                RegPair.LaneMask)) {
----------------
axeldavy wrote:
> rampitec wrote:
> > Can you please avoid copy here and in function's arguments?
> Right, I can get
> SIScheduleBlockScheduler::getPairsForRegs(const SmallVector<RegisterMaskPair, 8> &Regs)
> {
>   SmallVector<RegisterMaskPair, 8> Result;
>   for (const auto &RegPair : Regs) {
> 
> But for the line with:
> for (const auto RegPairRes : getPairsForReg(RegPair.RegUnit,
> 
> Do you mean just use &RegPairRes ?
> Daniel Berlin suggested to use std::transform in combination with a new vector_inserter to replace the two loops.
> That wouldn't prevent the fact there is in all cases a copy from getPairsForReg results into the result tab of getPairsForRegs.
> 
> I could probably solve the issue by adding a new getPairsForReg function that takes the Result array to append to, and using that helper with the two functions. Do you think that would be a good solution ?
Yes, it sounds more efficient to me.


Repository:
  rL LLVM

https://reviews.llvm.org/D31124





More information about the llvm-commits mailing list