[PATCH] D136918: [AMDGPU] Scheduler: fix RP calculation for a MBB with one successor

Valery Pykhtin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Oct 29 02:51:18 PDT 2022


vpykhtin added a comment.

There are some difficulties with LIS, however. A reduced example:

  0B bb.0.entry:    successors: %bb.4, %bb.1;
  ...
  320B	  S_CBRANCH_VCCNZ %bb.4, implicit killed $vcc
  
  336B bb.1:	; predecessors: %bb.0, successors: %bb.2;
  352B	  ...
  368B	  undef %47.sub2:vreg_96 = IMPLICIT_DEF
  
  432B bb.2.Flow:	; predecessors: %bb.4, %bb.1,  successors: %bb.3, %bb.5;
  512B	  ...
  ...
  624B	  S_BRANCH %bb.3
  
  640B bb.3.if:    	; predecessors: %bb.2, successors: %bb.5;
  656B	  ...
  672B	  ...
  720B	  %47:vreg_96 = IMAGE_SAMPLE_V3_V2 %48:vreg_64, %0:sgpr_256, %1:sgpr_128, 7, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s128) from custom "ImageResource")
  832B	  S_BRANCH %bb.5
  
  848B bb.4.else:       ; predecessors: %bb.0, successors: %bb.2;
  864B	  ...
  ...
  928B	  %47:vreg_96 = IMAGE_SAMPLE_V3_V2 %40:vreg_64, %0:sgpr_256, %1:sgpr_128, 7, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s128) from custom "ImageResource")
  ...
  1072B	  S_BRANCH %bb.2
  
  1088B bb.5.endif:	; predecessors: %bb.2, %bb.3
  1136B	  ...
  1168B	  EXP_DONE 0, %47.sub0:vreg_96, %47.sub1:vreg_96, %47.sub2:vreg_96, %49:vgpr_32, -1, 0, 15, implicit $exec
  1184B	  S_ENDPGM 0
  
  
  Live intervals for %47
  
  %47 [368r,432B:3)[432B,640B:1)[720r,848B:0)[928r,1088B:4)[1088B,1168r:2) 0 at 720r 1 at 432B-phi 2 at 1088B-phi 3 at 368r 4 at 928r  
  
  L0000000000000030 [368r,432B:3)[432B,640B:1)[720r,848B:0)[928r,1088B:4)[1088B,1168r:2) 0 at 720r 1 at 432B-phi 2 at 1088B-phi 3 at 368r 4 at 928r  
  L000000000000000C [432B,640B:1)[720r,848B:0)[928r,1088B:4)[1088B,1168r:2) 0 at 720r 1 at 432B-phi 2 at 1088B-phi 3 at x 4 at 928r  
  L0000000000000003 [432B,640B:1)[720r,848B:0)[928r,1088B:4)[1088B,1168r:2) 0 at 720r 1 at 432B-phi 2 at 1088B-phi 3 at x 4 at 928r

bb.2.Flow has two predecessors, bb.4 and bb.1:

- at the end of bb.1, %47 has live lane mask 0x30 (subrange L0000000000000030 [368r,432B:3)) because of undef %47.sub2:vreg_96 = IMPLICIT_DEF

- at the beginning of bb.2.Flow, %47 has lane mask 0x3F because of the full def in bb.4 at 928B: %47:vreg_96 = IMAGE_SAMPLE_V3_V2  ...

In this example we track bb.1 first and then continue into bb.2.Flow with the live-outs left over from bb.1. However, the live-out mask for %47 is 0x30, whereas the live-in mask should be 0x3F.
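The mismatch can be reproduced outside LLVM with a small model of the subranges in the dump above. This is only a sketch under stated assumptions: Segment, SubRange, liveMaskAt and subRangesOfReg47 are hypothetical names, slot indexes are plain integers (the r/B slot kinds are ignored), and segments are half-open [Start, End). It is not the LiveIntervals API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model; not the LLVM LiveInterval classes.
struct Segment {
  unsigned Start, End; // half-open [Start, End)
};
struct SubRange {
  uint64_t LaneMask;
  std::vector<Segment> Segments;
};

// Union of the lane masks of all subranges that are live at slot Idx.
uint64_t liveMaskAt(const std::vector<SubRange> &SRs, unsigned Idx) {
  uint64_t Mask = 0;
  for (const SubRange &SR : SRs) {
    for (const Segment &S : SR.Segments) {
      assert(S.Start < S.End && "empty segment");
      if (S.Start <= Idx && Idx < S.End) {
        Mask |= SR.LaneMask;
        break;
      }
    }
  }
  return Mask;
}

// Subranges of %47, copied from the dump above (value numbers omitted).
std::vector<SubRange> subRangesOfReg47() {
  std::vector<Segment> Common = {
      {432, 640}, {720, 848}, {928, 1088}, {1088, 1168}};
  std::vector<Segment> Sub2 = Common;
  // Only lane mask 0x30 has the extra segment [368r,432B).
  Sub2.insert(Sub2.begin(), Segment{368, 432});
  return {{0x30, Sub2}, {0x0C, Common}, {0x03, Common}};
}
```

In this model, liveMaskAt(subRangesOfReg47(), 431) (the last integer slot before 432B) gives 0x30, while liveMaskAt(subRangesOfReg47(), 432) (the start of bb.2.Flow) gives 0x3F.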

I'm not sure whether an undef definition of a subreg should be treated as a def of the whole register, because LIS doesn't think so. However, there is a different point of view:

  llvm/lib/CodeGen/RegisterPressure.cpp: line 541
  
  void collectOperandLanes(const MachineOperand &MO) const {
  ...
        // Treat read-undef subreg defs as definitions of the whole register.
        if (MO.isUndef())
          SubRegIdx = 0;
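As a rough illustration of the two viewpoints, the sketch below computes the lanes written by a subregister def in both ways. defLanes and its parameters are hypothetical names made up for this example. The TreatUndefAsFullDef branch only mimics the quoted comment and is not the RegisterPressure.cpp code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: lanes considered written by a subregister def.
//   SubRegMask          - lanes of the subregister named in the operand
//   FullMask            - all lanes of the virtual register
//   IsUndef             - the operand carries the read-undef flag
//   TreatUndefAsFullDef - follow the quoted comment (whole-register def)
uint64_t defLanes(uint64_t SubRegMask, uint64_t FullMask, bool IsUndef,
                  bool TreatUndefAsFullDef) {
  assert((SubRegMask & ~FullMask) == 0 &&
         "subreg lanes must belong to the register");
  if (IsUndef && TreatUndefAsFullDef)
    return FullMask;
  return SubRegMask;
}
```

For undef %47.sub2:vreg_96 = IMPLICIT_DEF, defLanes(0x30, 0x3F, true, false) gives 0x30, which agrees with the LIS subranges, while defLanes(0x30, 0x3F, true, true) gives 0x3F, which would match the 0x3F live-in mask of bb.2.Flow.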




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136918/new/

https://reviews.llvm.org/D136918


