[PATCH] D49288: [AMDGPU] run post-RA hazard recognizer pass after memory legalizer, waitcnt passes

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 13 05:33:54 PDT 2018


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:902
 
-  addPass(createSIMemoryLegalizerPass());
-  addPass(createSIInsertWaitcntsPass());
   addPass(createSIShrinkInstructionsPass());
   addPass(&SIInsertSkipsPassID);
----------------
msearles wrote:
> arsenm wrote:
> > I'm surprised we run shrink after this point.
> > 
> > I think this can plausibly be an issue.
> > Suppose you had something like
> > %vgpr0 = V_FOO
> > <some hazard>
> > %vgpr1 = V_MOV_B32 5
> > %vgpr2 = V_BAR %vgpr1
> > 
> > The immediate materialize would eliminate a wait state that wasn't necessary, but now is since the copy is eliminated.  I think in practice the shrink pass isn't very good at folding immediates post-RA, if it even tries
> I will move the hazard recognizer after shrink instructions, unless you're suggesting that the shrink pass be moved somewhere else? I think that moving the hazard recognizer after the shrink pass has the side-effect that some S_NOP 0's won't be collapsed into, say S_NOP <N>. The hazard recognizer just creates individual S_NOP 0 instrs and is relying on the shrink pass to turn them into S_NOP <N>.  I suppose the hazard recognizer could be enhanced to generate the S_NOP <N> instrs on its own.
Yes, the hazard recognizer ideally would do that. The shrink pass deals with this as a workaround for it seeming to assume all nops are the same.

It's probably OK to leave as-is for now


https://reviews.llvm.org/D49288





More information about the llvm-commits mailing list