[PATCH] D35967: [AMDGPU] Collapse adjacent SI_END_CF

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 2 08:38:35 PDT 2017


nhaehnle added inline comments.


================
Comment at: llvm/trunk/test/CodeGen/AMDGPU/collapse-endcf.ll:150
+; GCN-NEXT: {{^}}[[ENDIF_OUTER]]:
+; GCN-NEXT: s_or_b64 exec, exec, [[SAVEEXEC_OUTER3]]
+; GCN-NEXT: s_endpgm
----------------
rampitec wrote:
> rampitec wrote:
> > arsenm wrote:
> > > arsenm wrote:
> > > > We should also be stripping out exec modifications with no VALU instructions before s_endpgm
> > > Any scalar instruction really
> > Not a scalar store. Also not sure about barriers and waits.
> And then not any instructions contributing to that scalar store.
Please keep SI_RETURN_TO_EPILOG in mind. For non-monolithic graphics shaders, returning with the correct EXEC mask is important (and we can't just dead-code-eliminate SALU instructions either). Basically, as long as the stripping is applied only before S_ENDPGM, it should be okay.
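
To make the constraint concrete, here is a toy sketch of the rule, not the actual pass. It deliberately uses a made-up instruction model instead of LLVM's MachineInstr API: `Kind`, `Inst`, and `stripDeadExecRestores` are hypothetical names introduced only for illustration. It takes the most conservative reading of the discussion above: only EXEC restores that sit directly in front of the terminator are dropped, so scalar stores, barriers, waits, and anything feeding them are never skipped over, and nothing is touched unless the terminator is S_ENDPGM.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction. This is NOT LLVM's MachineInstr;
// every name in this sketch is hypothetical.
enum class Kind {
  ExecRestore,    // e.g. "s_or_b64 exec, exec, <saved mask>"
  EndPgm,         // S_ENDPGM
  ReturnToEpilog, // SI_RETURN_TO_EPILOG
  Other           // anything else (VALU, SALU, scalar store, barrier, wait, ...)
};

struct Inst {
  Kind kind;
  std::string text;
};

// Removes EXEC restores that immediately precede a terminating S_ENDPGM and
// returns how many instructions were removed.
//
// Blocks ending in SI_RETURN_TO_EPILOG are left alone: the epilog that runs
// afterwards depends on the EXEC mask being correct.
std::size_t stripDeadExecRestores(std::vector<Inst> &block) {
  if (block.empty() || block.back().kind != Kind::EndPgm)
    return 0;

  std::size_t removed = 0;
  std::size_t term = block.size() - 1; // index of the S_ENDPGM terminator

  // Walk backward from the terminator and stop at the first instruction that
  // is not an EXEC restore. Nothing is ever skipped over.
  while (term > 0 && block[term - 1].kind == Kind::ExecRestore) {
    block.erase(block.begin() + static_cast<std::ptrdiff_t>(term - 1));
    --term;
    ++removed;
  }
  return removed;
}
```

With this model, a block ending in `s_or_b64 exec, exec, ...` followed by `s_endpgm` loses the `s_or_b64`, while the same block ending in SI_RETURN_TO_EPILOG is returned unchanged.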


Repository:
  rL LLVM

https://reviews.llvm.org/D35967




