[PATCH] D133840: AMDGPU: Add a pass to rewrite certain undef in PHI

Wed Sep 14 22:18:33 PDT 2022

sameerds added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURewriteUndefForPHI.cpp:11
+//
+// To achieve optimal code generation for AMDGPU, we request divergence analysis
+// to report the PHI in join block of divergent branch to be uniform if it has
----------------
I would say "we assume that the DA reports ..." rather than "expect".

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURewriteUndefForPHI.cpp:34
+// For pattern A, by reporting %phi as uniform, the later pipeline need to make
+// sure it be handled correctly. The backend usually allocates a scalar register
+// and if any thread in a wave takes %then path, the scalar register will get
----------------
Does "usually" mean that the backend sometimes does not allocate a scalar register? If it allocates a vector register instead, will the generated code be wrong?

================
Comment at: llvm/test/CodeGen/AMDGPU/rewrite-undef-for-phi.ll:63
+end:
+  %c2 = phi float [ undef, %bb2 ], [ %c, %bb3 ], [ undef, %bb4 ], [ %c, %entry ]
+  ret float %c2
----------------
DominateBB is only checked against incoming blocks that carry defined values, and here %entry dominates %bb3. The result depends on the order in which we traverse the incoming values: if DominateBB is initially %bb3 and IncomingBB is later %entry, then DominateBB does not dominate IncomingBB, even though IncomingBB dominates DominateBB.
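
To make the ordering concern concrete, here is a minimal, self-contained sketch of an order-independent selection. It is not the code in the patch: IDomMap, dominates() and findDominatingBlock() are hypothetical names, the toy immediate-dominator map stands in for llvm::DominatorTree, and the CFG (with %entry as the immediate dominator of %bb3) is an assumption made only for illustration.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy dominator tree (hypothetical stand-in for llvm::DominatorTree):
// maps each block name to its immediate dominator; the root maps to itself.
using IDomMap = std::map<std::string, std::string>;

// Returns true if A dominates B. Every block dominates itself.
bool dominates(const IDomMap &IDom, const std::string &A,
               const std::string &B) {
  std::string Cur = B;
  while (true) {
    if (Cur == A)
      return true;
    auto It = IDom.find(Cur);
    // Stop at the root or at a block missing from the map.
    if (It == IDom.end() || It->second == Cur)
      return false;
    Cur = It->second;
  }
}

// Among the incoming blocks that carry defined values, find one that
// dominates all the others, regardless of traversal order.
std::optional<std::string>
findDominatingBlock(const IDomMap &IDom,
                    const std::vector<std::string> &DefinedBlocks) {
  if (DefinedBlocks.empty())
    return std::nullopt;

  // Pass 1: move the candidate up whenever a later block dominates it.
  std::string Candidate = DefinedBlocks.front();
  for (const std::string &BB : DefinedBlocks)
    if (dominates(IDom, BB, Candidate))
      Candidate = BB;

  // Pass 2: verify that the candidate really dominates every block.
  for (const std::string &BB : DefinedBlocks)
    if (!dominates(IDom, Candidate, BB))
      return std::nullopt;

  return Candidate;
}
```

With the assumed map {entry -> entry, bb3 -> entry}, both traversal orders {bb3, entry} and {entry, bb3} select entry, which is the property the one-directional check can miss.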

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133840