[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 26 09:30:48 PDT 2018


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SILowerI1Copies.cpp:146
+              &AMDGPU::SGPR_64RegClass) &&
+            AMDGPU::laneDominates(DefInst->getParent(), &MBB)) {
           BuildMI(MBB, &MI, DL, TII->get(AMDGPU::S_AND_B64))
----------------
I think the information from the commit message needs to be added here as a comment.


================
Comment at: lib/Target/AMDGPU/Utils/AMDGPULaneDominator.cpp:52
+// The check is conservative, i.e. there can be false-negatives.
+bool laneDominates(MachineBasicBlock *A, MachineBasicBlock *B) {
+  // Check whether A is reachable from itself without going through B.
----------------
We keep repeating essentially the same control-flow depth-first search in various places, but I don't have a better idea until we really fix the control flow situation.
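
For readers following the thread, the kind of search described by the quoted comment ("check whether A is reachable from itself without going through B") can be sketched roughly as below. This is a standalone illustration, not the code from the patch: the `Graph` type, the block indices, and the function name `reachesSelfAvoiding` are all hypothetical, and the real function works on `MachineBasicBlock` successors rather than on an adjacency list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CFG: Succs[i] lists the successors of block i.
using Graph = std::vector<std::vector<std::size_t>>;

// Returns true if block A can reach itself again along a path that never
// enters block B. Iterative depth-first search with an explicit stack.
bool reachesSelfAvoiding(const Graph &Succs, std::size_t A, std::size_t B) {
  assert(A < Succs.size() && B < Succs.size());
  std::vector<bool> Visited(Succs.size(), false);
  // Start from the successors of A so that A itself counts as "reached"
  // only when a cycle leads back to it.
  std::vector<std::size_t> Stack(Succs[A].begin(), Succs[A].end());
  while (!Stack.empty()) {
    std::size_t N = Stack.back();
    Stack.pop_back();
    if (N == B)
      continue; // Paths through B are not allowed.
    if (N == A)
      return true; // Found a cycle back to A that avoids B.
    if (Visited[N])
      continue;
    Visited[N] = true;
    for (std::size_t S : Succs[N])
      Stack.push_back(S);
  }
  return false;
}
```

How the result of such a search maps onto the return value of `laneDominates` is not shown in the quoted hunk, so the sketch stops at the reachability question itself.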


================
Comment at: test/CodeGen/AMDGPU/i1-copy-from-loop.ll:1-2
+; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=SI %s
+; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=SI %s
+
----------------
s/SI/GCN (use GCN as the check prefix)


https://reviews.llvm.org/D40547




