[llvm] [CHR] Skip regions containing convergent calls (PR #180882)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 11 00:58:16 PST 2026
================
@@ -0,0 +1,179 @@
+; Test that CHR does not transform regions containing convergent or
+; noduplicate calls, following the same guard as SimplifyCFG.
+;
+; CHR (Control Height Reduction) merges multiple biased branches into a
+; single speculative check, cloning the region into hot/cold paths. On GPU
+; targets, this merged branch may be divergent (per-thread), splitting the
+; wavefront: some threads take the hot path, others the cold path.
+;
+; A convergent call like ds_bpermute (a cross-lane operation on AMDGPU)
+; requires a specific set of threads to be active — when thread X reads
+; from thread Y via ds_bpermute, thread Y must be active and participating
+; in the same call. After CHR cloning, thread Y may have gone to the cold
+; path while thread X is on the hot path, so the hot-path ds_bpermute reads
+; a stale register value from thread Y instead of the intended value.
+;
+; Similarly, noduplicate calls must not be duplicated by definition.
+;
+; RUN: opt < %s -passes='require<profile-summary>,function(chr)' -S | FileCheck %s
+
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
----------------
arsenm wrote:
```suggestion
```
https://github.com/llvm/llvm-project/pull/180882
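
The failure mode described in the test header can be illustrated with a toy model. This is only a sketch under stated assumptions, not AMDGPU semantics and not code from the patch: `bpermute`, `run_split`, and the `STALE` sentinel are hypothetical names, and a lane that is inactive on a given path is modeled as leaving a stale value in its register, following the wording of the test comment.

```python
# Toy model of a cross-lane read; NOT real AMDGPU semantics.
STALE = -1  # hypothetical sentinel for a register never written on this path


def bpermute(active, data, src):
    """Each active lane publishes data[lane], then reads its source lane's register."""
    regs = [STALE] * len(data)
    for lane in active:
        regs[lane] = data[lane]
    return {lane: regs[src[lane]] for lane in sorted(active)}


def run_split(hot, data, src):
    """Execute the call once per cloned path, as after a divergent merged branch."""
    cold = set(range(len(data))) - set(hot)
    result = bpermute(hot, data, src)
    result.update(bpermute(cold, data, src))
    return result


data = [10, 11, 12, 13]
src = [1, 0, 3, 2]  # lanes swap values pairwise

uniform = bpermute({0, 1, 2, 3}, data, src)  # original region: all lanes converge
split = run_split({0, 1, 2}, data, src)      # cloned region: lane 3 went cold
print(uniform)
print(split)
```

In this model lanes 2 and 3 both read the sentinel once they execute on different paths, while lanes 0 and 1 still exchange values correctly, which is the kind of silent miscompile the convergent guard is meant to prevent.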