[clang] [llvm] [clang][OpenMP] Improve loop structure for distributed loops (PR #201670)

Robert Imschweiler via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 8 11:57:27 PDT 2026


ro-i wrote:

I realized that I had a testing issue. I carried the result verification over from my reduction tests into every test run, because I always wanted more guards against race conditions and the like. In my new non-reduction test cases, however, that verification hurts testing speed because the checks are O(n). As a result, I had previously tested only small N (4,096 or 65,535) instead of my usual default of 177,777,777.
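
For illustration only, here is a minimal sketch of what a non-reduction kernel with an O(n) host-side check might look like. The names `scale_add` and `verify`, the kernel body, and the tolerance are assumptions made for this example, not the actual benchmark code from the PR. Without `-fopenmp` and offloading flags, the pragma is ignored and the loop runs serially on the host.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical elementwise (non-reduction) kernel: each iteration writes
// its own output element, so no cross-iteration combination is needed.
void scale_add(const double *a, const double *b, double *out,
               std::size_t n, int teams) {
  assert(teams > 0 && "team count must be positive");
#pragma omp target teams distribute parallel for num_teams(teams) \
    map(to : a[0:n], b[0:n]) map(from : out[0:n])
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 2.0 * a[i] + b[i];
}

// O(n) verification: recompute every element on the host and compare.
// The cost grows linearly with n, which is what makes large-N runs slow.
bool verify(const double *a, const double *b, const double *out,
            std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    const double expected = 2.0 * a[i] + b[i];
    if (std::fabs(out[i] - expected) > 1e-9) // assumed tolerance
      return false;
  }
  return true;
}
```

A caller would fill `a` and `b`, run `scale_add` with the desired team count (for example 208 or 10400), and then call `verify` on the result. That last step is the one that dominates test time at large N.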

With N = 177,777,777, we get the following perf change for **non-reduction** workloads:
```
misc_stencil           double   change for 208 teams:   +47.45%   change for 10400 teams:   -30.69%
misc_elem_func         double   change for 208 teams:   +42.02%   change for 10400 teams:   +63.05%
misc_elem_loop         double   change for 208 teams:   +32.79%   change for 10400 teams:   -19.86%
misc_linalg            double   change for 208 teams:   +31.96%   change for 10400 teams:   -20.79%
misc_particle          double   change for 208 teams:   +13.89%   change for 10400 teams:    +3.09%
misc_stencil           uint     change for 208 teams:   +36.12%   change for 10400 teams:    -0.79%
misc_elem_func         uint     change for 208 teams:  +117.16%   change for 10400 teams:   +16.39%
misc_elem_loop         uint     change for 208 teams:   +37.26%   change for 10400 teams:   +26.12%
misc_linalg            uint     change for 208 teams:   +36.46%   change for 10400 teams:   +23.19%
misc_particle          uint     change for 208 teams:   +10.88%   change for 10400 teams:    -0.22%
misc_stencil           ulong    change for 208 teams:   +45.55%   change for 10400 teams:   -31.35%
misc_elem_func         ulong    change for 208 teams:   +39.18%   change for 10400 teams:   +66.38%
misc_elem_loop         ulong    change for 208 teams:   +38.42%   change for 10400 teams:   -23.92%
misc_linalg            ulong    change for 208 teams:   +37.38%   change for 10400 teams:   -24.18%
misc_particle          ulong    change for 208 teams:   +13.16%   change for 10400 teams:    +1.76%
misc_stencil           Value    change for 208 teams:    -3.31%   change for 10400 teams:    -0.76%
misc_elem_func         Value    change for 208 teams:    +0.55%   change for 10400 teams:    +1.42%
misc_elem_loop         Value    change for 208 teams:    -1.26%   change for 10400 teams:    -2.87%
misc_linalg            Value    change for 208 teams:    -0.73%   change for 10400 teams:   -15.36%
misc_particle          Value    change for 208 teams:    -1.15%   change for 10400 teams:    -0.12%
``` 

There is probably potential here, but for now I'll change this PR to handle only the reduction cases. The non-reduction cases would need more analysis to get the most out of them, and I need to focus on cross-team reduction for the moment.

https://github.com/llvm/llvm-project/pull/201670

