<div dir="ltr"><div>Hi there,</div><div><br></div><div>I would like to propose adding two static analyses to the LLVM framework that can help detect performance issues in GPU programs. The first analysis directly detects memory congestion across GPU threads (uncoalesced accesses). The second checks block-size independence of synchronization-free programs, which allows the block size to be tuned for performance without affecting correctness. Both analyses were developed as part of my PhD thesis and are available on GitHub. Please see the link below for more details:</div><div><br></div><div><a href="https://github.com/upenn-acg/gpudrano-static-analysis_v1.0">https://github.com/upenn-acg/gpudrano-static-analysis_v1.0</a></div><div><br></div><div>We would like to upstream these analyses to LLVM. They allow lightweight compile-time detection of performance and correctness issues in GPU programs that concern <i>inter-thread</i> behavior, that is, behavior that arises from the interaction between threads and is not local to any individual thread. Because the analyses are lightweight, they run efficiently at compile time. Such analysis is difficult to perform on a generic multi-threaded program; however, the regularity of GPU parallelism makes it feasible at compile time.</div><div><br></div><div>These analyses can be the basis for optimizations that improve the performance of GPU programs multifold. Given the complexity of GPU programming and the lack of tool support in this space, the analyses are a first step towards robust tools for the analysis and optimization of GPU programs. Two publications describe this work; they are listed in the references below.
I would be happy to answer any questions or concerns regarding this work.</div><div><br></div><div>Regards,</div><div>Nimit</div><div><br></div><div>References:<br>1. FMSD 2021: Static Analysis for Detecting Uncoalesced Accesses in GPU Programs, Rajeev Alur, Joseph Devietti, Omar Navarro Leija, and Nimit Singhania.</div><div>2. SAS 2018: Block-Size Independence of GPU Programs, Rajeev Alur, Joseph Devietti, and Nimit Singhania.
</div></div>