[llvm-dev] LLVM GPU News #19, September 10 2021

Sat Sep 11 14:35:55 PDT 2021

Hi folks,

The 19th issue of LLVM GPU News, a bi-weekly newsletter on all the GPU
things under the LLVM umbrella, is out: <
https://llvm-gpu-news.github.io/2021/09/10/issue-19.html>.

I also pasted the content below, in case you prefer to read in your email
client.

-Jakub

======================================================================

# LLVM GPU News #19, Sep 10 2021
Authors: Jakub Kuderski, Lei Zhang, Johannes Doerfert, Giorgis
Georgakoudis, Joseph Huber

Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things
under the LLVM umbrella.
This issue covers the period from August 20 to September 9 2021.

We welcome your feedback and suggestions. Let us know if we missed anything
interesting, or want us to bring attention to your (sub)project, revisions
under review, or proposals. Please see the bottom of the page for details
on how to submit suggestions and contribute.

## Industry News and Conferences
*  Two new paper on using MLIR for GPU compilation:
   -  ["Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth
Compiler for GPU-accelerated Climate Simulation"](
https://dl.acm.org/doi/10.1145/3469030)
   -  ["High Performance GPU Code Generation for Matrix-Matrix
Multiplication using MLIR: Some Early Results"](
https://arxiv.org/abs/2108.13191)

##  LLVM and Clang

### Discussions

*  Alexander Yermolovich hit an issue with [relocation overflows from
`.text` section into `.nv_fatbin`](
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152533.html) in
the LLD linker. The following discussion revolves around section reordering
and the CUDA ELF binary format.

### Commits

*  Clang now reports CUDA 11.4 as fully supported. Not all features offered
by NVCC are actually supported, but Clang is expected
to handle CUDA headers and produce binaries for all GPUs supported by NVCC.
The default GPU architecture is now `sm_35`. [D108239](
https://reviews.llvm.org/D108239), [D108248](
https://reviews.llvm.org/D108248), [D108235](
https://reviews.llvm.org/D108235)
*  Various AMDGPU MIR peephole optimizations for comparison instructions.
*  Various AMDGPU attribute handling and propagation improvements.

## MLIR

### Discussions

*  Xuanhuo asked about [the meaning of `gpu.all_reduce`](
https://llvm.discourse.group/t/whats-the-meaning-of-gpu-all-reduce/4158).
Alex Zinenko explained how this relates to collective operations thread
index linearization.

### Commits

*  A GPU memset op is [introduced](https://reviews.llvm.org/D107548) for
CUDA and ROCm.
*  Weiwei Li started to improve how image operands are represented in the
SPIR-V dialect for graphics use cases.

## OpenMP (Target Offloading)

### Discussions

*  Problems that result from mixing non-debug and debug code have been
discussed as part of [PR51737](https://bugs.llvm.org/show_bug.cgi?id=51737).
*  The outdated [openmp.llvm.org](https://openmp.llvm.org) webpage has been
replaced by the [new "documentation" page](https://openmp.llvm.org/docs).

### Commits

*  The SPMDzation optimization (introduced in [D102307](
https://reviews.llvm.org/D102307)) has been extended with guarding to
enlarge the scope of possible kernels amenable to the optimization, see
[D106892](https://reviews.llvm.org/D106892). Additionally, guarding has
been implemented more effectively to batch multiple side-effect
instructions in a single guarded region when they share the same block
being only separated by non-side-effect instructions, see [D109070](
https://reviews.llvm.org/D109070). Further, generic regions without any
parallelism are no longer transformed by SPMDzation to avoid unnecessary
guarding, see [D109438](https://reviews.llvm.org/D109438).
*  [OpenMP `assumes`](https://reviews.llvm.org/D105937) can now be used to
provide information for the OpenMP-Opt pass, what information is required
to perform an optimization is communicated via [optimization remarks](
https://openmp.llvm.org/docs/remarks/OptimizationRemarks.html).
*  OpenMP declare variant now works with [functions that use reference
types](https://reviews.llvm.org/D108774), this fixes a problem [reported
for certain C++ math functions](
https://lists.llvm.org/pipermail/openmp-dev/2021-August/004094.html).
*  Initial support for AMDGPU gfx10 offloading. [D108708](
https://reviews.llvm.org/D108708)

## External Compilers

### LLPC

*  [Geometry shaders](https://www.khronos.org/opengl/wiki/Geometry_Shader)
can now be compiled using [relocatable shaders](
https://github.com/GPUOpen-Drivers/llpc/blob/dev/docs/DdnRelocatableShaderElf.md).
[LLPC#1391](https://github.com/GPUOpen-Drivers/llpc/pull/1391)

### Mesa

-- 
Jakub Kuderski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210911/78bad47e/attachment.html>