[llvm-dev] LLVM GPU News #26, January 14 2022

Jakub (Kuba) Kuderski via llvm-dev llvm-dev at lists.llvm.org
Sun Jan 16 14:23:32 PST 2022


Hi folks,

The 26th issue of LLVM GPU News, a bi-weekly newsletter on all the GPU
things under the LLVM umbrella, is out: <
https://llvm-gpu-news.github.io/2022/01/14/issue-26.html>.
I also paste the content below, in case you prefer to read in your email
client.

-Jakub

======================================================================

# LLVM GPU News #26, January 14 2021
Authors: Jakub Kuderski, Alexey Bader, Joseph Huber, Lei Zhang

Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things
under the LLVM umbrella.
This issue covers the period from December 17 to January 13 2022.

We welcome your feedback and suggestions. Let us know if we missed anything
interesting, or want us to bring attention to your (sub)project, revisions
under review, or proposals. Please see the bottom of the page for details
on how to submit suggestions and contribute.


## Industry News and Conferences

*  The first LLVM GPU Working Group Meeting was on January 14. All people
interested in GPU/offloading-related development in LLVM are welcome to
join. See the [document with the meeting scheduling information, agenda,
and notes](
https://docs.google.com/document/d/1m_oSe1HwtWdQ2JUmMRTAVHbUS7Dv4MRsqptiYcgK6iI)
for more details.
   *  We are running polls on the meeting time and frequency.


##  LLVM and Clang

### Discussions

*  Discussion on how to improve math optimizations on GPUs as part of the
first [LLVM GPU working group meeting](
https://docs.google.com/document/d/1m_oSe1HwtWdQ2JUmMRTAVHbUS7Dv4MRsqptiYcgK6iI/edit#heading=h.k8ma1s401a1
).
*  Anastasia Stulova posted an [RFC: 'Add linking of separate translation
units using `spirv-link`'](
https://lists.llvm.org/pipermail/cfe-dev/2022-January/069658.html).
`spirv-link` is an external tool provided by the [SPIRV-Tools project](
https://github.com/KhronosGroup/SPIRV-Tools#linker). This landed as
[D116266](https://reviews.llvm.org/D116266).
*  'huoshanl' asked [why the AMDGPU backend dropped the
`GCNRegBankReassign` pass](
https://llvm.discourse.group/t/gcnregbankreassign-pass-issue/5303). There
are no replies at the time of writing.

### Commits

*  A SPIR-V toolchain was added to Clang. SPIR-V code is generated by the
external SPIRV-LLVM translator tool `llvm-spirv`, as a temporary solution
until a SPIR-V backend lands in LLVM. [D112410](
https://reviews.llvm.org/D112410)
*  The OpenCL documentation was updated with C++ for OpenCL 2021 support in
Clang. [D116271](https://reviews.llvm.org/D116271)
*  CUDA/HIP now allow `__int128` on the host side, even if not supported by
the target device. [D111047](https://reviews.llvm.org/D111047)
*  NVPTX intrinsics and builtins for CUDA PTX `cvt` instructions were added
for sm80 architectures and above. [D116673](https://reviews.llvm.org/D116673
)
*  Enabled AMDGPU divergence predicates for `not`, `min`/`max`, and
`ctlz`/`cttz`. [D115884](https://reviews.llvm.org/D115884), [D115954](
https://reviews.llvm.org/D115954), [D116044](
https://reviews.llvm.org/D116044)


## MLIR

### Discussions

### Commits

*  The `gpu.printf` op is defined to support debugging. [D110448](
https://reviews.llvm.org/D110448)
*  GPU kernel outlining supports the datalayout spec attribute (`dlspec`)
now. [D115722](https://reviews.llvm.org/D115722)
*  A few SPIR-V serialization bugs regarding nested control flows were
fixed. [D115560](https://reviews.llvm.org/D115560), [D115582](
https://reviews.llvm.org/D115582)
*  SPIR-V serialization allows explicit control over debug information
emission. [D115531](https://reviews.llvm.org/D115531)


## OpenMP (Target Offloading)

### Discussions

*  The implementation of the new OpenMP offloading driver was discussed,
[slides](
https://docs.google.com/presentation/d/1QXKSdBWhLaUHyrI-dgd2yHMux3w_q2EF2sROyO0u52k
).

### Commits

*  100% of OvO tests pass on AMDGPU after applying [D116906](
https://reviews.llvm.org/D116906) with optimizations.
*  The new OpenMP offloading driver is up for review at [D116541](
https://reviews.llvm.org/D116541) (click 'Stack' to see the full list of
revisions) and can be pulled from [here](
https://github.com/jhuber6/llvm-project/tree/NewDriver). It currently
supports the following features:
   *  Support for AMDGPU and NVPTX offloading targets.
   *  New offloading object files are compatible with the host linker.
   *  Functional static linking using archive libraries.
   *  Device-side LTO for offloading applications.
   *  Embedding LLVM IR for future JIT functionality.


## External Compilers

### LLPC

*  Added multi-threaded compilation support to the standalone compiler tool
`amdllpc`. The main goal is to exercise threading in the compiler in the
CI, without having to run the full AMDVLK Vulkan driver on a machine with a
GPU. [LLPC#1601](https://github.com/GPUOpen-Drivers/llpc/pull/1601)
*  A new GitHub Actions workflow was added to automatically produce code
coverage reports. Links to coverage reports are posted as Pull Request
comments. [See a sample report.](
https://github.com/GPUOpen-Drivers/llpc/pull/1620#issuecomment-1010621514)
[LLPC#1627](https://github.com/GPUOpen-Drivers/llpc/pull/1627), [LLPC#1629](
https://github.com/GPUOpen-Drivers/llpc/pull/1629)

### oneAPI DPC++

#### CUDA/HIP support

*  Added group collective functions (reduce, scans, broadcast) for HIP.
[#5202](https://github.com/intel/llvm/pull/5202)
*  Added HIP backend support to filter selector extension. [#5171](
https://github.com/intel/llvm/pull/5171), [#5176](
https://github.com/intel/llvm/pull/5176)
*  Improved queue barrier support on HIP backend. [#4975](
https://github.com/intel/llvm/pull/4975)
*  Made a number of small functional fixes improving device information,
support for stream, hierarchical parallelism, etc. [#4951](
https://github.com/intel/llvm/pull/4951), [#5168](
https://github.com/intel/llvm/pull/5168), [#5293](
https://github.com/intel/llvm/pull/5293), [#5115](
https://github.com/intel/llvm/pull/5115)
*  Added `-fcuda-prec-sqrt` flag enabling correctly rounded results of the
`sqrt` function on CUDA backend (equivalent to the nvcc `-prec-sqrt`).
[#5141](https://github.com/intel/llvm/pull/5141)
*  Enabled in-kernel asserts support for the CUDA backend. [#5174](
https://github.com/intel/llvm/pull/5174)
*  Fixed the `nan()` builtin for double types on the CUDA backend. [#5173](
https://github.com/intel/llvm/pull/5173)
*  Fixed out-of-bound behavior for `read_image` in none addressing mode on
the CUDA backend. [#5204](https://github.com/intel/llvm/pull/5204)

#### SYCL 2020 support

*  Added basic support for the `generic_space` address space. [#5148](
https://github.com/intel/llvm/pull/5148)
*  Added property list support to the `stream` class. [#4898](
https://github.com/intel/llvm/pull/4898)

#### Non-standard extensions

*  ESIMD: Added infrastructure to support non-standard C++ types
`sycl::half`, `sycl::bfloat`, etc., and basic operations support for
`sycl::half`. [#5123](https://github.com/intel/llvm/pull/5123)
*  ESIMD: Enabled a number of math and conversion intrinsics for
`sycl::half`. [#5271](https://github.com/intel/llvm/pull/5271)
*  Matrix: Enabled `joint_matrix_fill` ([#4994](
https://github.com/intel/llvm/pull/4994) and [#5277](
https://github.com/intel/llvm/pull/5277)) and `wi_slice` ([#4979](
https://github.com/intel/llvm/pull/4979)) operations for `joint_matrix`.
*  Added new SPIR-V specification extensions documentation to support
"module" debug information ([#3976](https://github.com/intel/llvm/pull/3976))
and composite types as joint matrix elements ([#5228](
https://github.com/intel/llvm/pull/5228)).
*  Added new SYCL specification extensions documentation to support "device
global" objects ([#4686](https://github.com/intel/llvm/pull/4686)), compile
time properties ([#4937](https://github.com/intel/llvm/pull/4937)), and
`std::complex` data type in group collective algorithms ([#5108](
https://github.com/intel/llvm/pull/5108)).
*  Added initial support for format strings in non-constant address space
for the `printf` function. [#5069](https://github.com/intel/llvm/pull/5069)
*  Moved group sort extension to experimental namespace. [#5169](
https://github.com/intel/llvm/pull/5169)

#### Misc

*  Enhanced SYCL accessor ([#5249](https://github.com/intel/llvm/pull/5249))
and buffer ([#5161](https://github.com/intel/llvm/pull/5161))
instrumentation with XPTI and reduced XPTI instrumentation overhead [#5158](
https://github.com/intel/llvm/pull/5158).
*  Turned on the `-fsycl-dead-args-optimization` flag by default. [#3004](
https://github.com/intel/llvm/pull/3004)
*  Enabled SPIR-V device image format for fat objects in the driver.
[#4608](https://github.com/intel/llvm/pull/4608), [#5251](
https://github.com/intel/llvm/pull/5251), [#4683](
https://github.com/intel/llvm/pull/4683)
*  Made a number of performance improvements in the runtime library and
Level Zero plug-in like caching command lists ([#5197](
https://github.com/intel/llvm/pull/5197)), batching of copy commands
([#5155](https://github.com/intel/llvm/pull/5155)).
*  Added static linking of device code to the Level Zero backend. [#5266](
https://github.com/intel/llvm/pull/5266), [#5267](
https://github.com/intel/llvm/pull/5267)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20220116/1ab9a7c9/attachment.html>


More information about the llvm-dev mailing list