[llvm-dev] LLVM GPU News Issue #3, January 8 2021
Jakub (Kuba) Kuderski via llvm-dev
llvm-dev at lists.llvm.org
Fri Jan 8 09:00:35 PST 2021
The third issue of LLVM GPU News, a bi-weekly newsletter on all the GPU
things under the LLVM umbrella, is now available at:
I'm also pasting the content below, in case you prefer to read in your
# LLVM GPU News Issue #3, January 8 2021
Happy New Year! Welcome to the third issue of LLVM GPU News, a bi-weekly
newsletter on all the GPU things under the LLVM umbrella. This issue
covers the period from December 25, 2020 to January 7, 2021.
We welcome your feedback and suggestions. Let us know if we missed anything
interesting, or if you want us to bring attention to your (sub)project, revisions
under review, or proposals. Please see the bottom of the page for details
on how to submit suggestions and contribute.
## Industry News and Conference Talks
* Alyssa Rosenzweig started a blog post series on [dissecting the Apple
M1 GPU](https://rosenzweig.io/blog/asahi-gpu-part-1.html), which
doesn't have any public documentation or open source drivers as of
writing. The goal is to understand the new architecture and accelerate
the development of an open source driver stack. Alyssa already
committed an [early work-in-progress disassembler
and described the methodology and workflow used to develop it in the
## LLVM and Clang
* Madhur Amilkanthwar asked about [using the Attributor framework to
propagate the `amdgpu-flat-work-group-size` attribute in the AMDGPU
* (In-review) [Remove a custom `amdgpu-inline` pass and replace it with
new Target Transform Info hooks.](https://reviews.llvm.org/D94153)
As explained, this is because the custom inliner doesn't fit well into
the New Pass Manager's pipeline and has few differences compared to the
main LLVM inlining pass.
* [Clang won't add debugging information to the NVPTX target if optimization
remarks are enabled.](https://reviews.llvm.org/D94123) This is because
`ptxas` supports either debug builds with no optimizations or
optimized builds without debug info.
* [Always print error messages in the `libomptarget` CUDA
plugin](https://reviews.llvm.org/D94263), even with debugging
* Make the [AMDGPU OpenMP target call into `deviceRTL` instead of
This allows simple OpenMP code to run without ROCm device libraries
* Lenny Guo asked for [help with generating SPIR-V binaries from the
SPIR-V MLIR dialect kernels](
in order to run them with OpenCL runtime. There are no replies as of
## External Compilers
Please submit pointers to your mailing lists, forums, or newsletters if you
want your LLVM- or MLIR-based GPU compiler project to be covered in future
LLVM GPU News issues.
* The graphics API-agnostic LGC peephole optimizer learned to
[fold `inttoptr ( add x, const )` into
`gep ( inttoptr x, const )`](
This improves value tracking and facilitates load/store vectorization.
LLVM's instruction combining pass cannot generally perform the same
optimization, because on some systems `const` itself may be a valid
* [Always split typed vertex buffer loads on AMD GFX6 and GFX10+](
in RADV/LLVM. This fixes hangs in [Zink](
(an OpenGL over Vulkan implementation) tests.
* [Intel's oneAPI DPC++ Compiler 2020-12 got released.](
The release notes contain a long list of SYCL compiler and library
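
As a rough illustration of the LGC fold mentioned above (`inttoptr ( add x, const )` into `gep ( inttoptr x, const )`), the sketch below models the two forms with Python's `ctypes`. It is not LGC or LLVM code: the buffer, the helper names, and the byte-sized element type are assumptions made only for this example. It shows that both forms reach the same address, while the second keeps the base pointer visible as a separate value, which is the property the item credits for better value tracking.

```python
import ctypes

# Hypothetical stand-in for a memory region: 16 bytes holding the values 0..15.
buf = (ctypes.c_ubyte * 16)(*range(16))
BytePtr = ctypes.POINTER(ctypes.c_ubyte)


def load_add_then_inttoptr(x, const):
    """Models `inttoptr (add x, const)`: add in the integer domain, then
    convert the sum to a pointer and load through it."""
    return ctypes.cast(x + const, BytePtr)[0]


def load_inttoptr_then_gep(x, const):
    """Models `gep (inttoptr x, const)`: convert the base integer to a pointer
    first, then apply the constant as an offset from that base pointer."""
    base = ctypes.cast(x, BytePtr)
    return base[const]


# Integer holding the base address of the buffer (the `x` in the fold).
x = ctypes.addressof(buf)
```

For any offset inside the buffer, both helpers load the same byte. The sketch deliberately starts from an `x` that is known to be a valid base address; it does not model the case the item warns about, where the constant operand may itself be the address.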