[PATCH] D81905: Enhance Itanium demangler interface.

Hal Finkel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 25 12:29:08 PDT 2020


hfinkel added a comment.

In D81905#2113158 <https://reviews.llvm.org/D81905#2113158>, @kbobrovs wrote:

> The patch by itself is just ItaniumDemangler.h interface enhancements (+ compilation error fix - `objcProto->getProtocol()`) :-)
>  If the change is OK by itself (not taking into account potential future uses), I would appreciate approval.


Please upload the patch with full context.

> Still, I would like to find the best solution for the intrinsic translation (also because we'll come to llvm.org some day with an upstream request :-) ), so let me try to explain exactly what I'm doing and understand the objections in more detail. Or maybe my explanation will address them.
>  This IR pass
> 
> - is not part of the LLVM optimizer; instead it is invoked as a clang code generation "epilog", so LLVM optimizer passes don't see/interpret the original C++ names, they operate on usual IR with "@llvm.xxx" intrinsics
> - is doing basically the same thing done by clang when it lowers built-ins: given the built-in name, it translates it to LLVM IR
> 
>   The main difference from clang is that the built-in name is a C++ mangled name. For example, in our SIMD library we declare the following C++ intrinsic:
> 
>   ```
>   template <sycl::intel::gpu::EsimdAtomicOpType Op, typename Ty, int N,
>             sycl::intel::gpu::CacheHint L1H = sycl::intel::gpu::CacheHint::Default,
>             sycl::intel::gpu::CacheHint L3H = sycl::intel::gpu::CacheHint::Default>
>   SYCL_EXTERNAL sycl::intel::gpu::vector_type_t<Ty, N>
>   __esimd_flat_atomic0(sycl::intel::gpu::vector_type_t<uint64_t, N> addrs,
>                        sycl::intel::gpu::vector_type_t<uint16_t, N> pred);
>   ```
> 
>   and then use `__esimd_flat_atomic0` in the C++ library code to implement higher-level operations. A particular instantiation of `__esimd_flat_atomic0` appears as a C++ mangled name in clang's LLVM IR output after CodeGen. Before the FE finishes and hands the IR off to the optimizer, we run a pass which "lowers" those calls to "@llvm.xxx" intrinsic calls. There isn't actually any real "semantics interpretation", just intrinsic translation, and this is entirely encapsulated in the Intel GPU-specific IR pass. So I think the things @hfinkel expressed concerns about won't happen:
> 
>> That would mean that passes that change function names (outliners, specializers, etc.) need to be aware of special rules not to disturb those encoded semantics
> 
> This IR pass is part of the Front End.
> 
> If I understand the suggestion to use attributes correctly, that would mean the following:
>  Suppose there are two instantiations of `__esimd_flat_atomic0` and clang CodeGen produced a function declaration `@AAA__esimd_flat_atomic0_XXX` for the first one and `@BBB__esimd_flat_atomic0_YYY` - for the second. Then clang FE CodeGen should also emit attributes encoding all the template parameters of each instantiation. Then there would be and IR lowering pass which would be the same as my IR lowering pass except that it would interpret <name, attributes> information to generate proper "@llvm.xxx" intrinsic instead of using demangler for that. So it seems with attribute-based approach there would be more code and changes to two components (FE + LLVM pass) instead of one (LLVM Pass).

I'm inclined to say that we should just enhance Clang to support templated builtins, and then in CGBuiltin, just generate the correct LLVM intrinsic directly. No IR pre-processing pass required.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81905/new/

https://reviews.llvm.org/D81905