[PATCH] D141717: [Clang] Only emit textual LLVM-IR in device only mode

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jan 13 12:39:02 PST 2023


jhuber6 added a comment.

In D141717#4052514 <https://reviews.llvm.org/D141717#4052514>, @tra wrote:

> Textual output for "-S -emit-llvm" is the canonical behavior, so I would prefer it working that way in as many cases as possible and only override it when necessary.
>
> Would it be possible to enforce binary IR generation in cases you need it? Or to prove that this is equivalent to what the patch does now?

Well, you'll still get textual output for the host section, but the device code embedded in the host module will be bitcode instead. So the final output from the compiler is still textual IR; it just won't contain some weird global like this:

  @llvm.embedded.object = private constant [138032 x i8] c"\10\FF\10\AD\01\00\00\000\1B\02\00\00\00\00\00 \00\00\00\00\00\00\00(\00\00\00\00\00\00\00\02\00\01\00\00\00\00\00H\00\00\00\00\00\00\00\02\00\00\00\00\00\00\00\90\00\00\00\00\00\00\00\9D\1A\02\00\00\00\00\00n\00\00\00\00\00\00\00u\00\00\00\00\00\00\00i\00\00\00\00\00\00\00\87\00\00\00\00\00\00\00\00arch\00triple\00amdgcn-amd-amdhsa\00gfx90a\00\00\00; ModuleID = 'tl2.c'\0Asource_filename = \22tl2.c\22\0Atarget datalayout = \22e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7\22\0Atarget triple = \22amdgcn-amd-amdhsa\22\0A\0A%struct.ident_t = type { i32, i32, i32, i32, ptr }\0A%struct.DeviceEnvironmentTy = type { i32, i32, i32, i32 }\0A%\22struct.ompx::state::TeamStateTy\22 = type { %\22struct.ompx::state::ICVStateTy\22, i32, i32, ptr }\0A%\22struct.ompx::state::ICVStateTy\22 = type { i32, i32, i32, i32, i32, i32 }\0A%\22struct.(anonymous namespace)::SharedMemorySmartStackTy\22 = type { [512 x i8], [1024 x i8] }\0A%\22struct.ompx::state::ThreadStateTy\22 = type { %\22struct.ompx::state::ICVStateTy\22, ptr }\0A\0A@__omp_rtl_assume_teams_oversubscription = weak_odr hidden addrspace(1) constant i32 0\0A@__omp_rtl_assume_threads_oversubscription = weak_odr hidden addrspace(1) constant i32 0\0A@0 = private unnamed_addr constant [23 x i8] c\22;unknown;unknown;0;0;;\\00\22, align 1\0A@1 = private unnamed_addr addrspace(1) constant %struct.ident_t { i32 0, i32 2, i32 0, i32 22, ptr @0 }, align 8\0A@__omp_offloading_16_5d6227e_thread_limit_l2_exec_mode = weak protected addrspace(1) constant i8 1\0A@__omp_offloading_16_5d6227e_thread_limit_l5_exec_mode = weak protected addrspace(1) constant i8 1\0A@__omp_offloading_16_5d6227e_thread_limit_l8_exec_mode = weak protected addrspace(1) constant i8 1\0A@llvm.compiler.used = appending addrspace(1) global [3 x ptr] [ptr addrspacecast (ptr addrspace(1) @__omp_offloading_16_5d6227e_thread_limit_l2_exec_mode to ptr), ptr addrspacecast (ptr addrspace(1) @__omp_offloading_16_5d6227e_

This is bad because it can't be handled by LTO or anything else. It makes the resulting IR file difficult to use for its intended purpose.
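For illustration only (this is not part of the patch), here is a minimal Python sketch of how one might check what kind of device image sits inside such a global. It decodes the body of an LLVM IR c"..." string and looks for either the bitcode magic bytes or a textual "; ModuleID" header. The function names are hypothetical, and scanning the whole payload for a marker is a rough heuristic that assumes the marker does not occur by accident elsewhere in the blob.

```python
import re

OFFLOAD_MAGIC = b"\x10\xff\x10\xad"  # first four bytes of the blob shown above
BITCODE_MAGIC = b"BC\xc0\xde"        # raw LLVM bitcode starts with these bytes


def decode_ir_string(literal: str) -> bytes:
    """Decode the body of an LLVM IR c"..." constant.

    Handles the two escapes that appear in the dump above:
    a doubled backslash and a backslash followed by two hex digits.
    """
    def unescape(match: re.Match) -> str:
        token = match.group(1)
        return "\\" if token == "\\" else chr(int(token, 16))

    decoded = re.sub(r"\\(\\|[0-9A-Fa-f]{2})", unescape, literal)
    return decoded.encode("latin-1")


def classify_embedded_image(payload: bytes) -> str:
    """Heuristically report whether the embedded image is bitcode or text."""
    if BITCODE_MAGIC in payload:
        return "bitcode"
    if b"; ModuleID" in payload:
        return "textual IR"
    return "unknown"


if __name__ == "__main__":
    # Shortened, made-up stand-in for the global in the message above.
    sample = r"\10\FF\10\AD\01\00arch\00gfx90a\00; ModuleID = 'tl2.c'\0A"
    blob = decode_ir_string(sample)
    print(blob.startswith(OFFLOAD_MAGIC), classify_embedded_image(blob))
```

Run against the global quoted above, a check like this would report textual IR, which is the case the patch is meant to avoid outside of device-only mode.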


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141717/new/

https://reviews.llvm.org/D141717
