[PATCH] D141717: [Clang] Only emit textual LLVM-IR in device only mode
Joseph Huber via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Jan 13 12:39:02 PST 2023
jhuber6 added a comment.
In D141717#4052514 <https://reviews.llvm.org/D141717#4052514>, @tra wrote:
> Textual output for "-S -emit-llvm" is the canonical behavior, so I would prefer it working that way in as many cases as possible and only override it when necessary.
>
> Would it be possible to enforce binary IR generation in cases you need it? Or to prove that this is equivalent to what the patch does now?
Well, you'll get textual output for the host section, but the device code embedded in the host module will be bitcode instead. So the final output from the compiler is still textual IR. It just won't be some weird global like this:
@llvm.embedded.object = private constant [138032 x i8] c"\10\FF\10\AD\01\00\00\000\1B\02\00\00\00\00\00 \00\00\00\00\00\00\00(\00\00\00\00\00\00\00\02\00\01\00\00\00\00\00H\00\00\00\00\00\00\00\02\00\00\00\00\00\00\00\90\00\00\00\00\00\00\00\9D\1A\02\00\00\00\00\00n\00\00\00\00\00\00\00u\00\00\00\00\00\00\00i\00\00\00\00\00\00\00\87\00\00\00\00\00\00\00\00arch\00triple\00amdgcn-amd-amdhsa\00gfx90a\00\00\00; ModuleID = 'tl2.c'\0Asource_filename = \22tl2.c\22\0Atarget datalayout = \22e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7\22\0Atarget triple = \22amdgcn-amd-amdhsa\22\0A\0A%struct.ident_t = type { i32, i32, i32, i32, ptr }\0A%struct.DeviceEnvironmentTy = type { i32, i32, i32, i32 }\0A%\22struct.ompx::state::TeamStateTy\22 = type { %\22struct.ompx::state::ICVStateTy\22, i32, i32, ptr }\0A%\22struct.ompx::state::ICVStateTy\22 = type { i32, i32, i32, i32, i32, i32 }\0A%\22struct.(anonymous namespace)::SharedMemorySmartStackTy\22 = type { [512 x i8], [1024 x i8] }\0A%\22struct.ompx::state::ThreadStateTy\22 = type { %\22struct.ompx::state::ICVStateTy\22, ptr }\0A\0A@__omp_rtl_assume_teams_oversubscription = weak_odr hidden addrspace(1) constant i32 0\0A@__omp_rtl_assume_threads_oversubscription = weak_odr hidden addrspace(1) constant i32 0\0A@0 = private unnamed_addr constant [23 x i8] c\22;unknown;unknown;0;0;;\\00\22, align 1\0A@1 = private unnamed_addr addrspace(1) constant %struct.ident_t { i32 0, i32 2, i32 0, i32 22, ptr @0 }, align 8\0A@__omp_offloading_16_5d6227e_thread_limit_l2_exec_mode = weak protected addrspace(1) constant i8 1\0A@__omp_offloading_16_5d6227e_thread_limit_l5_exec_mode = weak protected addrspace(1) constant i8 1\0A@__omp_offloading_16_5d6227e_thread_limit_l8_exec_mode = weak protected addrspace(1) constant i8 1\0A@llvm.compiler.used = appending addrspace(1) global [3 x ptr] [ptr addrspacecast (ptr addrspace(1) @__omp_offloading_16_5d6227e_thread_limit_l2_exec_mode to ptr), ptr addrspacecast (ptr addrspace(1) @__omp_offloading_16_5d6227e_
This is bad because it can't be handled by LTO or anything else. It makes the resulting IR file difficult to use for its intended purpose.
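As a rough illustration of the difference, here is a small Python sketch that inspects a host `.ll` file and reports whether the payload of `@llvm.embedded.object` looks like textual IR or bitcode. The function names are hypothetical and not part of Clang or this patch. The check is only a heuristic: it looks for the `; ModuleID` comment seen in the dump above, or for the LLVM bitcode magic bytes (`BC` followed by `0xC0 0xDE`), anywhere in the decoded payload.

```python
import re

# LLVM bitcode files start with the magic bytes 'B' 'C' 0xC0 0xDE.
BITCODE_MAGIC = b"BC\xc0\xde"
# Marker comment that textual LLVM IR modules start with (see the dump above).
TEXTUAL_MARKER = b"; ModuleID"


def decode_llvm_cstring(body: str) -> bytes:
    """Decode the body of an LLVM IR c"..." literal.

    A backslash followed by two hex digits is one byte; a doubled
    backslash is a literal backslash; everything else is taken as-is.
    """
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] == "\\":
            if body[i + 1] == "\\":
                out.append(0x5C)
                i += 2
            else:
                out.append(int(body[i + 1:i + 3], 16))
                i += 3
        else:
            out += body[i].encode("utf-8")
            i += 1
    return bytes(out)


def classify_embedded_payload(ll_text: str):
    """Return 'textual IR', 'bitcode', 'unknown', or None if no such global exists.

    Hypothetical helper: a heuristic scan, not a parser for the offload
    binary format.
    """
    match = re.search(r'@llvm\.embedded\.object\s*=.*?c"([^"]*)"', ll_text, re.S)
    if match is None:
        return None
    payload = decode_llvm_cstring(match.group(1))
    if TEXTUAL_MARKER in payload:
        return "textual IR"
    if BITCODE_MAGIC in payload:
        return "bitcode"
    return "unknown"
```

Running `classify_embedded_payload(open("host.ll").read())` on a host module (the file name is an assumption) would flag the case above as textual IR, which is the situation that LTO and other tools cannot consume.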
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141717/new/
https://reviews.llvm.org/D141717