<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/54654>54654</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[OpenMP] Possibly incorrect heap-to-stack conversion
</td>
</tr>
<tr>
<th>Labels</th>
<td>
openmp
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
dhruvachak
</td>
</tr>
</table>
<pre>
libomptarget/test/offloading/bug51982.c is the test case.
% ~/git/aomp-trunk/llvm-project/build$ clang -fopenmp -fno-experimental-isel -L ./runtimes/runtimes-bins/openmp/libomptarget -L ./runtimes/runtimes-bins/openmp/runtime/src -fno-openmp-implicit-rpath -Wl,-rpath,./runtimes/runtimes-bins/openmp/libomptarget -Wl,-rpath,./runtimes/runtimes-bins/openmp/runtime/src --libomptarget-amdgcn-bc-path=./runtimes/runtimes-bins/openmp/libomptarget -fopenmp-targets=amdgcn-amd-amdhsa -O1 -Rpass=openmp-opt ../openmp/libomptarget/test/offloading/bug51982.c -o bug51982
../openmp/libomptarget/test/offloading/bug51982.c:12:1: remark: Rewriting generic-mode kernel with a customized state machine. [OMP131] [-Rpass=openmp-opt]
#pragma omp target map(tofrom : aa)
^
../openmp/libomptarget/test/offloading/bug51982.c:7:7: remark: Replaced globalized variable with 4 bytes of shared memory. [OMP111] [-Rpass=openmp-opt]
int ng = 12;
^
../openmp/libomptarget/test/offloading/bug51982.c:14:1: remark: Moving globalized variable to the stack. [OMP110] [-Rpass=openmp-opt]
#pragma omp parallel for
^
% ~/git/aomp-trunk/llvm-project/build$ ./bug51982
Aborted (core dumped)
I started with [D102107](https://reviews.llvm.org/D102107), rebased it on top of trunk, and tested it on amdgpu. The above test case fails at -O1 but passes at -O2. While debugging the -O1 failure, I found what appears to be an incorrect heap-to-stack conversion in OpenMPOpt phase. Here are the steps to repro:
(1) Use -v -save-temps that produces an intermediate bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.bc. (Interestingly, usage of -save-temps adds the -disable-llvm-passes to the clang-15 command line, so I used opt to reproduce the above conversion. I verified that the same issue shows up with the clang-15 invocation if -disable-llvm-passes is removed.)
(2) In my environment, pass #158 is OpenMPOpt, so I ran it with bisect as below to get the .bc files before and after OpenMPOpt. Attached is a .ll file that you can use to generate the .bc file used for opt below. (github won't let me upload .ll file, so I renamed it to .txt)
% opt bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.bc -O1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -o bug51982-openmp-amdgcn-amd-amdhsa-gfx906-gfx906-optimized.bc -opt-bisect-limit=157
% opt bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.bc -O1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -o bug51982-openmp-amdgcn-amd-amdhsa-gfx906-gfx906-optimized.bc -opt-bisect-limit=158
Comparing the pre- and post-IR, I see this difference (among others):
pre:
for.body: ; preds = %for.cond
%2 = addrspacecast %struct.anon addrspace(5)* %.tmp.outlined.agg.arg to i8*, !dbg !25
%3 = call i8* @__kmpc_alloc_shared(i64 16), !dbg !25
%4 = call i8* @__kmpc_alloc_aggregate_arg(i8* %2, i8* %3), !dbg !25
%.fca.0.insert = insertvalue %struct.anon poison, i32* %ng_on_stack, 0, !dbg !25
%.fca.1.insert = insertvalue %struct.anon %.fca.0.insert, i64* %aa, 1, !dbg !25
%5 = bitcast i8* %4 to %struct.anon*, !dbg !25
store %struct.anon %.fca.1.insert, %struct.anon* %5, align 8, !dbg !25
call void @__kmpc_parallel_51(%struct.ident_t* addrspacecast (%struct.ident_t addrspace(1)* @6 to %struct.ident_t*), i32 %1, i32 1, i32 -1, i32 -1, i8* bitcast (void (i32*, i32*, %struct.anon*)* @__omp_outlined__ to i8*), i8* bitcast (void (i16, i32)* @__omp_outlined___wrapper to i8*), i8* %4), !dbg !25
call void @__kmpc_free_shared(i8* %3, i64 16), !dbg !25
%inc = add nsw i32 %gid.0, 1, !dbg !26
br label %for.cond, !dbg !23, !llvm.loop !2
post:
for.body: ; preds = %for.cond
%2 = addrspacecast %struct.anon addrspace(5)* %.tmp.outlined.agg.arg to i8*, !dbg !24
%3 = alloca i8, i64 16, align 1, addrspace(5)
%malloc_cast = addrspacecast i8 addrspace(5)* %3 to i8*
%4 = call i8* @__kmpc_alloc_aggregate_arg(i8* %2, i8* nocapture nofree %malloc_cast) #23, !dbg !24
%.fca.0.insert = insertvalue %struct.anon poison, i32* %ng_on_stack, 0, !dbg !24
%.fca.1.insert = insertvalue %struct.anon %.fca.0.insert, i64* %aa, 1, !dbg !24
%5 = bitcast i8* %4 to %struct.anon*, !dbg !24
store %struct.anon %.fca.1.insert, %struct.anon* %5, align 8, !dbg !24
call void @__kmpc_parallel_51(%struct.ident_t* addrspacecast (%struct.ident_t addrspace(1)* @6 to %struct.ident_t*), i32 %1, i32 1, i32 -1, i32 -1, i8* bitcast (void (i32*, i32*, %struct.anon*)* @__omp_outlined__ to i8*), i8* bitcast (void (i16, i32)* @__omp_outlined___wrapper to i8*), i8* %4) #22, !dbg !24
%inc = add nsw i32 %gid.0, 1, !dbg !25
br label %for.cond, !dbg !22, !llvm.loop !26
--------------
Both are in generic mode
@__omp_offloading_802_1f42f03_main_l12_exec_mode = weak addrspace(1) constant i8 1
In generic mode, this heap-to-stack conversion appears incorrect since the corresponding location is passed to parallel_51.
-------------
I ran opt built from main trunk on the corresponding .bc file (attached is the corresponding LL/txt file) but I got an assertion failure at -O1. Without OpenMPOpt (e.g. at -O0), I don't see any problem.
% opt bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.bc -O1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -o bug51982-openmp-amdgcn-amd-amdhsa-gfx906-gfx906-optimized.bc
opt: /home/dhchakra/git/aomp-trunk/llvm-project/llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3055: virtual llvm::ChangeStatus {anonymous}::AAHeapToSharedFunction::manifest(llvm::Attributor&): Assertion `Alignment && "HeapToShared on allocation without alignment attribute"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: opt bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.bc -O1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -o bug51982-openmp-amdgcn-amd-amdhsa-gfx906-gfx906-optimized.bc
#0 0x000055908692cec4 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x000055908692a664 SignalHandler(int) Signals.cpp:0:0
#2 0x00007fc89dd803c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x143c0)
#3 0x00007fc89d81d03b raise /build/glibc-sMfBJT/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#4 0x00007fc89d7fc859 abort /build/glibc-sMfBJT/glibc-2.31/stdlib/abort.c:81:7
#5 0x00007fc89d7fc729 get_sysdep_segment_value /build/glibc-sMfBJT/glibc-2.31/intl/loadmsgcat.c:509:8
#6 0x00007fc89d7fc729 _nl_load_domain /build/glibc-sMfBJT/glibc-2.31/intl/loadmsgcat.c:970:34
#7 0x00007fc89d80e006 (/lib/x86_64-linux-gnu/libc.so.6+0x34006)
#8 0x00005590861a4ecd (anonymous namespace)::AAHeapToSharedFunction::manifest(llvm::Attributor&) (.part.0) OpenMPOpt.cpp:0:0
#9 0x000055908604b49f llvm::Attributor::manifestAttributes() (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x1a2649f)
#10 0x0000559086056413 llvm::Attributor::run() (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x1a31413)
#11 0x00005590861ad1d1 (anonymous namespace)::OpenMPOpt::runAttributor(bool) (.part.0) OpenMPOpt.cpp:0:0
#12 0x00005590861b1144 (anonymous namespace)::OpenMPOpt::run(bool) OpenMPOpt.cpp:0:0
#13 0x00005590861b2010 llvm::OpenMPOptPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x1b8d010)
#14 0x0000559086c613c6 llvm::detail::PassModel<llvm::Module, llvm::OpenMPOptPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x263c3c6)
#15 0x0000559085f7b73f llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x195673f)
#16 0x0000559084cdff0b llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::ArrayRef<llvm::PassPlugin>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x6baf0b)
#17 0x0000559084c5a2fc main (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x6352fc)
#18 0x00007fc89d7fe0b3 __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:342:3
#19 0x0000559084cd31de _start (/home/dhchakra/git/aomp-trunk/llvm-project/build/bin/opt+0x6ae1de)
Aborted (core dumped)
-------------------------
I also noticed that promotion from generic to spmd is not occurring any more on the IR with the aggregate code. This should be reproducible using opt from main trunk on the attached IR. Note that main trunk clang still generates spmd mode when directly applied to this test case.
[bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.txt](https://github.com/llvm/llvm-project/files/8383312/bug51982-openmp-amdgcn-amd-amdhsa-gfx906-linked.txt)
This is what I get from clang from the rebased patch, at -O1 and -O2, sample command-line at the top of this issue.
bug51982.c:12:1: remark: Rewriting generic-mode kernel with a customized state machine. [OMP131] [-Rpass=openmp-opt]
#pragma omp target map(tofrom : aa)
^
bug51982.c:7:7: remark: Replaced globalized variable with 4 bytes of shared memory. [OMP111] [-Rpass=openmp-opt]
int ng = 12;
^
bug51982.c:14:1: remark: Moving globalized variable to the stack. [OMP110] [-Rpass=openmp-opt]
#pragma omp parallel for
^
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJztWltznDoS_jXjFxUU92Ee_ODc6ng32biS7J5HSoBgtAZESWJs768_3RIMMLZjO8epOjm1U_YMIKkvX7darUa5KO_OG56LttdU1kxvgg-aKfwRVdUIWvKuhpt8qGN_lwZuQbgies8I9iIFVcwlG-_dxrsYv4OYbLbvYUjNkQoFyo6WQ3cNN01zaJ1eiv-yQhuivCk3QUSKhnY1cSrRs67tCVx1wmG3PZO8ZZ2mjcMVawg0fCQuDARyGlrU4tLJeYf3lgTyWij1knFjA1wpWYyi2EaHt33DC64d2VO9J87vzSZ4a2_g4scE-zEaayEdZ0nToW1ZF52TF46hGr77MclGazj2XgGdkTD84P9eUeJ89onzpacKm8f-otfEdR-j-7R_OYJMd9al_gSxTXjhB_gF_0SylsprvPrCbiTX0JnUrAMnK5xWlIxcM9mBm91wMC4lxaC0aPn_WEmUppqRlhZ73qG_x28-f7ryQ38Tv8ObBxCAlmk-hL2kdUsJSE1GbFsKmqRaVFK0BAWidBPsxgHx-9dQezv-r7TuG1qAOnUjctoYzQ5Ucpo3zCodkfwOKBNREbWnEtpb1gp5d1TZf47KhPBOEwAXmgnC_2Z6jp9X0s-P7pv1kzgYmz6gnhYmaIEhi-tZG-_FBuyppE0DTlIJeWKvH4p87kKvMY5e5EJqkB08pBCSkXJoe1bO_mG-L1EV080YDnR453uB721R7iDda92DPiDTB5zw7MDZjXJRCFdIxHLqDVSDtwBhDoG8JFwT0QFWPXrAKPpbQrvSRPtjB4wD_eCSb4AozcVhsRiQivJGEapNaMgHTRBZNj4JXPL7noM5SgZK12gstAp2xXGDZMjvEsAdgOfNHgbRvmdUKjRgDtw68C1ARQKMZM8oACwcY1RSiO7ApOICu5DPYMhPV58hFPV7s0b9xgBKcOnRDVhvSEoGNkGcVitY6gMu5N-gjXMgjqIH5mjW4giUCEaUQ4EqISfNZMtKjvFhsuO0XNwLl05d3e68BKJ1d81KNy9cNPIl0gD8AI3mDvUfFK0ZWmDJmpalXXedkit0ace6lEV3dG-zjjp-DGC0LZoNOBlIlQBUBzQxRudJcVTDjLNGnBF0oTdc8orDCKO0QY22DFZ_NcDlXtwoMvTW-1aseXcQBdXGDtXD0kIGAZMWWJbuiVsDHAFif9mR9o6w7sCl6DAFQCVwNAAW-nGKJI4mPioo0SLaypRDxgA-QhW4TSNuUGeMvCgqAE8q8EJsqnCGIVK0AivMNF1yocGt9ujzAD6BuWPGWDTuxADe3iGiljAsIugBS-oWbqBvIDdCGHNDcNgPObkR3SbYatLgcgCde4xzRzazSqwD1M3EA0auvtX3EIstgxc7n128Wy15DwwfXN3boh-gxY5cLsxPchl_QDJullDDD-4caxeQooUgGb7z4y35myiSLs3yFoI_rD1jgOslc4yX9UJp5_KLjXKKoceAe5W8qiAEdDAbwUFoK2CYgHFSobGn6ARETiIVOJebQwKPa99zP7AaozgQTHB1BtCRCMz8clqj4VFg2iDiSNVDwgBxXeNjBStCoV3a4RIwNYLAsVlFLrCLq9veFYPGuFO6tK5dWNDRc3kKPVDtTeCXeY0_QbxgGRqWBayttivZRF6WXbd9kcEzUWQ2HwFuPImIn4wL1yPUoqeogWSS1TBlM4rLYTr2AtWR6vEu_D4btyqo67mQRTOpDUt7eaDNwE4R6wVXOOeBfBiM9Ls6E11m1i9s8J7i5T-T16lshmsSjVwx2XxL_O8wiw2DnGtj-iMcEVpyzeo7VoXsWT4qmb-Q7B5FI4LJPBpedyR9jIMx8EHwcmHfKUHLYlAwnUnzElaRTCP1U8e-32vl3v7k3hGEjhUAM83RT8Cy2OxP18cL5_TKQDoBDEysFuCIxjdmL3kQn1mgLIM4k00TLssWU233fT44hUYmjxHLbiRmX_JhougP35keD9mmkowtJvJilhn3fHJaQ-o3RSbSqZsJ7pqXrveQSyfT2FyShsISvIp3677heG9y5EZACoxPl9EWY_cxAP_ioTe6F3pNXKSm82yMaQYaYO_xnUm0Nqpaae_Jz9PHZA5n-V45eHegTK9hQwFX6HYnUmKOCZnk0eoP4fKzo_s9Xj8zukevFN2jnx7do_9H979OdDeTJPiOL700JMcvCMnBgyE5WcZkZ_Wxz94ILN-Bc_JuKu8RLO-N445IHEtLWeoFmV9FQeWFWUt5lzV-kLFbVmSmLIj63TB6fc9vcMcM07szIc5flWdOWIMiJtN_tGQxlTnm6obi3bg7N09UDwDhfqI57q6Vra2UaMHFxFjX4x8AaKog4XbZ7rh4o4mpRaL2tvBjCkH3mB83t7hPWWyQ7_f8-BGreLd63M7uTC3oktRCY90E5ZZGi7HuM1aMXPI7bI_BSRcFHGDF3Nq1PbzRSy9JOW6gcRdFuzssy-QNa91fc3Ns5cW6Y4iT78NemOp-uS_29FrSZxUV8dZWUuH7G1hXwcRqscJ_efUZvufiRtH3wCf0IA4DuwOXeqANMeMhwQkv3u5pV7OvmuoBkpXtG4xMd60Y1Gb7zva4uPgNXPmb-GqyuQ9DV6A1bVtLO16Z-m06k7zQgCX4gJCbILHbWnJx9IJN4l3gaoClHmI6JPATLHmYwmNz9P6b0U_ocRgdOQBuEDm2xrOwwmSQvfr4_uLre6KGHLbrhKKtsAYmpKmrrMultkIDEaldgLqG2lTBEFmzq4eZ2gzlOFklVXuSw_TWEiLFyP6rme9YyT1mkB407a6kqCVtIVzVAyqBQvxSHosLhEe8Ww8-cbzz0mQXwNIbkSvJO23U_oY4fAUr0eY3QKthclyp7JpDbJMandIz_0fa_gltmkBqekqMd_pJQsFIaFsV6a4sUy8sPJJlWHiFRCaTY6owzp7bNMmSCMEdbp26G2xDr_eS0dJVAhe3N96tHwGVORUGNuGKTeqXXphDoOUKQ-ZU9v9QA7HCUZ-qN__4drwN3NDHF3pGC7gw70fUnSpZj542dPzWPjgYaUAyLO0jafNCJPbNC5GjJNFKEvyOd1joNZo-TxJdWjTMKMMk9c1rpSOT-JTJFuxQM51ZuTPFanTrbMpjn8UXzIn64-rcqhrmu9XPw5iRzryTh3hnXZPhwKwUZi37Uyx3W3SiMJp5btfm9ZjnJc9wnAJdJjEuE0YwZOUy6crFfRqxwiRxx6BLsAQ85h27Vwq_yMCFpEGjJ-_I6dJwMnl2KxG9KI92FXmQ-EqC6TlGynRk-oNr22TEnHfmraG2848GCYgyl8QhYKyjkRcnkR9-R1Zg99OEC33gvRLOP7F16Zf-U7ae33NMAi-NmeZCNC8yKMoRrOXIfT-KXi7HgvsT_MITfoEHdpqNchx9ZV7DzuTnLp9EOWAiaTY2C3NCrLxTXH2iHa1hMQjf3hsSvp9d_nUNnKcl6LEycLRStEj8sEgW4pZMQ1Zir1FXEJE1Dwm9UnINz6rpChYwJg-stEAw9QPwEIToVwE9SMICMF2BHi9Bj6ttvg2X4ckA_biof3PA_F2cAB4rwJIlYFFRVpWXL8QF8ojZFe-ZfYG80PCrxrdaX1i11vARxb-Z8xyf7MGZseBx2vqR55LKu8uuEpdt39zvBSHm86D7QX8wW8lXbX5EnQsp6R0-XVpq7ovGerK7gbAZarDGaX-wTaYxctrpbcT7J7cFkMd6_ce-j5dTPxt5n_p9fX9KcgresnKn7dqdYhpUBRnTr1dmHsZAe8U8PUkDmZeHkNhj1pWZMzLZSxLBQg1T6m0vbV-kY1LCMMLjZOHMfncylUIfNoKW8etrT5lfsqP2TxwUch77TPUf2CsJ0gnNi-mUBwjQCluWwWrQVMKCHbLqW1Poge5EFMUgzbttrLu0yHosFl1-mQ-EHKv0pIBFDg8LwXAFe_amxJM80wEUnpujEkgN972PVKGOtabLLy75l9DjOYxFR3uEVGneNMfzGMqKbYp4N3vWkZJjaa25w3Jbw23lzNTl5sOs06GuNy_dgePRjAfOXj2jmGDOosBvGqYhpGab-UTYS3ifHAsx3wZz-DMnqS7NARgDsAXLXCK60_Gvnupibyrz9vgW1jacz6YMqyiEZjadK0K-pmRnDgKPx8UsLzWcHAj-exzG_MXOVv7Vj0qa77PyPCx34Y6eaa4bdo6sTKaLrK6EUhAb7p5x3u9skM35j5fw4iiJo7P9eZqHLC8LSJbSMtn5XpD4Scm2cRQXu5QVwZl5X6FQzE0QTAdXAxD2jJ8HXhB4Yej5aZz6O7eIgqBMqiT2tzlL8mQTeQA9b45HMM_kuZEHDKWgseFKz-czzwBhXnfMQIL06aD3Qp6XezkcKC4jZ0b6cyP6HyrFt_E">