<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/60453>60453</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Support for `ptrtoint (ptr @pointer to i64)` on 32-bit backends
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
dmlloyd
</td>
</tr>
</table>
<pre>
This issue requests support for lowering constant expressions that use `ptrtoint` to convert pointer values into 64-bit integer types on 32-bit targets.
Java (specifically the JDK), not having a pointer type in the language, internally stores pointer values in 64-bit integer fields. This is because the same classes are expected to run on any CPU or OS combination, including both 32-bit and 64-bit CPUs, without recompilation. This usage of 64-bit integers to hold pointer values leaks into APIs which are consumed by libraries and user code.
The project I'm working on is a native executable precompiler for Java.
Among the targets we support, WASM32 features prominently, but armv7 support is also going to be important, and possibly i386.
Now the problem. When we lower our initial heap data as a structure, objects which contain pointers to functions or other data fail to lower, because those pointers are stored in 64-bit integer fields. This works fine on 64-bit targets, since `ptrtoint` to a pointer-sized integer is supported there, but on 32-bit targets we get a crash which looks something like this (example is from the `wasm32` CPU arch):
```
LLVM ERROR: Unsupported expression in static initializer: zext (i32 ptrtoint (ptr @foo to i32) to i64)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /opt/compiler-explorer/clang-trunk/bin/llc -o /app/output.s -x86-asm-syntax=intel --mtriple=wasm32-wasi-unknown <source>
#0 0x0000561e544b714f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-trunk/bin/llc+0x335314f)
#1 0x0000561e544b4bc4 SignalHandler(int) Signals.cpp:0:0
#2 0x00007f6e1fc81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
#3 0x00007f6e1f74e00b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
#4 0x00007f6e1f72d859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
#5 0x0000561e51a85272 llvm::UniqueStringSaver::save(llvm::StringRef) (.cold) StringSaver.cpp:0:0
#6 0x0000561e534dd465 llvm::AsmPrinter::lowerConstant(llvm::Constant const*) (/opt/compiler-explorer/clang-trunk/bin/llc+0x2379465)
#7 0x0000561e534e2587 emitGlobalConstantImpl(llvm::DataLayout const&, llvm::Constant const*, llvm::AsmPrinter&, llvm::Constant const*, unsigned long, llvm::DenseMap<unsigned long, llvm::SmallVector<llvm::GlobalAlias const*, 1u>, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::SmallVector<llvm::GlobalAlias const*, 1u>>>*) AsmPrinter.cpp:0:0
#8 0x0000561e534e2d86 llvm::AsmPrinter::emitGlobalConstant(llvm::DataLayout const&, llvm::Constant const*, llvm::DenseMap<unsigned long, llvm::SmallVector<llvm::GlobalAlias const*, 1u>, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::SmallVector<llvm::GlobalAlias const*, 1u>>>*) (.constprop.0) AsmPrinter.cpp:0:0
#9 0x0000561e534e3a0b llvm::AsmPrinter::emitGlobalVariable(llvm::GlobalVariable const*) (/opt/compiler-explorer/clang-trunk/bin/llc+0x237fa0b)
#10 0x0000561e52bafb8b llvm::WebAssemblyAsmPrinter::emitGlobalVariable(llvm::GlobalVariable const*) (/opt/compiler-explorer/clang-trunk/bin/llc+0x1a4bb8b)
#11 0x0000561e534dfdf9 llvm::AsmPrinter::doFinalization(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x237bdf9)
#12 0x0000561e53c236c5 llvm::FPPassManager::doFinalization(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x2abf6c5)
#13 0x0000561e53c2feb8 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x2acbeb8)
#14 0x0000561e51b562a3 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
#15 0x0000561e51a91a2a main (/opt/compiler-explorer/clang-trunk/bin/llc+0x92da2a)
#16 0x00007f6e1f72f083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#17 0x0000561e51b4defe _start (/opt/compiler-explorer/clang-trunk/bin/llc+0x9e9efe)
Compiler returned: 139
```
In this example I tried an explicit `zext` on the final integer, but the error is exactly the same if you simply do `ptrtoint (ptr @foo to i64)` on any 32-bit target.
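A minimal reproducer sketch for the direct form, assuming a global named `@foo` as in the error message above (`@slot` is a hypothetical name chosen for illustration). Going by the behavior described here, running `llc --mtriple=wasm32-wasi-unknown` on this input should hit the same error:

```llvm
; @foo stands in for any function or data symbol.
@foo = global i32 0

; A 64-bit integer field holding the address of @foo.
; This is the pattern that fails to lower on 32-bit targets.
@slot = global i64 ptrtoint (ptr @foo to i64)
```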
I understand that support for *arbitrary* constant expressions is an anti-feature, but in this case I'm trying to encode an unmodified pointer (not even an offset) in a way that the backends should be able to support (verified by testing with assembly on a few 32-bit targets, which were able to compile a pointer in a 64-bit word without a problem). Since certain constant expressions involving symbol pointers *are* allowed (such as adding an offset), I think it would be very reasonable to allow this case as well.
An alternative was suggested: use a structure comprising a pair of `i32` for `i64` values (or perhaps only for `i64` values which are known to contain a pointer). However, this would require a pretty major refactor of our backend to support two different ways of mapping types to LLVM, plus endianness support to get the word order correct, among many other auxiliary complications (n.b. I've tried twice now to implement this). I'm hoping that can be avoided in favor of a simpler fix on the LLVM side of things.
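For reference, a sketch of what that alternative might look like on a little-endian 32-bit target such as wasm32. The names `@foo` and `@slot` are hypothetical, the `align 8` is an assumption meant to mirror the natural alignment of an `i64` field, and a big-endian target would need the two words swapped:

```llvm
@foo = global i32 0

; Little-endian layout: the low word holds the address, the high word is zero.
@slot = global { i32, i32 } { i32 ptrtoint (ptr @foo to i32), i32 0 }, align 8
```

Every consumer of the field then has to know which of the two representations it is looking at, which is where the type-mapping complications come from.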
See also: [a discussion thread I opened the other day](https://discourse.llvm.org/t/pointer-typed-globals-in-larger-than-pointer-integer-containers-fails/68072).
</pre>