[Openmp-commits] [PATCH] D137828: RFC: [openmp] Don't assume a specific layout for alloca() in Windows arm64 __kmp_invoke_microtask

Martin Storsjö via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Nov 11 09:17:20 PST 2022


mstorsjo added inline comments.


================
Comment at: openmp/runtime/src/z_Windows_NT-586_util.cpp:152
   switch (argc) {
+  default:
+    fprintf(stderr, "Too many args to microtask: %d!\n", argc);
----------------
pulidocr wrote:
> Is there a fallback path for ARM64 MSVC? If there isn't, then our libomp140.aarch64.dll might break. @natgla FYI
No, there's no fallback path for that. So yes, this would break your libomp builds in cases that pass more than 15 arguments here.

But the main point is that the `alloca` + `memcpy` construct is extremely brittle here - I wouldn't even be sure that an update of MSVC couldn't break it (since, from the compiler's point of view, the `memcpy` is a dead write to a buffer which is not referenced anywhere).

As far as I know, the only properly robust solution for an arbitrary number of arguments is to write this function in assembly (like I do with GNU assembly in D137827). I guess it's possible to port the existing aarch64 assembly function to MS armasm64 syntax too (but I don't know if CMake knows how to interact with that tool).
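For illustration, the fixed-arity dispatch that the patch switches to can be sketched roughly as below. This is a simplified sketch, not the code from the patch: `generic_fn`, `invoke_fixed` and `add_into` are hypothetical names, and only three arities are shown instead of the full range handled by the real switch.

```cpp
#include <cstdio>

// Hypothetical opaque function pointer type; libomp's real microtask
// type is declared differently.
using generic_fn = void (*)();

// Calls fn(&gtid, &tid, argv[0..argc-1]) by casting to a fixed-arity
// signature. Returns 1 on success, 0 if argc is not covered.
int invoke_fixed(generic_fn fn, int gtid, int tid, int argc, void **argv) {
  switch (argc) {
  default:
    // No fallback: anything outside the enumerated cases is rejected.
    std::fprintf(stderr, "Too many args to microtask: %d!\n", argc);
    return 0;
  case 0:
    reinterpret_cast<void (*)(int *, int *)>(fn)(&gtid, &tid);
    break;
  case 1:
    reinterpret_cast<void (*)(int *, int *, void *)>(fn)(&gtid, &tid,
                                                        argv[0]);
    break;
  case 2:
    reinterpret_cast<void (*)(int *, int *, void *, void *)>(fn)(
        &gtid, &tid, argv[0], argv[1]);
    break;
  }
  return 1;
}

// Sample two-argument callee used only to exercise the sketch:
// adds *b into *a.
void add_into(int *, int *, void *a, void *b) {
  *static_cast<int *>(a) += *static_cast<int *>(b);
}
```

Calling `invoke_fixed(reinterpret_cast<generic_fn>(&add_into), 0, 0, 2, args)` dispatches through the `case 2` branch; any `argc` without a matching case only prints the error, which is the limitation discussed above.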



================
Comment at: openmp/runtime/src/z_Windows_NT-586_util.cpp:174
-    size_t len = (argc - 6) * sizeof(void *);
-    void *argbuf = alloca(len);
-    memcpy(argbuf, &p_argv[6], len);
----------------
natgla wrote:
> argbuf better be volatile void*, to prevent optimizer messing with it
Yes, possibly. With Clang I managed to make it not optimize it out, by adding e.g. `__asm__ __volatile__("" ::"r"(argbuf))`, which passes the buffer pointer to an opaque inline assembly snippet (so the compiler can't assume anything about it) which does nothing.

But even then, there's no strict guarantee that the buffer actually is at the exact bottom of the stack frame - there could conceivably be some extra space at the bottom of the frame (e.g. if the compiler is reserving space for outgoing parameters, if there is a function call with more than 8 parameters somewhere).
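For reference, the opaque inline assembly trick mentioned above can be sketched as follows. This is a minimal sketch under a few assumptions: `keep_alive` and `copy_args` are hypothetical names, the `"memory"` clobber is an addition that is not in the snippet quoted above, and the GNU-style syntax is only accepted by GCC and Clang (the `#else` branch is a weaker stand-in for other compilers). It only keeps the copy from being treated as dead; it does nothing about where the buffer ends up in the stack frame.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: makes the pointer visible to an empty inline asm
// statement so the optimizer cannot prove the buffer is never read.
static inline void keep_alive(void *p) {
#if defined(__GNUC__) || defined(__clang__)
  // Empty asm taking p as an input; the "memory" clobber (an assumption
  // added here) tells the compiler the asm may read memory.
  __asm__ __volatile__("" : : "r"(p) : "memory");
#else
  // Weaker fallback for other compilers: let the pointer escape through
  // a volatile store.
  static void *volatile sink;
  sink = p;
#endif
}

// Copies n pointers from src to dst and keeps the copy alive.
// Returns the number of pointers copied.
std::size_t copy_args(void **dst, void *const *src, std::size_t n) {
  std::memcpy(dst, src, n * sizeof(void *));
  keep_alive(dst);
  return n;
}
```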


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137828/new/

https://reviews.llvm.org/D137828


