[Openmp-commits] [PATCH] D138689: [openmp] Support building for armv7 Windows with mingw tools

Martin Storsjö via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Nov 25 03:39:34 PST 2022


mstorsjo added a comment.

In D138689#3950527 <https://reviews.llvm.org/D138689#3950527>, @DavidSpickett wrote:

>> (The version in z_Linux_util.cpp only handles up to 15 arguments. And contrary to aarch64, z_Linux_asm.S doesn't have any assembly for 32 bit arm.)
>
> So this means that simply extending it to handle 17 arguments wouldn't help because it would want to use the assembly version of the other functions?

I'm not quite sure I understand the question...

The assembly versions of `__kmp_invoke_microtask` can handle arbitrary numbers of arguments, since they can do the passing of variable numbers of arguments on the stack correctly. The C version of the function, in the MSVC/arm64 case, also handles it for any arbitrary number of arguments by doing a brittle trick with `alloca`, here: https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Windows_NT-586_util.cpp#L170-L182

  default: {
    // p_argv[6] and onwards must be passed on the stack since 8 registers are
    // already used.
    size_t len = (argc - 6) * sizeof(void *);
    void *argbuf = alloca(len);
    memcpy(argbuf, &p_argv[6], len);
  }
    [[fallthrough]];
  case 6:
    (*pkfn)(&gtid, &tid, p_argv[0], p_argv[1], p_argv[2], p_argv[3], p_argv[4],
            p_argv[5]);
    break;
  }

This brittle use of `alloca` and `memcpy` only works on MSVC - Clang optimizes it out altogether, hence D137827 <https://reviews.llvm.org/D137827> where we used the assembly version of `__kmp_invoke_microtask` on Windows/aarch64 with Clang too.

For ARM (on existing handled platforms like Linux), it uses a fallback C version from z_Linux_util.cpp: https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Linux_util.cpp#L2443-L2532

This is hardcoded to handle only up to 15 arguments (and errors out on any higher number). Extending it to handle up to 17, or any number, like in this patch would be trivial. That would fix the https://github.com/llvm/llvm-project/blob/main/openmp/runtime/test/misc_bugs/many-microtask-args.c testcase for arm linux too. But that doesn't fix the root issue that it only handles up to a specific number of arguments. (The comment in https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Linux_util.cpp#L2447-L2448 which claims that only one argument is needed is wrong - Clang currently doesn't bundle up arguments in a struct.)

I thought this patch would be the simplest way of getting this new build configuration to work (plus achieving the "all tests passing" status).

The proper solution would be to write an assembly version of `__kmp_invoke_microtask` for ARM too. I actually already did that and have that available locally, but I thought that might be harder to get reviewed/approved (I'm not sure if the runtime is meant to support older ARM architectures than the armv7 I'm used to writing assembly for, etc), so I went with this patch first. I could also reverse them, first post a patch to switch ARM Linux OpenMP over to an assembly version of the function, then layer this patch on top, to use that assembly function for Windows too, just like for aarch64.



================
Comment at: openmp/runtime/src/kmp_atomic.cpp:866
 
-#endif // KMP_OS_WINDOWS && KMP_ARCH_AARCH64
+#endif // KMP_OS_WINDOWS && (KMP_ARCH_AARCH64 || KMP_ARCH_ARM)
 
----------------
DavidSpickett wrote:
> Is the `!` correct here?
Oh, oops, thanks for catching. Actually, this isn't an `!`, it's an UTF8 non-breaking space. My keyboard layout produces that for alt+space, which is fairly easy to type accidentally after using alt for the `|`.


================
Comment at: openmp/runtime/src/z_Windows_NT_util.cpp:81
 KMP_BUILD_ASSERT(offsetof(SYSTEM_THREAD, KernelTime) == 0);
-#if KMP_ARCH_X86
+#if KMP_ARCH_X86 || KMP_ARCH_ARM
 KMP_BUILD_ASSERT(offsetof(SYSTEM_THREAD, StartAddress) == 28);
----------------
DavidSpickett wrote:
> Does X86 mean 32 bit here?
Yes, exactly. Ideally this should probably be specifically "#if BITNESS == 32` or something like that, maybe `#ifndef _WIN64` or so. But I'm sticking with the existing code style for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138689/new/

https://reviews.llvm.org/D138689



More information about the Openmp-commits mailing list