[Openmp-commits] [openmp] Add openmp support to System z (PR #66081)
Neale Ferguson via Openmp-commits
openmp-commits at lists.llvm.org
Wed Nov 1 05:02:46 PDT 2023
https://github.com/nealef updated https://github.com/llvm/llvm-project/pull/66081
>From 88b6acc4695537f7126f643bb2feda3495bba6f6 Mon Sep 17 00:00:00 2001
From: Neale Ferguson <neale at sinenomine.net>
Date: Tue, 12 Sep 2023 08:37:07 -0400
Subject: [PATCH] Add openmp support to System z
* openmp/README.rst
- Add s390x to those platforms supported
* openmp/libomptarget/plugins-nextgen/CMakeLists.txt
- Add s390x subdirectory
* openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt
- Add s390x definitions
* openmp/runtime/CMakeLists.txt
- Add s390x to those platforms supported
* openmp/runtime/cmake/LibompGetArchitecture.cmake
- Define s390x ARCHITECTURE
* openmp/runtime/cmake/LibompMicroTests.cmake
- Add dependencies for System z (aka s390x)
* openmp/runtime/cmake/LibompUtils.cmake
- Return S390X from libomp_get_legal_arch() for s390x
* openmp/runtime/cmake/config-ix.cmake
- Add s390x as a supported LIBOMP_ARCH
* openmp/runtime/src/kmp_affinity.h
- Define __NR_sched_[get|set]affinity for s390x
* openmp/runtime/src/kmp_config.h.cmake
- Define CACHE_LINE for s390x
* openmp/runtime/src/kmp_os.h
- Add KMP_ARCH_S390X to support checks
* openmp/runtime/src/kmp_platform.h
- Define KMP_ARCH_S390X
* openmp/runtime/src/kmp_runtime.cpp
- Generate code when KMP_ARCH_S390X is defined
* openmp/runtime/src/kmp_tasking.cpp
- Generate code when KMP_ARCH_S390X is defined
* openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h
- Define ITT_ARCH_S390X
* openmp/runtime/src/z_Linux_asm.S
- Instantiate __kmp_invoke_microtask for s390x
* openmp/runtime/src/z_Linux_util.cpp
- Generate code when KMP_ARCH_S390X is defined
* openmp/runtime/test/ompt/callback.h
- Define print_possible_return_addresses for s390x
* openmp/runtime/tools/lib/Platform.pm
- Return s390x as platform and host architecture
* openmp/runtime/tools/lib/Uname.pm
- Set hardware platform value for s390x
* openmp/runtime/src/kmp_affinity.cpp
- Implement s390x /proc/cpuinfo parsing
* openmp/runtime/src/kmp_tasking.cpp
- Add backchain attribute to __kmp_invoke_task
- Style fix
* openmp/runtime/src/z_Linux_asm.S
- Add unwind information
* openmp/runtime/test/lit.cfg
- Build openmp tests with -mbackchain
* openmp/runtime/test/ompt/callback.h
- Additional possibility for print_possible_return_addresses()
---
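Note on the kmp_affinity.cpp hunk: the s390x branch ORs book_id and
drawer_id into the package id at bit offsets 8 and 16, apparently so that
equal physical_package_id values in different books or drawers stay
distinct. A minimal C sketch of that packing follows. The helper name
pack_pkg_id and the assumption that each id fits in 8 bits are illustrative
and are not stated in the patch.

```c
#include <assert.h>

/* Hypothetical helper mirroring the two "|=" statements in the patch.
 * Assumption (not from the patch): pkg_id and book_id each fit in 8 bits,
 * so the three fields occupy disjoint bit ranges. */
static unsigned pack_pkg_id(unsigned pkg_id, unsigned book_id,
                            unsigned drawer_id) {
  assert(pkg_id < 256u && book_id < 256u);
  return pkg_id | (book_id << 8) | (drawer_id << 16);
}
```

Under that assumption, two CPUs that report the same physical_package_id but
sit in different books or drawers end up with different packed ids.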
openmp/README.rst | 4 +-
.../plugins-nextgen/CMakeLists.txt | 1 +
.../plugins-nextgen/s390x/CMakeLists.txt | 17 ++
openmp/runtime/CMakeLists.txt | 9 +-
.../runtime/cmake/LibompGetArchitecture.cmake | 2 +
openmp/runtime/cmake/LibompMicroTests.cmake | 3 +
openmp/runtime/cmake/LibompUtils.cmake | 2 +
openmp/runtime/cmake/config-ix.cmake | 3 +-
openmp/runtime/src/kmp_affinity.cpp | 33 ++++
openmp/runtime/src/kmp_affinity.h | 13 +-
openmp/runtime/src/kmp_config.h.cmake | 2 +
openmp/runtime/src/kmp_os.h | 6 +-
openmp/runtime/src/kmp_platform.h | 7 +-
openmp/runtime/src/kmp_runtime.cpp | 3 +-
openmp/runtime/src/kmp_tasking.cpp | 10 +-
.../thirdparty/ittnotify/ittnotify_config.h | 6 +
openmp/runtime/src/z_Linux_asm.S | 159 +++++++++++++++++-
openmp/runtime/src/z_Linux_util.cpp | 2 +-
openmp/runtime/test/lit.cfg | 2 +
openmp/runtime/test/ompt/callback.h | 16 ++
openmp/runtime/tools/lib/Platform.pm | 6 +-
openmp/runtime/tools/lib/Uname.pm | 2 +
22 files changed, 291 insertions(+), 17 deletions(-)
create mode 100644 openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt
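Note on the z_Linux_asm.S hunk: reading the instruction sequence that sizes
the frame in __kmp_invoke_microtask (aghi -2, clamp to zero, sllg by 3,
aghi 176), the reserved size works out to max(0, argc - 2) * 8 + 176 bytes.
The in-source comment quotes a different formula, so the sketch below is a
reading of the instructions only, not the author's stated intent. The
function name is hypothetical, and the split of 176 into a 160-byte register
save area plus two 8-byte slots for gtid and tid is an interpretation based
on the stores at offsets 160 and 168.

```c
#include <stddef.h>

/* Hypothetical C rendering of the frame-size arithmetic in the s390x
 * __kmp_invoke_microtask prologue. */
static size_t s390x_microtask_frame_size(long argc) {
  long spill = argc - 2;          /* aghi %r10,-2 */
  if (spill < 0)                  /* jnm 0f ; lghi %r10,0 */
    spill = 0;
  return (size_t)spill * 8 + 176; /* sllg %r10,%r10,3 ; aghi %r10,176 */
}
```

For example, argc values of 0, 1 and 2 all give 176 bytes, and argc of 5
gives 200 bytes.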
diff --git a/openmp/README.rst b/openmp/README.rst
index bb9443df56d7656..0e4916f44c68287 100644
--- a/openmp/README.rst
+++ b/openmp/README.rst
@@ -141,7 +141,7 @@ Options for all Libraries
Options for ``libomp``
----------------------
-**LIBOMP_ARCH** = ``aarch64|arm|i386|loongarch64|mic|mips|mips64|ppc64|ppc64le|x86_64|riscv64``
+**LIBOMP_ARCH** = ``aarch64|arm|i386|loongarch64|mic|mips|mips64|ppc64|ppc64le|x86_64|riscv64|s390x``
The default value for this option is chosen based on probing the compiler for
architecture macros (e.g., is ``__x86_64__`` predefined by compiler?).
@@ -198,7 +198,7 @@ Optional Features
**LIBOMP_OMPT_SUPPORT** = ``ON|OFF``
Include support for the OpenMP Tools Interface (OMPT).
This option is supported and ``ON`` by default for x86, x86_64, AArch64,
- PPC64, RISCV64 and LoongArch64 on Linux* and macOS*.
+ PPC64, RISCV64, LoongArch64, and s390x on Linux* and macOS*.
This option is ``OFF`` if this feature is not supported for the platform.
**LIBOMP_OMPT_OPTIONAL** = ``ON|OFF``
diff --git a/openmp/libomptarget/plugins-nextgen/CMakeLists.txt b/openmp/libomptarget/plugins-nextgen/CMakeLists.txt
index 9b4f4a5866e7987..d81e5d37d7c08df 100644
--- a/openmp/libomptarget/plugins-nextgen/CMakeLists.txt
+++ b/openmp/libomptarget/plugins-nextgen/CMakeLists.txt
@@ -96,6 +96,7 @@ add_subdirectory(cuda)
add_subdirectory(ppc64)
add_subdirectory(ppc64le)
add_subdirectory(x86_64)
+add_subdirectory(s390x)
# Make sure the parent scope can see the plugins that will be created.
set(LIBOMPTARGET_SYSTEM_TARGETS "${LIBOMPTARGET_SYSTEM_TARGETS}" PARENT_SCOPE)
diff --git a/openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt b/openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt
new file mode 100644
index 000000000000000..1b12a292899980e
--- /dev/null
+++ b/openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt
@@ -0,0 +1,17 @@
+##===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+##===----------------------------------------------------------------------===##
+#
+# Build a plugin for a s390x machine if available.
+#
+##===----------------------------------------------------------------------===##
+
+if(CMAKE_SYSTEM_NAME MATCHES "Linux")
+ build_generic_elf64("SystemZ" "S390X" "s390x" "s390x-ibm-linux-gnu" "22")
+else()
+ libomptarget_say("Not building s390x NextGen offloading plugin: machine not found in the system.")
+endif()
diff --git a/openmp/runtime/CMakeLists.txt b/openmp/runtime/CMakeLists.txt
index 4441c4babdc07c0..8a913989272c4c5 100644
--- a/openmp/runtime/CMakeLists.txt
+++ b/openmp/runtime/CMakeLists.txt
@@ -30,7 +30,7 @@ if(${OPENMP_STANDALONE_BUILD})
# If adding a new architecture, take a look at cmake/LibompGetArchitecture.cmake
libomp_get_architecture(LIBOMP_DETECTED_ARCH)
set(LIBOMP_ARCH ${LIBOMP_DETECTED_ARCH} CACHE STRING
- "The architecture to build for (x86_64/i386/arm/ppc64/ppc64le/aarch64/mic/mips/mips64/riscv64/loongarch64/ve).")
+ "The architecture to build for (x86_64/i386/arm/ppc64/ppc64le/aarch64/mic/mips/mips64/riscv64/loongarch64/ve/s390x).")
# Should assertions be enabled? They are on by default.
set(LIBOMP_ENABLE_ASSERTIONS TRUE CACHE BOOL
"enable assertions?")
@@ -65,6 +65,8 @@ else() # Part of LLVM build
set(LIBOMP_ARCH loongarch64)
elseif(LIBOMP_NATIVE_ARCH MATCHES "ve")
set(LIBOMP_ARCH ve)
+ elseif(LIBOMP_NATIVE_ARCH MATCHES "s390x")
+ set(LIBOMP_ARCH s390x)
else()
# last ditch effort
libomp_get_architecture(LIBOMP_ARCH)
@@ -85,7 +87,7 @@ if(LIBOMP_ARCH STREQUAL "aarch64")
endif()
endif()
-libomp_check_variable(LIBOMP_ARCH 32e x86_64 32 i386 arm ppc64 ppc64le aarch64 aarch64_a64fx mic mips mips64 riscv64 loongarch64 ve)
+libomp_check_variable(LIBOMP_ARCH 32e x86_64 32 i386 arm ppc64 ppc64le aarch64 aarch64_a64fx mic mips mips64 riscv64 loongarch64 ve s390x)
set(LIBOMP_LIB_TYPE normal CACHE STRING
"Performance,Profiling,Stubs library (normal/profile/stubs)")
@@ -165,6 +167,7 @@ set(MIPS FALSE)
set(RISCV64 FALSE)
set(LOONGARCH64 FALSE)
set(VE FALSE)
+set(S390X FALSE)
if("${LIBOMP_ARCH}" STREQUAL "i386" OR "${LIBOMP_ARCH}" STREQUAL "32") # IA-32 architecture
set(IA32 TRUE)
elseif("${LIBOMP_ARCH}" STREQUAL "x86_64" OR "${LIBOMP_ARCH}" STREQUAL "32e") # Intel(R) 64 architecture
@@ -193,6 +196,8 @@ elseif("${LIBOMP_ARCH}" STREQUAL "loongarch64") # LoongArch64 architecture
set(LOONGARCH64 TRUE)
elseif("${LIBOMP_ARCH}" STREQUAL "ve") # VE architecture
set(VE TRUE)
+elseif("${LIBOMP_ARCH}" STREQUAL "s390x") # S390x (Z) architecture
+ set(S390X TRUE)
endif()
# Set some flags based on build_type
diff --git a/openmp/runtime/cmake/LibompGetArchitecture.cmake b/openmp/runtime/cmake/LibompGetArchitecture.cmake
index 98bfce9ae990a7b..9d4f08b92de50d5 100644
--- a/openmp/runtime/cmake/LibompGetArchitecture.cmake
+++ b/openmp/runtime/cmake/LibompGetArchitecture.cmake
@@ -51,6 +51,8 @@ function(libomp_get_architecture return_arch)
#error ARCHITECTURE=loongarch64
#elif defined(__ve__)
#error ARCHITECTURE=ve
+ #elif defined(__s390x__)
+ #error ARCHITECTURE=s390x
#else
#error ARCHITECTURE=UnknownArchitecture
#endif
diff --git a/openmp/runtime/cmake/LibompMicroTests.cmake b/openmp/runtime/cmake/LibompMicroTests.cmake
index 88deb461dbaf3a2..e8cc218af0c294f 100644
--- a/openmp/runtime/cmake/LibompMicroTests.cmake
+++ b/openmp/runtime/cmake/LibompMicroTests.cmake
@@ -217,6 +217,9 @@ else()
elseif(${LOONGARCH64})
libomp_append(libomp_expected_library_deps libc.so.6)
libomp_append(libomp_expected_library_deps ld.so.1)
+ elseif(${S390X})
+ libomp_append(libomp_expected_library_deps libc.so.6)
+ libomp_append(libomp_expected_library_deps ld.so.1)
endif()
libomp_append(libomp_expected_library_deps libpthread.so.0 IF_FALSE STUBS_LIBRARY)
libomp_append(libomp_expected_library_deps libhwloc.so.5 LIBOMP_USE_HWLOC)
diff --git a/openmp/runtime/cmake/LibompUtils.cmake b/openmp/runtime/cmake/LibompUtils.cmake
index 0151ca0ea826bd7..139eabb45c54f74 100644
--- a/openmp/runtime/cmake/LibompUtils.cmake
+++ b/openmp/runtime/cmake/LibompUtils.cmake
@@ -113,6 +113,8 @@ function(libomp_get_legal_arch return_arch_string)
set(${return_arch_string} "LOONGARCH64" PARENT_SCOPE)
elseif(${VE})
set(${return_arch_string} "VE" PARENT_SCOPE)
+ elseif(${S390X})
+ set(${return_arch_string} "S390X" PARENT_SCOPE)
else()
set(${return_arch_string} "${LIBOMP_ARCH}" PARENT_SCOPE)
libomp_warning_say("libomp_get_legal_arch(): Warning: Unknown architecture: Using ${LIBOMP_ARCH}")
diff --git a/openmp/runtime/cmake/config-ix.cmake b/openmp/runtime/cmake/config-ix.cmake
index 9869aeab0354635..d54d350816926df 100644
--- a/openmp/runtime/cmake/config-ix.cmake
+++ b/openmp/runtime/cmake/config-ix.cmake
@@ -325,7 +325,8 @@ else()
(LIBOMP_ARCH STREQUAL ppc64le) OR
(LIBOMP_ARCH STREQUAL ppc64) OR
(LIBOMP_ARCH STREQUAL riscv64) OR
- (LIBOMP_ARCH STREQUAL loongarch64))
+ (LIBOMP_ARCH STREQUAL loongarch64) OR
+ (LIBOMP_ARCH STREQUAL s390x))
AND # OS supported?
((WIN32 AND LIBOMP_HAVE_PSAPI) OR APPLE OR (NOT WIN32 AND LIBOMP_HAVE_WEAK_ATTRIBUTE)))
set(LIBOMP_HAVE_OMPT_SUPPORT TRUE)
diff --git a/openmp/runtime/src/kmp_affinity.cpp b/openmp/runtime/src/kmp_affinity.cpp
index 20c1c610b9159e0..8c608d78bb56fe1 100644
--- a/openmp/runtime/src/kmp_affinity.cpp
+++ b/openmp/runtime/src/kmp_affinity.cpp
@@ -2990,6 +2990,9 @@ static bool __kmp_affinity_create_cpuinfo_map(int *line,
unsigned num_avail = 0;
*line = 0;
+#if KMP_ARCH_S390X
+ bool reading_s390x_sys_info = true;
+#endif
while (!feof(f)) {
// Create an inner scoping level, so that all the goto targets at the end of
// the loop appear in an outer scoping level. This avoids warnings about
@@ -3035,8 +3038,21 @@ static bool __kmp_affinity_create_cpuinfo_map(int *line,
if (*buf == '\n' && *line == 2)
continue;
#endif
+#if KMP_ARCH_S390X
+ // s390x /proc/cpuinfo starts with a variable number of lines containing
+ // the overall system information. Skip them.
+ if (reading_s390x_sys_info) {
+ if (*buf == '\n')
+ reading_s390x_sys_info = false;
+ continue;
+ }
+#endif
+#if KMP_ARCH_S390X
+ char s1[] = "cpu number";
+#else
char s1[] = "processor";
+#endif
if (strncmp(buf, s1, sizeof(s1) - 1) == 0) {
CHECK_LINE;
char *p = strchr(buf + sizeof(s1) - 1, ':');
@@ -3062,6 +3078,23 @@ static bool __kmp_affinity_create_cpuinfo_map(int *line,
threadInfo[num_avail][osIdIndex]);
__kmp_read_from_file(path, "%u", &threadInfo[num_avail][pkgIdIndex]);
+#if KMP_ARCH_S390X
+ // Disambiguate physical_package_id.
+ unsigned book_id;
+ KMP_SNPRINTF(path, sizeof(path),
+ "/sys/devices/system/cpu/cpu%u/topology/book_id",
+ threadInfo[num_avail][osIdIndex]);
+ __kmp_read_from_file(path, "%u", &book_id);
+ threadInfo[num_avail][pkgIdIndex] |= (book_id << 8);
+
+ unsigned drawer_id;
+ KMP_SNPRINTF(path, sizeof(path),
+ "/sys/devices/system/cpu/cpu%u/topology/drawer_id",
+ threadInfo[num_avail][osIdIndex]);
+ __kmp_read_from_file(path, "%u", &drawer_id);
+ threadInfo[num_avail][pkgIdIndex] |= (drawer_id << 16);
+#endif
+
KMP_SNPRINTF(path, sizeof(path),
"/sys/devices/system/cpu/cpu%u/topology/core_id",
threadInfo[num_avail][osIdIndex]);
diff --git a/openmp/runtime/src/kmp_affinity.h b/openmp/runtime/src/kmp_affinity.h
index 97808b528538097..bd4b74cd7c3f3cd 100644
--- a/openmp/runtime/src/kmp_affinity.h
+++ b/openmp/runtime/src/kmp_affinity.h
@@ -291,12 +291,23 @@ class KMPHwlocAffinity : public KMPAffinity {
#define __NR_sched_setaffinity 203
#elif __NR_sched_setaffinity != 203
#error Wrong code for setaffinity system call.
-#endif /* __NR_sched_setaffinity */
+#endif /* __NR_sched_getaffinity */
#ifndef __NR_sched_getaffinity
#define __NR_sched_getaffinity 204
#elif __NR_sched_getaffinity != 204
#error Wrong code for getaffinity system call.
#endif /* __NR_sched_getaffinity */
+#elif KMP_ARCH_S390X
+#ifndef __NR_sched_setaffinity
+#define __NR_sched_setaffinity 239
+#elif __NR_sched_setaffinity != 239
+#error Wrong code for setaffinity system call.
+#endif /* __NR_sched_setaffinity */
+#ifndef __NR_sched_getaffinity
+#define __NR_sched_getaffinity 240
+#elif __NR_sched_getaffinity != 240
+#error Wrong code for getaffinity system call.
+#endif /* __NR_sched_getaffinity */
#else
#error Unknown or unsupported architecture
#endif /* KMP_ARCH_* */
diff --git a/openmp/runtime/src/kmp_config.h.cmake b/openmp/runtime/src/kmp_config.h.cmake
index 58bf64112b1a7a7..5f04301c91c60cd 100644
--- a/openmp/runtime/src/kmp_config.h.cmake
+++ b/openmp/runtime/src/kmp_config.h.cmake
@@ -104,6 +104,8 @@
# define CACHE_LINE 128
#elif KMP_ARCH_AARCH64_A64FX
# define CACHE_LINE 256
+#elif KMP_ARCH_S390X
+# define CACHE_LINE 256
#else
# define CACHE_LINE 64
#endif
diff --git a/openmp/runtime/src/kmp_os.h b/openmp/runtime/src/kmp_os.h
index 2c632112a8d8e35..ca694f6f14933cd 100644
--- a/openmp/runtime/src/kmp_os.h
+++ b/openmp/runtime/src/kmp_os.h
@@ -178,7 +178,8 @@ typedef unsigned long long kmp_uint64;
#if KMP_ARCH_X86 || KMP_ARCH_ARM || KMP_ARCH_MIPS
#define KMP_SIZE_T_SPEC KMP_UINT32_SPEC
#elif KMP_ARCH_X86_64 || KMP_ARCH_PPC64 || KMP_ARCH_AARCH64 || \
- KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE
+ KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || \
+ KMP_ARCH_VE || KMP_ARCH_S390X
#define KMP_SIZE_T_SPEC KMP_UINT64_SPEC
#else
#error "Can't determine size_t printf format specifier."
@@ -1043,7 +1044,8 @@ extern kmp_real64 __kmp_xchg_real64(volatile kmp_real64 *p, kmp_real64 v);
#endif /* KMP_OS_WINDOWS */
#if KMP_ARCH_PPC64 || KMP_ARCH_ARM || KMP_ARCH_AARCH64 || KMP_ARCH_MIPS || \
- KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE
+ KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || \
+ KMP_ARCH_VE || KMP_ARCH_S390X
#if KMP_OS_WINDOWS
#undef KMP_MB
#define KMP_MB() std::atomic_thread_fence(std::memory_order_seq_cst)
diff --git a/openmp/runtime/src/kmp_platform.h b/openmp/runtime/src/kmp_platform.h
index 1a2197d338342ac..70e55b427dc9205 100644
--- a/openmp/runtime/src/kmp_platform.h
+++ b/openmp/runtime/src/kmp_platform.h
@@ -94,6 +94,7 @@
#define KMP_ARCH_RISCV64 0
#define KMP_ARCH_LOONGARCH64 0
#define KMP_ARCH_VE 0
+#define KMP_ARCH_S390X 0
#if KMP_OS_WINDOWS
#if defined(_M_AMD64) || defined(__x86_64)
@@ -146,6 +147,9 @@
#elif defined __ve__
#undef KMP_ARCH_VE
#define KMP_ARCH_VE 1
+#elif defined __s390x__
+#undef KMP_ARCH_S390X
+#define KMP_ARCH_S390X 1
#endif
#endif
@@ -210,7 +214,8 @@
// TODO: Fixme - This is clever, but really fugly
#if (1 != KMP_ARCH_X86 + KMP_ARCH_X86_64 + KMP_ARCH_ARM + KMP_ARCH_PPC64 + \
KMP_ARCH_AARCH64 + KMP_ARCH_MIPS + KMP_ARCH_MIPS64 + \
- KMP_ARCH_RISCV64 + KMP_ARCH_LOONGARCH64 + KMP_ARCH_VE)
+ KMP_ARCH_RISCV64 + KMP_ARCH_LOONGARCH64 + KMP_ARCH_VE + \
+ KMP_ARCH_S390X)
#error Unknown or unsupported architecture
#endif
diff --git a/openmp/runtime/src/kmp_runtime.cpp b/openmp/runtime/src/kmp_runtime.cpp
index e83c09383769a51..ec0aef1ab12926b 100644
--- a/openmp/runtime/src/kmp_runtime.cpp
+++ b/openmp/runtime/src/kmp_runtime.cpp
@@ -8894,7 +8894,8 @@ __kmp_determine_reduction_method(
int atomic_available = FAST_REDUCTION_ATOMIC_METHOD_GENERATED;
#if KMP_ARCH_X86_64 || KMP_ARCH_PPC64 || KMP_ARCH_AARCH64 || \
- KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE
+ KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || \
+ KMP_ARCH_VE || KMP_ARCH_S390X
#if KMP_OS_LINUX || KMP_OS_DRAGONFLY || KMP_OS_FREEBSD || KMP_OS_NETBSD || \
KMP_OS_OPENBSD || KMP_OS_WINDOWS || KMP_OS_DARWIN || KMP_OS_HURD
diff --git a/openmp/runtime/src/kmp_tasking.cpp b/openmp/runtime/src/kmp_tasking.cpp
index e8eb6b02650377c..3c35528f08b8812 100644
--- a/openmp/runtime/src/kmp_tasking.cpp
+++ b/openmp/runtime/src/kmp_tasking.cpp
@@ -1554,7 +1554,7 @@ kmp_task_t *__kmp_task_alloc(ident_t *loc_ref, kmp_int32 gtid,
task = KMP_TASKDATA_TO_TASK(taskdata);
// Make sure task & taskdata are aligned appropriately
-#if KMP_ARCH_X86 || KMP_ARCH_PPC64 || !KMP_HAVE_QUAD
+#if KMP_ARCH_X86 || KMP_ARCH_PPC64 || KMP_ARCH_S390X || !KMP_HAVE_QUAD
KMP_DEBUG_ASSERT((((kmp_uintptr_t)taskdata) & (sizeof(double) - 1)) == 0);
KMP_DEBUG_ASSERT((((kmp_uintptr_t)task) & (sizeof(double) - 1)) == 0);
#else
@@ -1737,8 +1737,12 @@ __kmpc_omp_reg_task_with_affinity(ident_t *loc_ref, kmp_int32 gtid,
// gtid: global thread ID of caller
// task: the task to invoke
// current_task: the task to resume after task invocation
-static void __kmp_invoke_task(kmp_int32 gtid, kmp_task_t *task,
- kmp_taskdata_t *current_task) {
+#ifdef __s390x__
+__attribute__((target("backchain")))
+#endif
+static void
+__kmp_invoke_task(kmp_int32 gtid, kmp_task_t *task,
+ kmp_taskdata_t *current_task) {
kmp_taskdata_t *taskdata = KMP_TASK_TO_TASKDATA(task);
kmp_info_t *thread;
int discard = 0 /* false */;
diff --git a/openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h b/openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h
index ff37eb4ed175e67..bd3fd9b43e574d1 100644
--- a/openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h
+++ b/openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h
@@ -166,6 +166,10 @@
#define ITT_ARCH_VE 8
#endif /* ITT_ARCH_VE */
+#ifndef ITT_ARCH_S390X
+#define ITT_ARCH_S390X 8
+#endif /* ITT_ARCH_S390X */
+
#ifndef ITT_ARCH
#if defined _M_IX86 || defined __i386__
#define ITT_ARCH ITT_ARCH_IA32
@@ -181,6 +185,8 @@
#define ITT_ARCH ITT_ARCH_PPC64
#elif defined __ve__
#define ITT_ARCH ITT_ARCH_VE
+#elif defined __s390x__
+#define ITT_ARCH ITT_ARCH_S390X
#endif
#endif
diff --git a/openmp/runtime/src/z_Linux_asm.S b/openmp/runtime/src/z_Linux_asm.S
index 2c0df6e3b08505a..a72705528d4162e 100644
--- a/openmp/runtime/src/z_Linux_asm.S
+++ b/openmp/runtime/src/z_Linux_asm.S
@@ -2252,6 +2252,159 @@ __kmp_invoke_microtask:
#endif /* KMP_ARCH_VE */
+#if KMP_ARCH_S390X
+
+//------------------------------------------------------------------------
+//
+// typedef void (*microtask_t)(int *gtid, int *tid, ...);
+//
+// int __kmp_invoke_microtask(microtask_t pkfn, int gtid, int tid, int argc,
+// void *p_argv[]
+// #if OMPT_SUPPORT
+// ,
+// void **exit_frame_ptr
+// #endif
+// ) {
+// #if OMPT_SUPPORT
+// *exit_frame_ptr = OMPT_GET_FRAME_ADDRESS(0);
+// #endif
+//
+// (*pkfn)(&gtid, &tid, argv[0], ...);
+//
+// return 1;
+// }
+//
+// Parameters:
+// r2: pkfn
+// r3: gtid
+// r4: tid
+// r5: argc
+// r6: p_argv
+// SP+160: exit_frame_ptr
+//
+// Locals:
+// __gtid: gtid param pushed on stack so can pass &gtid to pkfn
+// __tid: tid param pushed on stack so can pass &tid to pkfn
+//
+// Temp. registers:
+//
+// r0: used to fetch argv slots
+// r7: used as temporary for number of remaining pkfn parms
+// r8: argv
+// r9: pkfn
+// r10: stack size
+// r11: previous fp
+// r12: stack parameter area
+// r13: argv slot
+//
+// return: r2 (always 1/TRUE)
+//
+
+// -- Begin __kmp_invoke_microtask
+// mark_begin;
+ .text
+ .globl __kmp_invoke_microtask
+ .p2align 1
+ .type __kmp_invoke_microtask,@function
+__kmp_invoke_microtask:
+ .cfi_startproc
+
+ stmg %r6,%r14,48(%r15)
+ .cfi_offset %r6, -112
+ .cfi_offset %r7, -104
+ .cfi_offset %r8, -96
+ .cfi_offset %r9, -88
+ .cfi_offset %r10, -80
+ .cfi_offset %r11, -72
+ .cfi_offset %r12, -64
+ .cfi_offset %r13, -56
+ .cfi_offset %r14, -48
+ .cfi_offset %r15, -40
+ lgr %r11,%r15
+ .cfi_def_cfa %r11, 160
+
+ // Compute the dynamic stack size:
+ //
+ // - We need 8 bytes for storing 'gtid' and 'tid', so we can pass them by
+ // reference
+ // - We need 8 bytes for each argument that cannot be passed to the 'pkfn'
+ // function by register. Given that we have 5 of such registers (r[2-6])
+ // and two + 'argc' arguments (consider &gtid and &tid), we need to
+ // reserve max(0, argc - 3)*8 extra bytes
+ //
+ // The total number of bytes is then max(0, argc - 3)*8 + 8
+
+ lgr %r10,%r5
+ aghi %r10,-2
+ jnm 0f
+ lghi %r10,0
+0:
+ sllg %r10,%r10,3
+ lgr %r12,%r10
+ aghi %r10,176
+ sgr %r15,%r10
+ agr %r12,%r15
+ stg %r11,0(%r15)
+
+ lgr %r9,%r2 // pkfn
+
+#if OMPT_SUPPORT
+ // Save frame pointer into exit_frame
+ lg %r8,160(%r11)
+ stg %r11,0(%r8)
+#endif
+
+ // Prepare arguments for the pkfn function (first 5 using r2-r6 registers)
+
+ stg %r3,160(%r12)
+ la %r2,164(%r12) // gtid
+ stg %r4,168(%r12)
+ la %r3,172(%r12) // tid
+ lgr %r8,%r6 // argv
+
+ // If argc > 0
+ ltgr %r7,%r5
+ jz 1f
+
+ lg %r4,0(%r8) // argv[0]
+ aghi %r7,-1
+ jz 1f
+
+ // If argc > 1
+ lg %r5,8(%r8) // argv[1]
+ aghi %r7,-1
+ jz 1f
+
+ // If argc > 2
+ lg %r6,16(%r8) // argv[2]
+ aghi %r7,-1
+ jz 1f
+
+ lghi %r13,0 // Index [n]
+2:
+ lg %r0,24(%r13,%r8) // argv[2+n]
+ stg %r0,160(%r13,%r15) // parm[2+n]
+ aghi %r13,8 // Next
+ aghi %r7,-1
+ jnz 2b
+
+1:
+ basr %r14,%r9 // Call pkfn
+
+ // Restore stack and return
+
+ lgr %r15,%r11
+ lmg %r6,%r14,48(%r15)
+ lghi %r2,1
+ br %r14
+.Lfunc_end0:
+ .size __kmp_invoke_microtask, .Lfunc_end0-__kmp_invoke_microtask
+ .cfi_endproc
+
+// -- End __kmp_invoke_microtask
+
+#endif /* KMP_ARCH_S390X */
+
#if KMP_ARCH_ARM || KMP_ARCH_MIPS
.data
COMMON .gomp_critical_user_, 32, 3
@@ -2266,7 +2419,8 @@ __kmp_unnamed_critical_addr:
#endif /* KMP_ARCH_ARM */
#if KMP_ARCH_PPC64 || KMP_ARCH_AARCH64 || KMP_ARCH_MIPS64 || \
- KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE
+ KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE || \
+ KMP_ARCH_S390X
#ifndef KMP_PREFIX_UNDERSCORE
# define KMP_PREFIX_UNDERSCORE(x) x
#endif
@@ -2281,7 +2435,8 @@ KMP_PREFIX_UNDERSCORE(__kmp_unnamed_critical_addr):
.size KMP_PREFIX_UNDERSCORE(__kmp_unnamed_critical_addr),8
#endif
#endif /* KMP_ARCH_PPC64 || KMP_ARCH_AARCH64 || KMP_ARCH_MIPS64 ||
- KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE */
+ KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || KMP_ARCH_VE ||
+ KMP_ARCH_S390X */
#if KMP_OS_LINUX
# if KMP_ARCH_ARM || KMP_ARCH_AARCH64
diff --git a/openmp/runtime/src/z_Linux_util.cpp b/openmp/runtime/src/z_Linux_util.cpp
index 5495f60d2029d49..2c331b8400468d5 100644
--- a/openmp/runtime/src/z_Linux_util.cpp
+++ b/openmp/runtime/src/z_Linux_util.cpp
@@ -2465,7 +2465,7 @@ int __kmp_get_load_balance(int max) {
#if !(KMP_ARCH_X86 || KMP_ARCH_X86_64 || KMP_MIC || \
((KMP_OS_LINUX || KMP_OS_DARWIN) && KMP_ARCH_AARCH64) || \
KMP_ARCH_PPC64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64 || \
- KMP_ARCH_ARM || KMP_ARCH_VE)
+ KMP_ARCH_ARM || KMP_ARCH_VE || KMP_ARCH_S390X)
// we really only need the case with 1 argument, because CLANG always build
// a struct of pointers to shared variables referenced in the outlined function
diff --git a/openmp/runtime/test/lit.cfg b/openmp/runtime/test/lit.cfg
index 650d3853e851112..27ff057c85f60f2 100644
--- a/openmp/runtime/test/lit.cfg
+++ b/openmp/runtime/test/lit.cfg
@@ -51,6 +51,8 @@ flags = " -I " + config.test_source_root + \
" " + config.test_extra_flags
if config.has_omit_frame_pointer_flag:
flags += " -fno-omit-frame-pointer"
+if config.target_arch == "s390x":
+ flags += " -mbackchain"
config.test_flags = " -I " + config.omp_header_directory + flags
config.test_flags_use_compiler_omp_h = flags
diff --git a/openmp/runtime/test/ompt/callback.h b/openmp/runtime/test/ompt/callback.h
index c5266e230c26f77..efbd4c716e0ee1e 100644
--- a/openmp/runtime/test/ompt/callback.h
+++ b/openmp/runtime/test/ompt/callback.h
@@ -228,6 +228,22 @@ ompt_label_##id:
printf("%" PRIu64 ": current_address=%p or %p\n", \
ompt_get_thread_data()->value, ((char *)addr) - 8, \
((char *)addr) - 8)
+#elif KMP_ARCH_S390X
+// On s390x the NOP instruction is 2 bytes long. For non-void runtime
+// functions Clang inserts a STY instruction (but only if compiling under
+// -fno-PIC which will be the default with Clang 8.0, another 6 bytes).
+//
+// Another possibility is:
+//
+// brasl %r14,__kmpc_end_master@plt
+// a7 f4 00 02 j 0f
+// 47 00 00 00 0: nop
+// a7 f4 00 02 j addr
+// addr:
+#define print_possible_return_addresses(addr) \
+ printf("%" PRIu64 ": current_address=%p or %p or %p\n", \
+ ompt_get_thread_data()->value, ((char *)addr) - 2, \
+ ((char *)addr) - 8, ((char *)addr) - 12)
#else
#error Unsupported target architecture, cannot determine address offset!
#endif
diff --git a/openmp/runtime/tools/lib/Platform.pm b/openmp/runtime/tools/lib/Platform.pm
index d62d450e9e5dcf5..6efd932daef561b 100644
--- a/openmp/runtime/tools/lib/Platform.pm
+++ b/openmp/runtime/tools/lib/Platform.pm
@@ -65,6 +65,8 @@ sub canon_arch($) {
$arch = "riscv64";
} elsif ( $arch =~ m{\Aloongarch64} ) {
$arch = "loongarch64";
+ } elsif ( $arch =~ m{\As390x} ) {
+ $arch = "s390x";
} else {
$arch = undef;
}; # if
@@ -230,6 +232,8 @@ sub target_options() {
$_host_arch = "riscv64";
} elsif ( $hardware_platform eq "loongarch64" ) {
$_host_arch = "loongarch64";
+ } elsif ( $hardware_platform eq "s390x" ) {
+ $_host_arch = "s390x";
} else {
die "Unsupported host hardware platform: \"$hardware_platform\"; stopped";
}; # if
@@ -419,7 +423,7 @@ the script assumes host architecture is target one.
Input string is an architecture name to canonize. The function recognizes many variants, for example:
C<32e>, C<Intel64>, C<Intel(R) 64>, etc. Returned string is a canonized architecture name,
-one of: C<32>, C<32e>, C<64>, C<arm>, C<ppc64le>, C<ppc64>, C<mic>, C<mips>, C<mips64>, C<riscv64>, C<loongarch64> or C<undef> is input string is not recognized.
+one of: C<32>, C<32e>, C<64>, C<arm>, C<ppc64le>, C<ppc64>, C<mic>, C<mips>, C<mips64>, C<riscv64>, C<loongarch64>, C<s390x>, or C<undef> is input string is not recognized.
=item B<legal_arch( $arch )>
diff --git a/openmp/runtime/tools/lib/Uname.pm b/openmp/runtime/tools/lib/Uname.pm
index 8a976addcff03e0..9dde444d56a4ece 100644
--- a/openmp/runtime/tools/lib/Uname.pm
+++ b/openmp/runtime/tools/lib/Uname.pm
@@ -160,6 +160,8 @@ if ( 0 ) {
$values{ hardware_platform } = "riscv64";
} elsif ( $values{ machine } =~ m{\Aloongarch64\z} ) {
$values{ hardware_platform } = "loongarch64";
+ } elsif ( $values{ machine } =~ m{\As390x\z} ) {
+ $values{ hardware_platform } = "s390x";
} else {
die "Unsupported machine (\"$values{ machine }\") returned by POSIX::uname(); stopped";
}; # if