<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/134451>134451</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [OpenMP][OMPT] `initial_device_num` on `initialize` callback set incorrectly
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Thyre
      </td>
    </tr>
</table>

<pre>
    We recently released Score-P v9.0, which for the first time supports OpenMP target via the OpenMP Tools Interface.

However, a user quickly noticed an issue, which boils down to a bug in LLVM.
When the OpenMP runtime initializes a connected tool, the following callback will be dispatched:

```c
typedef int (*ompt_initialize_t) (ompt_function_lookup_t lookup, int initial_device_num, ompt_data_t *tool_data);
```

`initial_device_num` is defined as "the value that a call to omp_get_initial_device would return". The return value of that routine is in turn defined as: "the value of the device number is the value of omp_initial_device or the value returned by the omp_get_num_devices routine".
LLVM does not follow this, however: it always reports `initial_device_num = 0`. LLVM also does not implement `ompt_get_num_devices` correctly:
https://github.com/llvm/llvm-project/blob/19e0233eb844e653a3108de411366bd0165cf3ec/openmp/runtime/src/ompt-general.cpp#L868
As a result, a tool cannot safely determine whether a device identifier passed to a target callback actually refers to the host.

LLVM typically uses `-1` for the host, but tools should not have to guess this value, since it may change in later versions. Having to guess also makes supporting different runtimes more complicated.
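
To illustrate the check a tool would like to perform, here is a minimal sketch. It assumes the runtime passes a correct `initial_device_num` to the `initialize` callback. The helper names `tool_store_initial_device` and `tool_is_host_device` are hypothetical and not part of OMPT or Score-P:

```c
#include <stdbool.h>

/* Hypothetical tool-side state; not part of OMPT or Score-P. */
static int tool_initial_device_num;

/* Would be called from the tool's initialize callback. */
static void
tool_store_initial_device( int initial_device_num )
{
    tool_initial_device_num = initial_device_num;
}

/* Would be called from a target callback to classify its device number. */
static bool
tool_is_host_device( int device_num )
{
    return device_num == tool_initial_device_num;
}
```

If the stored value is `0` while the host shows up as `-1` in a callback, `tool_is_host_device( -1 )` returns `false`, so the host would not be recognized.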

---

To reproduce the issue:

```c
#include <omp.h>

#include <omp-tools.h>
#include <stdlib.h>
#include <assert.h>
#include <inttypes.h>
#include <string.h>

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdatomic.h>

/* MAIN */

int main( int argc, char** argv )
{
#pragma omp target
    {
        printf( "Hello from target region\n" );
    }
}

/* OMPT INTERFACE */

#if ( defined( __ppc64__ ) || defined( __powerpc64__ ) || defined( __PPC64__ ) )
#define OMPT_TOOL_CPU_RELAX ( ( void )0 )
#elif ( defined( __x86_64 ) || defined( __x86_64__ ) || defined( __amd64 ) || defined( _M_X64 ) )
#define OMPT_TOOL_CPU_RELAX __asm__ volatile ( "pause" )
#elif ( defined( __aarch64__ ) || defined( __ARM64__ ) || defined( _M_ARM64 ) )
#define OMPT_TOOL_CPU_RELAX __asm__ volatile ( "yield" )
#else
#define OMPT_TOOL_CPU_RELAX ( ( void )0 )
#endif


/* Test-and-test-and-set lock. Mutexes are of type atomic_flag */
#define OMPT_TOOL_LOCK( MUTEX ) \
    while ( true ) \
    { \
        if ( atomic_flag_test_and_set_explicit( &( MUTEX ), memory_order_acquire ) != true ) \
        { \
            break; \
        } \
        OMPT_TOOL_CPU_RELAX; \
    }
#define OMPT_TOOL_UNLOCK( MUTEX ) atomic_flag_clear_explicit( &( MUTEX ), memory_order_release );

#define OMPT_TOOL_GUARDED_PRINTF( ... ) \
    OMPT_TOOL_LOCK( ompt_tool_printf_mutex ) \
    printf( __VA_ARGS__ ); \
    OMPT_TOOL_UNLOCK( ompt_tool_printf_mutex )


_Thread_local int32_t ompt_tool_tid          = -1; /* thread counter. >= 1 after thread_begin */
atomic_flag           ompt_tool_printf_mutex = ATOMIC_FLAG_INIT;


static int
my_initialize_tool( ompt_function_lookup_t lookup,
                    int                    initial_device_num,
                    ompt_data_t*           tool_data )
{
    OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 " | initial_device_num %d\n",
                              __FUNCTION__,
                              ompt_tool_tid,
                              initial_device_num );
    return 1; /* non-zero indicates success */
}

static void
my_finalize_tool( ompt_data_t* tool_data )
{
    OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "\n",
                              __FUNCTION__,
                              ompt_tool_tid );
}

ompt_start_tool_result_t*
ompt_start_tool( unsigned int omp_version,
                 const char*  runtime_version )
{
    setbuf( stdout, NULL );
    /* omp_version is unsigned int, so it is printed with %u */
    OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 " | omp_version %u | runtime_version = \'%s\'\n",
                              __FUNCTION__,
                              ompt_tool_tid,
                              omp_version,
                              runtime_version );
    static ompt_start_tool_result_t tool = { &my_initialize_tool,
                                             &my_finalize_tool,
                                             ompt_data_none };
    return &tool;
}
```

When the reproducer is built and run on a system with at least one GPU (`-fopenmp-targets=x86_64` may work too), it prints the following:

```console
$ clang -fopenmp --offload-arch=sm_75 test.c
$ ./a.out
[ompt_start_tool] tid = -1 | omp_version 201611 | runtime_version = 'LLVM OMP version: 5.0.20140926'
[my_initialize_tool] tid = -1 | initial_device_num 0
Hello from target region
[my_finalize_tool] tid = -1
```

As the output shows, `initial_device_num` is `0`. In my case, it should be either `-1` or `1`.
</pre>