[Openmp-commits] [PATCH] D126836: [LIBOMPTARGET] Adding AMD to llvm-omp-device-info

Jose Manuel Monsalve Diaz via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Jun 1 15:46:19 PDT 2022


josemonsalve2 created this revision.
josemonsalve2 added reviewers: jdoerfert, jhuber6, JonChesterfield.
Herald added subscribers: kosarev, kerbowa, jvesely.
Herald added a project: All.
josemonsalve2 requested review of this revision.
Herald added subscribers: openmp-commits, sstefan1.
Herald added a project: OpenMP.

Add device information printing for AMD devices to the
`llvm-omp-device-info` command line tool. The output format is
modeled on the `rocminfo` command line tool.

This commit also adds the HSA functions, enums, and structs that were
missing and are needed to query additional information from the HSA
agents. The `generic-elf-64bit` plugin now prints a generic message as well.
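
For readers who want a feel for the rocminfo-style layout, here is a minimal sketch of how one aligned "Label: value" line could be built. It is not the code from the patch: the helper name `formatInfoLine` and the default value column of 42 are assumptions chosen for illustration, picked so that the result lines up with the top-level fields in the example output.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, NOT taken from the patch: builds a single
// "Label: value" line whose value starts at a fixed column.
// The default column (42) is an assumption made for this sketch.
std::string formatInfoLine(std::size_t indent, const std::string &label,
                           const std::string &value,
                           std::size_t valueColumn = 42) {
  std::string line(indent, ' ');
  line += label;
  line += ':';
  // Pad up to the value column, but always leave at least one space
  // between the label and the value when the label is too long.
  if (line.size() + 1 > valueColumn)
    line += ' ';
  else
    line.append(valueColumn - line.size(), ' ');
  line += value;
  return line;
}
```

Calling `formatInfoLine(6, "Device Name", "gfx906")` yields a line with the same spacing as the "Device Name" entry in the output. Nested entries such as the cache levels would pass a larger indent and keep the same value column.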

Example output:

  llvm-omp-device-info
  Device (0):
      This is a generic-elf-64bit device
  
  Device (1):
      This is a generic-elf-64bit device
  
  Device (2):
      This is a generic-elf-64bit device
  
  Device (3):
      This is a generic-elf-64bit device
  
  Device (4):
      HSA Runtime Version:                1.1
      HSA OpenMP Device Number:           0
      Device Name:                        gfx906
      Vendor Name:                        AMD
      Device Type:                        GPU
      Max Queues:                         128
      Queue Min Size:                     64
      Queue Max Size:                     131072
      Cache:
        L0:                               16384 bytes
        L1:                               8388608 bytes
      Cacheline Size:                     64
      Max Clock Freq(MHz):                1725
      Compute Units:                      60
      SIMD per CU:                        4
      Fast F16 Operation:                 TRUE
      Wavefront Size:                     64
      Workgroup Max Size:                 1024
      Workgroup Max Size per Dimension:
        x:                                1024
        y:                                1024
        z:                                1024
      Max Waves Per CU:                   40
      Max Work-item Per CU:               2560
      Grid Max Size:                      4294967295
      Grid Max Size per Dimension:
        x:                                4294967295
        y:                                4294967295
        z:                                4294967295
      Max fbarriers/Workgrp:              32
      Memory Pools:
        Pool GLOBAL; FLAGS: COARSE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GLOBAL; FLAGS: FINE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GROUP:
          Size:                            65536 bytes
          Allocatable:                     FALSE
          Runtime Alloc Granule:           0 bytes
          Runtime Alloc alignment:         0 bytes
          Accessable by all:               FALSE
  
  Device (5):
      HSA Runtime Version:                1.1
      HSA OpenMP Device Number:           1
      Device Name:                        gfx906
      Vendor Name:                        AMD
      Device Type:                        GPU
      Max Queues:                         128
      Queue Min Size:                     64
      Queue Max Size:                     131072
      Cache:
        L0:                               16384 bytes
        L1:                               8388608 bytes
      Cacheline Size:                     64
      Max Clock Freq(MHz):                1725
      Compute Units:                      60
      SIMD per CU:                        4
      Fast F16 Operation:                 TRUE
      Wavefront Size:                     64
      Workgroup Max Size:                 1024
      Workgroup Max Size per Dimension:
        x:                                1024
        y:                                1024
        z:                                1024
      Max Waves Per CU:                   40
      Max Work-item Per CU:               2560
      Grid Max Size:                      4294967295
      Grid Max Size per Dimension:
        x:                                4294967295
        y:                                4294967295
        z:                                4294967295
      Max fbarriers/Workgrp:              32
      Memory Pools:
        Pool GLOBAL; FLAGS: COARSE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GLOBAL; FLAGS: FINE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GROUP:
          Size:                            65536 bytes
          Allocatable:                     FALSE
          Runtime Alloc Granule:           0 bytes
          Runtime Alloc alignment:         0 bytes
          Accessable by all:               FALSE
  
  Device (6):
      HSA Runtime Version:                1.1
      HSA OpenMP Device Number:           2
      Device Name:                        gfx906
      Vendor Name:                        AMD
      Device Type:                        GPU
      Max Queues:                         128
      Queue Min Size:                     64
      Queue Max Size:                     131072
      Cache:
        L0:                               16384 bytes
        L1:                               8388608 bytes
      Cacheline Size:                     64
      Max Clock Freq(MHz):                1725
      Compute Units:                      60
      SIMD per CU:                        4
      Fast F16 Operation:                 TRUE
      Wavefront Size:                     64
      Workgroup Max Size:                 1024
      Workgroup Max Size per Dimension:
        x:                                1024
        y:                                1024
        z:                                1024
      Max Waves Per CU:                   40
      Max Work-item Per CU:               2560
      Grid Max Size:                      4294967295
      Grid Max Size per Dimension:
        x:                                4294967295
        y:                                4294967295
        z:                                4294967295
      Max fbarriers/Workgrp:              32
      Memory Pools:
        Pool GLOBAL; FLAGS: COARSE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GLOBAL; FLAGS: FINE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GROUP:
          Size:                            65536 bytes
          Allocatable:                     FALSE
          Runtime Alloc Granule:           0 bytes
          Runtime Alloc alignment:         0 bytes
          Accessable by all:               FALSE
  
  Device (7):
      HSA Runtime Version:                1.1
      HSA OpenMP Device Number:           3
      Device Name:                        gfx906
      Vendor Name:                        AMD
      Device Type:                        GPU
      Max Queues:                         128
      Queue Min Size:                     64
      Queue Max Size:                     131072
      Cache:
        L0:                               16384 bytes
        L1:                               8388608 bytes
      Cacheline Size:                     64
      Max Clock Freq(MHz):                1725
      Compute Units:                      60
      SIMD per CU:                        4
      Fast F16 Operation:                 TRUE
      Wavefront Size:                     64
      Workgroup Max Size:                 1024
      Workgroup Max Size per Dimension:
        x:                                1024
        y:                                1024
        z:                                1024
      Max Waves Per CU:                   40
      Max Work-item Per CU:               2560
      Grid Max Size:                      4294967295
      Grid Max Size per Dimension:
        x:                                4294967295
        y:                                4294967295
        z:                                4294967295
      Max fbarriers/Workgrp:              32
      Memory Pools:
        Pool GLOBAL; FLAGS: COARSE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GLOBAL; FLAGS: FINE GRAINED, :
          Size:                            34342961152 bytes
          Allocatable:                     TRUE
          Runtime Alloc Granule:           4096 bytes
          Runtime Alloc alignment:         4096 bytes
          Accessable by all:               FALSE
        Pool GROUP:
          Size:                            65536 bytes
          Allocatable:                     FALSE
          Runtime Alloc Granule:           0 bytes
          Runtime Alloc alignment:         0 bytes
          Accessable by all:               FALSE
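
As a quick sanity check on the numbers above, a few of the reported fields are consistent with each other. The sketch below only redoes the arithmetic on values copied from the output. The function names are hypothetical, and the waves-times-wavefront relationship is an assumption that happens to match the reported value, not something confirmed from the patch.

```cpp
#include <cstdint>

// Values copied from the example output for one gfx906 device.
constexpr std::uint64_t WavefrontSize = 64;
constexpr std::uint64_t MaxWavesPerCU = 40;
constexpr std::uint64_t GlobalPoolBytes = 34342961152ULL;

// Assumed relationship (hypothetical helper): work-items per CU as
// waves per CU multiplied by the wavefront size.
constexpr std::uint64_t maxWorkItemsPerCU(std::uint64_t waves,
                                          std::uint64_t wavefront) {
  return waves * wavefront;
}

// Converts a byte count to whole MiB (1 MiB = 1024 * 1024 bytes).
constexpr std::uint64_t bytesToMiB(std::uint64_t bytes) {
  return bytes / (1024ULL * 1024ULL);
}

// 40 waves * 64 lanes matches the reported "Max Work-item Per CU".
static_assert(maxWorkItemsPerCU(MaxWavesPerCU, WavefrontSize) == 2560,
              "work-items per CU");
// The GLOBAL pools are 32752 MiB, i.e. 16 MiB short of 32 GiB.
static_assert(bytesToMiB(GlobalPoolBytes) == 32752, "global pool size");
```

By the same conversion, the L1 cache size of 8388608 bytes is 8 MiB and the GROUP pool size of 65536 bytes is 64 KiB.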


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D126836

Files:
  openmp/libomptarget/plugins/amdgpu/dynamic_hsa/hsa.cpp
  openmp/libomptarget/plugins/amdgpu/dynamic_hsa/hsa.h
  openmp/libomptarget/plugins/amdgpu/dynamic_hsa/hsa_ext_amd.h
  openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
  openmp/libomptarget/plugins/generic-elf-64bit/src/rtl.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D126836.433580.patch
Type: text/x-patch
Size: 18640 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-commits/attachments/20220601/f59ad3b7/attachment-0001.bin>

