[llvm-dev] Intel AMX programming model discussion.

Wed Aug 19 12:52:19 PDT 2020

On 8/19/20 10:24 AM, Kaylor, Andrew wrote:
>
> > When the tile shape is unknown at compile time, how do you plan to 
> do the register allocation of the tiles? My question is: do you do the 
> allocation for this case in the same way as you would if you knew the 
> size was 16x16 (i.e., conservatively assume the largest size)?
>
> I think what will happen is that the registers are allocated based on 
> a number of runtime values that are assumed to be different from one 
> another but less than or equal to 16. So, for example, we’ll allocate 
> registers for MxN tiles, NxM tiles and MxM tiles without knowing what 
> M and N are. Then at runtime the values of these variables will be 
> used to create the actual tile configuration. The instructions that 
> need to know the shape take these runtime values as operands.
>
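
(Aside, to make the quoted description concrete: a minimal C sketch of the
runtime step, i.e. turning the values of M and N into the 64-byte memory
image that ldtilecfg reads. The field layout is the one described for
palette 1 in the programming reference. The names tile_config,
set_tile_shape and build_tile_config, the use of tmm0..tmm2, and the 4-byte
element size are assumptions made for illustration, not part of the
proposal.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 64-byte image consumed by ldtilecfg: byte 0 is the palette id, byte 1 the
   start row, bytes 16..47 hold bytes-per-row (16 bits per tile) and bytes
   48..63 hold the row count (8 bits per tile). */
typedef struct {
  uint8_t  palette_id;
  uint8_t  start_row;
  uint8_t  reserved[14];
  uint16_t colsb[16];
  uint8_t  rows[16];
} tile_config;

static_assert(sizeof(tile_config) == 64, "ldtilecfg reads exactly 64 bytes");

/* Hypothetical helper: record the shape of physical tile idx (0..7).
   Returns -1 if the shape exceeds 16 rows x 64 bytes per row. */
int set_tile_shape(tile_config *cfg, int idx, int rows, int colsb) {
  if (idx < 0 || idx >= 8 || rows < 1 || rows > 16 || colsb < 1 || colsb > 64)
    return -1;
  cfg->rows[idx] = (uint8_t)rows;
  cfg->colsb[idx] = (uint16_t)colsb;
  return 0;
}

/* Hypothetical: configure tmm0 as MxN, tmm1 as NxM and tmm2 as MxM, where
   M and N count 32-bit elements and are only known at run time. */
int build_tile_config(tile_config *cfg, int m, int n) {
  const int elem = 4; /* assumed element size in bytes */
  memset(cfg, 0, sizeof *cfg);
  cfg->palette_id = 1;
  if (set_tile_shape(cfg, 0, m, n * elem) != 0) return -1;
  if (set_tile_shape(cfg, 1, n, m * elem) != 0) return -1;
  if (set_tile_shape(cfg, 2, m, m * elem) != 0) return -1;
  return 0;
}
```

With M = 8 and N = 16 this gives an 8-row tmm0 with 64 bytes per row, a
16-row tmm1 with 32 bytes per row, and an 8-row tmm2 with 32 bytes per row.
Values of M or N above 16 are rejected rather than silently truncated.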

So you're going to multiversion the code?

In any case, my point is that you probably don't need a custom register 
allocator. If you just define the tile registers and make sure that the 
ldtilecfg implicitly defines them all, then the regular infrastructure 
likely works. You'll have a bunch of register classes, but that's not 
necessarily a problem. I recommend trying this, and let us know what you 
discover, before we go down the road of a new, dedicated allocator just 
for these registers.
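
A toy model in C of the constraint under discussion may help here. It is
not LLVM's allocator and not the proposed pass. It only shows that, under a
single configuration, each physical tile register ends up bound to one
symbolic shape class (MxN, NxM, MxM), after which the configuration can be
read off the assignment. All names (vtile, shape_class, assign_tiles) are
hypothetical, and the input is assumed to be sorted by live-range start.

```c
#include <stddef.h>

enum { NUM_TMM = 8 }; /* palette 1 provides 8 tile registers */

/* Symbolic shape classes: the compiler knows which tiles share a shape,
   not the run-time values of M and N. */
typedef enum { SHAPE_NONE = 0, SHAPE_MxN, SHAPE_NxM, SHAPE_MxM } shape_class;

typedef struct {
  shape_class shape; /* shape class of this virtual tile */
  int start, end;    /* live range as [start, end) instruction indices */
  int phys;          /* output: physical register index, or -1 if none fits */
} vtile;

/* First-fit assignment. A physical register is bound to the shape class of
   the first virtual tile placed in it and is afterwards reused only by
   non-overlapping virtual tiles of that same class, so one configuration
   covers the whole function. Returns the number of virtual tiles that could
   not be placed (these would have to be spilled). */
int assign_tiles(vtile *v, size_t n, shape_class reg_shape[NUM_TMM]) {
  int busy_until[NUM_TMM];
  int unassigned = 0;
  for (int r = 0; r < NUM_TMM; r++) {
    reg_shape[r] = SHAPE_NONE;
    busy_until[r] = 0;
  }
  for (size_t i = 0; i < n; i++) {
    v[i].phys = -1;
    for (int r = 0; r < NUM_TMM; r++) {
      int shape_ok =
          reg_shape[r] == SHAPE_NONE || reg_shape[r] == v[i].shape;
      if (shape_ok && busy_until[r] <= v[i].start) {
        reg_shape[r] = v[i].shape;
        busy_until[r] = v[i].end;
        v[i].phys = r;
        break;
      }
    }
    if (v[i].phys < 0)
      unassigned++;
  }
  return unassigned;
}
```

After the call, reg_shape[] is exactly the information a single ldtilecfg
needs: one shape per physical register, expressed in terms of M and N.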

  -Hal

> There may be some artifacts coming from the front end that 
> conservatively assume a 16x16 tile, but I think those generally go 
> away in SROA or later specialized passes. Yuanke can confirm or 
> correct my understanding of this.
>
> *From:* Hal Finkel <hfinkel at anl.gov>
> *Sent:* Wednesday, August 19, 2020 5:14 AM
> *To:* Luo, Yuanke <yuanke.luo at intel.com>; Kaylor, Andrew 
> <andrew.kaylor at intel.com>; Philip Reames <listmail at philipreames.com>; 
> llvm-dev at lists.llvm.org; florian_hahn at apple.com; Topper, Craig 
> <craig.topper at intel.com>; Lu, Hongjiu <hongjiu.lu at intel.com>
> *Subject:* Re: [llvm-dev] Intel AMX programming model discussion.
>
> On 8/19/20 5:34 AM, Luo, Yuanke wrote:
>
>     There is no problem with having 256 register classes; it is just a
>     lot of register classes to me.
>
>     We don’t assume that the shape of each physical register is 16x16;
>     it is defined by the user. By variable shape, I mean that the shape
>     is known at runtime and unknown at compile time. Take the code below
>     as an example: %row and %col are variables instead of constants. The
>     compiler recognizes llvm.x86.tileloadd64 and deduces that the shape
>     of %0 is %row x %col.
>
>     %0 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16
>     %col, i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf,
>     i64 0, i64 0), i64 32)
>
> When the tile shape is unknown at compile time, how do you plan to do 
> the register allocation of the tiles? My question is: do you do the 
> allocation for this case in the same way as you would if you knew the 
> size was 16x16 (i.e., conservatively assume the largest size)?
>
> Thanks again,
>
> Hal
>
>     *From:* Hal Finkel <hfinkel at anl.gov>
>     *Sent:* Wednesday, August 19, 2020 4:58 PM
>     *To:* Luo, Yuanke <yuanke.luo at intel.com>; Kaylor, Andrew
>     <andrew.kaylor at intel.com>; Philip Reames
>     <listmail at philipreames.com>; llvm-dev at lists.llvm.org;
>     florian_hahn at apple.com; Topper, Craig <craig.topper at intel.com>;
>     Lu, Hongjiu <hongjiu.lu at intel.com>
>     *Subject:* Re: [llvm-dev] Intel AMX programming model discussion.
>
>     On 8/19/20 2:21 AM, Luo, Yuanke wrote:
>
>         Hi Hal,
>
>         There are 3 aspects to be solved.
>
>         1. The HW supports a maximum shape of 16x16, so there are many
>         register classes, from 1x1 to 16x16. We need 256 register classes.
>
>         2. We want to support variable shapes, so the compiler doesn’t
>         know which register class fits a tile's shape, as the shape is
>         only known at runtime.
>
>         3. The tile configuration configures the physical tile registers,
>         so we need to allocate registers first; only then do we know the
>         shape of each physical tile register and can configure it.
>
>         I think your suggestion helps reduce the complexity if we only
>         support fixed (constant) tile shapes.
>
>         -Yuanke
>
>     Thanks, Yuanke.
>
>     It's not clear to me that having 256 register classes is, in
>     itself, a problem. Is it?
>
>     What does it mean to support variable-shape tiles in this context?
>     Do you do something other than conservatively assume that they are
>     16x16 for register-allocation purposes?
>
>      -Hal
>
>         *From:* Hal Finkel <hfinkel at anl.gov>
>         *Sent:* Wednesday, August 19, 2020 8:20 AM
>         *To:* Kaylor, Andrew <andrew.kaylor at intel.com>; Philip Reames
>         <listmail at philipreames.com>; Luo, Yuanke
>         <yuanke.luo at intel.com>; llvm-dev at lists.llvm.org;
>         florian_hahn at apple.com; Topper, Craig
>         <craig.topper at intel.com>; Lu, Hongjiu <hongjiu.lu at intel.com>
>         *Subject:* Re: [llvm-dev] Intel AMX programming model discussion.
>
>         Hi, Andy,
>
>         I don't quite understand everything that's going on here.
>         Could we model this as:
>
>          1. Define a collection of register classes, one for 2x4
>         tiles, one for 4x2 tiles, etc. each populated with a set of
>         tile registers. Registers can have aliasing relationships
>         (instead of worrying about any kind of subregister/superregister
>         relationships -- these won't be useful anyway).
>
>          2. Define the tile-configuration instructions so that they
>         implicitly define all of the registers in all of the classes.
>
>         Then you would still need to pre-schedule the tile operations
>         as you've described, and collect the configuration information
>         in order to add the ldtilecfgs, but the regular register
>         allocator can handle the allocation itself in the usual way.
>         What do you think?
>
>          -Hal
>
>         On 8/18/20 6:58 PM, Kaylor, Andrew via llvm-dev wrote:
>
>             The AMX registers are complicated. The single
>             configuration register (which is mostly used implicitly,
>             similar to MXCSR for floating point) controls the shape of
>             all the tile registers, and if you change the tile
>             configuration every single tile register is cleared. In
>             practice, if we have to change the configuration while
>             any of the tile registers are live, performance is going
>             to be terrible. We need to handle this case for
>             correctness, but users of this programming interface will
>             need to have enough awareness of the performance issues
>             and the hardware details to prevent this. We’ll also want
>             a diagnostic that lets the user know when this has happened.
>
>             When the tile configuration is set, the shape of each tile
>             is locked in, so the individual tile registers aren’t
>             interchangeable at that point. If a function needs 2x4
>             tiles, 4x2 tiles, and 4x4 tiles, the configuration needs
>             to be set with this in mind. The shape isn’t explicit in
>             every instruction and intrinsic. It must be deduced. And
>             again, we’ll need a way to tell the user when efficient
>             allocation can’t be done. In practice, I don’t expect any
>             function to be using more than three tile shapes.
>
>             The implication of all this is that I don’t think the
>             greedy register allocator is well suited to figure all of
>             this out. We need a special pass to pre-allocate these
>             registers. If the function is written in a way that makes
>             good performance possible, it should be a relatively
>             simple task to allocate everything with minimal spilling.
>             If it isn’t possible to get good performance, we don’t
>             need to do anything especially clever. We can just do
>             something straightforward that is correct and let the user
>             know that they aren’t going to be happy with the results.
>
>             -Andy
>
>             *From:* Philip Reames <listmail at philipreames.com>
>             *Sent:* Friday, August 14, 2020 8:29 PM
>             *To:* Luo, Yuanke <yuanke.luo at intel.com>;
>             llvm-dev at lists.llvm.org; florian_hahn at apple.com; Kaylor,
>             Andrew <andrew.kaylor at intel.com>; Topper, Craig
>             <craig.topper at intel.com>; Lu, Hongjiu
>             <hongjiu.lu at intel.com>
>             *Subject:* Re: [llvm-dev] Intel AMX programming model
>             discussion.
>
>             I find your answer unconvincing.  I'm not going to debate
>             it as I don't wish to take the time to build the
>             appropriate context, but my initial response is skepticism.
>
>             Philip
>
>             On 8/14/20 4:49 PM, Luo, Yuanke wrote:
>
>                 [Yuanke] AMX registers are special. They need to be
>                 configured before use, and the config instruction is
>                 expensive. To avoid unnecessary tile configuration, we
>                 collect as much tile shape information as possible
>                 and combine it into one ldtilecfg instruction. The
>                 ldtilecfg instruction should dominate any AMX
>                 instruction that accesses a tile register. On the other
>                 hand, the ldtilecfg should post-dominate the
>                 instructions that define the tile shape. When a tile
>                 register is spilled, reconfiguration due to a different
>                 tile shape should be avoided: the spilled register
>                 should be reloaded into a register that has the same
>                 tile shape. Since tile register allocation is special
>                 and may allocate general virtual registers to configure
>                 the tile registers, we can add a separate pass to do it
>                 before the general register allocation pass. After
>                 register allocation, the tile shape information is no
>                 longer needed, so we can transform the pseudo AMX
>                 instructions into real AMX instructions by removing the
>                 row and column operands.
>
>                 [Philip]
>
>                 This seems complicated.
>
>                 Reading through the documentation, there appears to be
>                 a single global tile config for all tile registers at
>                 any time.
>
>                 Why not simply model this tile config as a designated
>                 special register and the tile instructions as having
>                 an implicit use of this register?  That would seem to
>                 ensure that the register allocator has all the
>                 constraints needed.  You'd need to teach it how to
>                 spill the special registers with the appropriate
>                 instructions, but that seems a lot more straight forward?
>
>                 [Yuanke] In that case the user needs to configure the
>                 tile registers themselves. Spilling the configuration
>                 register is very expensive, because it clears all the
>                 tile data registers to zero. In our proposal, the
>                 compiler is responsible for deducing the shape of each
>                 virtual tile data register, allocating physical
>                 registers for them, and then configuring those physical
>                 registers. We may build the dependency as you proposed,
>                 and it can be used as a machine IR check to ensure that
>                 a tile data register is configured before use.
>
>                 *From:* Philip Reames <listmail at philipreames.com>
>                 *Sent:* Saturday, August 15, 2020 1:17 AM
>                 *To:* Luo, Yuanke <yuanke.luo at intel.com>;
>                 llvm-dev at lists.llvm.org; florian_hahn at apple.com;
>                 Kaylor, Andrew <andrew.kaylor at intel.com>; Topper,
>                 Craig <craig.topper at intel.com>; Lu, Hongjiu
>                 <hongjiu.lu at intel.com>
>                 *Subject:* Re: [llvm-dev] Intel AMX programming model
>                 discussion.
>
>                 On 8/14/20 6:27 AM, Luo, Yuanke via llvm-dev wrote:
>
>                     Hi,
>
>                     Intel Advanced Matrix Extensions (Intel AMX) is a
>                     new programming paradigm consisting of two
>                     components: a set of 2-dimensional registers
>                     (tiles) representing sub-arrays from a larger
>                     2-dimensional memory image, and accelerators able
>                     to operate on tiles. Capability of Intel AMX
>                     implementation is enumerated by palettes. Two
>                     palettes are supported: palette 0 represents the
>                     initialized state and palette 1 consists of 8 tile
>                     registers of up to 1 KB size, which is controlled
>                     by a tile control register.
>
>                     The instruction manual is posted at
>                     https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html.
>
>                     The AMX ABI proposal is posted at
>                     https://groups.google.com/g/x86-64-abi/c/NRejFm7pwb4.
>
>                     This email is to discuss the programming model for
>                     AMX. Florian has introduced the matrix type and
>                     intrinsics in LLVM community. We’d like to adopt
>                     some ideas from it.
>
>                     Here is what we propose for the AMX programming model.
>
>                     1. Data type.
>
>                     We’d like to have a fixed vector type for AMX. Since
>                     the shape of an AMX register is configurable, the
>                     vector size is the maximum size of an AMX register.
>                     That means the vector size is 1024 bytes.
>
>                     The C code may look like this.
>
>                     typedef int _tile_data
>                     __attribute__((__vector_size__(1024),
>                     __aligned__(64)));
>
>                     _tile_data tile;
>
>                     And the LLVM IR may look like this.
>
>                     @tile = dso_local local_unnamed_addr global <256 x
>                     i32> zeroinitializer, align 64
>
>                     For llvm IR, it is nice to have a new type
>                     x86_amxtile that can be mapped to AMX registers.
>
>                     2. AMX Intrinsics.
>
>                     The internal intrinsics are mapped 1:1 to AMX
>                     instructions. The parameters m, n, and k identify
>                     the shape of the tile. The shape can be variable,
>                     but it cannot exceed the size that the AMX HW
>                     supports. The compiler can deduce the shape of the
>                     tile from the AMX intrinsics.
>
>                     _tile_data _tile_loadd_internal(char m, short n,
>                     const void *base, int stride);
>
>                     _tile_data _tile_dpbssd_internal(char m, short n,
>                     short k, _tile_data dst, _tile_data src1,
>                     _tile_data src2);
>
>                     _tile_data _tile_dpbf16ps_internal(char m, short
>                     n, short k, _tile_data dst, _tile_data src1,
>                     _tile_data src2);
>
>                     void _tile_stored_internal(char m, short n, void
>                     *base, int stride, _tile_data tile);
>
>                     3. User interfaces.
>
>                     The tile shape and tile data are combined into a
>                     struct in C. The shape of the tile may only be
>                     initialized once. The user interface looks like
>                     this.
>
>                     #define __DEFAULT_FN_AMX \
>                       __attribute__((__always_inline__, __nodebug__, __target__("amx-int8")))
>
>                     typedef struct __tile_str {
>                       const char row;
>                       const short col;
>                       _tile_data tile;
>                     } __tile;
>
>                     __DEFAULT_FN_AMX
>                     void __tile_loadd(__tile *dst, const void *base, long stride) {
>                       dst->tile = _tile_loadd_internal(dst->row, dst->col, base, stride);
>                     }
>
>                     __DEFAULT_FN_AMX
>                     void __tile_dpbsud(__tile *dst, __tile src1, __tile src2) {
>                       dst->tile = _tile_dpbssd_internal(src1.row, src2.col, src1.col,
>                                                         dst->tile, src1.tile, src2.tile);
>                     }
>
>                     __DEFAULT_FN_AMX
>                     void __tile_stored(void *base, long stride, __tile src) {
>                       _tile_stored_internal(src.row, src.col, base, stride, src.tile);
>                     }
>
>                     4. Example code
>
>                     The example shows how to use the user interface in
>                     a function.
>
>                     void api(int cond, short row, short col) {
>                       __tile a = {row, col};
>                       __tile b = {row, col};
>                       __tile c = {row, col};
>
>                       if (cond) {
>                         __tile_loadd(&a, buf, STRIDE);
>                         __tile_loadd(&b, buf, STRIDE);
>                         __tile_loadd(&c, buf, STRIDE);
>                       } else {
>                         __tile_loadd(&a, buf2, STRIDE);
>                         __tile_loadd(&b, buf2, STRIDE);
>                         __tile_loadd(&c, buf2, STRIDE);
>                       }
>                       __tile_dpbsud(&c, a, b);
>                       __tile_stored(buf, STRIDE, c);
>                     }
>
>                     5. LLVM IR
>
>                     The LLVM IR intrinsics take the row and column
>                     information as input parameters so that the
>                     compiler can deduce the shape of the tile data. The
>                     remaining parameters are what the AMX instructions
>                     require. This is the LLVM IR corresponding to the
>                     example code.
>
>                     define dso_local void @api(i32 %cond, i16 signext %row,
>                         i16 signext %col) local_unnamed_addr #2 {
>                     entry:
>                       %tobool = icmp eq i32 %cond, 0
>                       %sext = shl i16 %col, 8
>                       %conv.i31 = ashr exact i16 %sext, 8
>                       br i1 %tobool, label %if.else, label %if.then
>
>                     if.then:                          ; preds = %entry
>                       %0 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf, i64 0, i64 0),
>                           i64 32) #3
>                       %1 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf, i64 0, i64 0),
>                           i64 32) #3
>                       %2 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf, i64 0, i64 0),
>                           i64 32) #3
>                       br label %if.end
>
>                     if.else:                          ; preds = %entry
>                       %3 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf2, i64 0, i64 0),
>                           i64 32) #3
>                       %4 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf2, i64 0, i64 0),
>                           i64 32) #3
>                       %5 = tail call <256 x i32> @llvm.x86.tileloadd64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf2, i64 0, i64 0),
>                           i64 32) #3
>                       br label %if.end
>
>                     if.end:                           ; preds = %if.else, %if.then
>                       %a.sroa.1186.0 = phi <256 x i32> [ %3, %if.else ], [ %0, %if.then ]
>                       %b.sroa.1068.0 = phi <256 x i32> [ %4, %if.else ], [ %1, %if.then ]
>                       %c.sroa.1149.0 = phi <256 x i32> [ %5, %if.else ], [ %2, %if.then ]
>                       %6 = tail call <256 x i32> @llvm.x86.tdpbssd(i16 %row, i16 %conv.i31,
>                           i16 %conv.i31, <256 x i32> %c.sroa.1149.0, <256 x i32> %a.sroa.1186.0,
>                           <256 x i32> %b.sroa.1068.0) #3
>                       tail call void @llvm.x86.tilestored64(i16 %row, i16 %conv.i31,
>                           i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @buf, i64 0, i64 0),
>                           i64 32, <256 x i32> %6) #3
>                       ret void
>                     }
>
>                     6. Shape propagation
>
>                     In an -O0 build, the front end generates some
>                     general loads and stores for the tile vector. We
>                     need to start from the AMX intrinsics and propagate
>                     the shape information to the virtual tile
>                     registers. If an AMX intrinsic uses the result of a
>                     load instruction, the shape is propagated to the
>                     load and the load is transformed into a tile load
>                     intrinsic. If a store instruction uses the result
>                     of an AMX intrinsic, the shape is propagated to the
>                     store and the store is transformed into a tile
>                     store intrinsic.
>
>                     7. Machine IR
>
>                     Since the AMX intrinsics take the row and column as
>                     input parameters, we can create a pseudo
>                     instruction corresponding to each of them. The AMX
>                     intrinsics are lowered to pseudo AMX instructions,
>                     which have extra row and column operands
>                     corresponding to those of the AMX intrinsic. The
>                     real AMX instructions don’t need the row and column
>                     operands. The row and column information should be
>                     configured by ldtilecfg before executing any AMX
>                     instruction.
>
>                     8. Register allocation
>
>                     AMX registers are special. They need to be
>                     configured before use, and the config instruction
>                     is expensive. To avoid unnecessary tile
>                     configuration, we collect as much tile shape
>                     information as possible and combine it into one
>                     ldtilecfg instruction. The ldtilecfg instruction
>                     should dominate any AMX instruction that accesses a
>                     tile register. On the other hand, the ldtilecfg
>                     should post-dominate the instructions that define
>                     the tile shape. When a tile register is spilled,
>                     reconfiguration due to a different tile shape
>                     should be avoided: the spilled register should be
>                     reloaded into a register that has the same tile
>                     shape. Since tile register allocation is special
>                     and may allocate general virtual registers to
>                     configure the tile registers, we can add a separate
>                     pass to do it before the general register
>                     allocation pass. After register allocation, the
>                     tile shape information is no longer needed, so we
>                     can transform the pseudo AMX instructions into real
>                     AMX instructions by removing the row and column
>                     operands.
>
>                 This seems complicated.
>
>                 Reading through the documentation, there appears to be
>                 a single global tile config for all tile registers at
>                 any time.
>
>                 Why not simply model this tile config as a designated
>                 special register and the tile instructions as having
>                 an implicit use of this register?  That would seem to
>                 ensure that the register allocator has all the
>                 constraints needed.  You'd need to teach it how to
>                 spill the special registers with the appropriate
>                 instructions, but that seems a lot more straight forward?
>
>                     9. Use recommendation
>
>                     Due to the shape configuration issue, we recommend
>                     that users define the tile shape at function entry
>                     and inline functions as much as possible. The AMX
>                     instructions focus on computation instead of
>                     storage, so global variables for tile data are not
>                     recommended.
>
>                     Thanks
>
>                     Yuanke
>
>
>
>
>
>
>
>                     _______________________________________________
>
>                     LLVM Developers mailing list
>
>                     llvm-dev at lists.llvm.org
>
>                     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
