[llvm-dev] How to describe the RegisterInfo?

Tue Aug 23 10:32:14 PDT 2016

On Mon, Aug 22, 2016 at 09:46:10PM +0800, Ruiling Song via llvm-dev wrote:
> Hello Everyone,
> 
> I am trying to write a new LLVM backend targeting Intel GPUs,
> starting with the OpenCL language.
> I am not very familiar with the LLVM backend infrastructure,
> and I am having trouble describing the RegisterInfo.
> 
> Intel GPUs launch many hardware threads to run GPGPU workloads.
> Each hardware thread has 128 registers (r0-r127), each 32 bytes wide.
> Each hardware thread may run in SIMD 8/16/32 mode, which maps to
> 8/16/32 OpenCL work-items. The SIMD width is chosen at
> compile time (normally according to register pressure; a bigger SIMD
> width means higher register pressure).
> Note that each instruction has its own exec-width, which may differ from
> the program's SIMD width.
> Normally we allocate contiguous registers for a divergent value.
> For example, in a program compiled as SIMD 8, a divergent float/i32
> value needs 4 bytes * 8 = 32 bytes. A 'short' value needs only
> 2 bytes * 8 = 16 bytes, that is, half of a 32-byte register.
> We may also allocate registers for 'uniform' values. A uniform value
> needs only a type-sized register, without multiplying by the SIMD width,
> so a uniform float/i32 value needs only 4 bytes of a physical register.
> Thus a 32-byte register can hold up to 8 different uniform float/i32 values.
> 
> Sometimes we also need to access a register with a stride. For a bitcast
> from i64 to v2i32, for example,
> we need to access the i64 register with a horizontal stride of 2.
> In the example below, the i64 value is held in r10 and r11. L/H stand for
> the low/high 32 bits.
> The SIMD width of the program is SIMD 8, so we have 8 L/H pairs.
> r10: L H L H L H L H
> r11: L H L H L H L H
> The two instructions below extract the low and high 32-bit parts.
> mov(8 | M0) r12.0<1>, r10.0<8,4,2>:D
> mov(8 | M0) r13.0<1>, r10.1<8,4,2>:D
> (The format of a register region is
> RegNum.regSubNum<vertStride, width, horzStride>:type)
> (Note the regSubNum is measured in units of the register type here.)
> Then r12/r13 contain the result vector components.
> You can refer to the link below for more details on Intel GPU assembly and
> register usage:
> https://software.intel.com/en-us/articles/introduction-to-gen-assembly
> 
> I notice that the hardware encoding of a register is 16 bits, which is not
> enough to encode all the register region parameters (regNum, type,
> hstride, vstride, width, ...) in RegisterInfo.td. I am not sure where
> the stride/type/width information for a physical register should live.
> Maybe some other .cpp file is more suitable than the RegisterInfo.td file?
> I ask because I need to change the register region parameters in the
> bitcast instruction (from qword with hstride 1 to dword with hstride 2).
> At which stage should this bitcast logic run? After reg-alloc?
> 

Hi,

I would recommend encoding some of the register region parameters as part
of the instruction rather than using the register encoding, because
something like 'width' seems more like a property of the instruction
than of the register to me.

-Tom
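
To make the region idea concrete, here is a minimal sketch (plain Python,
not LLVM code) of how a region attached to a source operand selects
elements. It assumes the usual row/column reading of a region: channel i
sits in row i / width and column i % width, rows are vertStride elements
apart, and columns are horzStride elements apart. The Region class and
channel_locations are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass

REG_BYTES = 32  # each register is 32 bytes, per the description quoted above


@dataclass(frozen=True)
class Region:
    """Hypothetical source-operand descriptor; field names are illustrative."""
    reg_num: int      # e.g. 10 for r10
    sub_reg_num: int  # measured in units of the element type
    vert_stride: int  # elements between the starts of consecutive rows
    width: int        # elements per row
    horz_stride: int  # elements between neighbours within a row
    elem_bytes: int   # 4 for :D


def channel_locations(region, exec_size):
    """Return (register number, byte offset) for each of exec_size channels."""
    locations = []
    for channel in range(exec_size):
        row, col = divmod(channel, region.width)
        elem = (region.sub_reg_num
                + row * region.vert_stride
                + col * region.horz_stride)
        byte = region.reg_num * REG_BYTES + elem * region.elem_bytes
        # Splitting the flat byte address yields the register it falls in.
        locations.append(divmod(byte, REG_BYTES))
    return locations


# The two source operands from the quoted mov instructions.
low = channel_locations(Region(10, 0, 8, 4, 2, 4), exec_size=8)
high = channel_locations(Region(10, 1, 8, 4, 2, 4), exec_size=8)
print(low)
print(high)
```

Under these assumptions the first operand reads the dwords at byte offsets
0, 8, 16, 24 of r10 and then of r11, and the second reads offsets 4, 12,
20, 28. Only the starting sub-register differs between the two; the
strides and width are the same, and all of them change per use of the
register rather than per register.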

> The detailed hardware spec is located at:
> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-bdw-vol07-3d_media_gpgpu_3.pdf
> Page 921 describes the detailed instruction encoding format.
> It needs (regFile, regNum, subRegNum, width, type, addrMode, hStride,
> vStride) to describe a register.
> 
> I have attached the first version of my RegisterInfo.td.
> I also have a question about the attached RegisterInfo.td file. Do I
> have to define different SubRegIndex records
> like the ones below to make TableGen work correctly?
> 
> foreach Index = 0-15 in {
>  // used as SubRegIndex when declaring gpr_d_simd8
>  def subd#Index : SubRegIndex<32, !shl(Index, 5)>;
>  // used as SubRegIndex when declaring gpr_w_simd8
>  def subw#Index : SubRegIndex<16, !shl(Index, 4)>;
>  ...
> }
> 
> If anything I said is unclear, just reply to this mail. Thanks for any help!
> 
> Thanks!
> Ruiling
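
For reference, the two SubRegIndex arguments are a size and an offset,
both in bits, so !shl(Index, 5) is Index * 32 and !shl(Index, 4) is
Index * 16. The sketch below (plain Python, not TableGen) expands the two
defs from the quoted foreach and checks which indices fit inside a single
32-byte register. sub_reg_indices is a hypothetical helper written only
for this illustration.

```python
REG_BITS = 32 * 8  # one 32-byte register


def sub_reg_indices(prefix, size_bits, shift, count=16):
    """Expand one def of the quoted foreach into (name, size, offset) tuples."""
    return [(f"{prefix}{i}", size_bits, i << shift) for i in range(count)]


subd = sub_reg_indices("subd", 32, 5)  # dword sub-registers
subw = sub_reg_indices("subw", 16, 4)  # word sub-registers

# Indices whose bits lie entirely inside the first 32-byte register.
subd_in_first_reg = [name for name, size, off in subd if off + size <= REG_BITS]
subw_in_first_reg = [name for name, size, off in subw if off + size <= REG_BITS]

print(subd_in_first_reg)
print(len(subw_in_first_reg))
```

With Index running from 0 to 15, all sixteen word indices fit in one
register, while only subd0 to subd7 do; subd8 to subd15 start at bit 256
or later, so they only make sense for a value that spans two registers.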
