[www-releases] r372328 - Check in 9.0.0 source and docs

Hans Wennborg via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 19 07:32:55 PDT 2019


Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_attr.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_attr.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_attr.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_attr.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,30 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_attr:
+
+attr
+===========================
+
+Interpolation attribute and channel:
+
+    ============== ===================================
+    Syntax         Description
+    ============== ===================================
+    attr{0..32}.x  Attribute 0..32 with *x* channel.
+    attr{0..32}.y  Attribute 0..32 with *y* channel.
+    attr{0..32}.z  Attribute 0..32 with *z* channel.
+    attr{0..32}.w  Attribute 0..32 with *w* channel.
+    ============== ===================================
+
+Examples:
+
+.. parsed-literal::
+
+    v_interp_p1_f32 v1, v0, attr0.x
+    v_interp_p1_f32 v1, v0, attr32.w
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_addr.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_addr.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_addr.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_addr.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_base_smem_addr:
+
+sbase
+===========================
+
+A 64-bit base address for scalar memory operations.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_base_smem_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_base_smem_buf:
+
+sbase
+===========================
+
+A 128-bit buffer resource constant for scalar memory operations which provides a base address, a size and a stride.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_bimm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_bimm32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_bimm32:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_data_buf_atomic128:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_data_buf_atomic32:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_buf_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_data_buf_atomic64:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_cmp.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_cmp.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_cmp.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_cmp.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,27 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_data_mimg_atomic_cmp:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+  Note. The surface data format is indicated in the image resource constant but not in the instruction.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_reg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_reg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_reg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_atomic_reg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,26 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_data_mimg_atomic_reg:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+  Note. The surface data format is indicated in the image resource constant but not in the instruction.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_store.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_store.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_store.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_data_mimg_store.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,18 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_data_mimg_store:
+
+vdata
+===========================
+
+Image data to store by an *image_store* instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` which may specify from 1 to 4 data elements. Each data element occupies 1 dword.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_buf_128:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_buf_64:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_96.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_96.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_96.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_96.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_buf_96:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 3 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_lds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_lds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_lds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_buf_lds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_buf_lds:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+If :ref:`lds<amdgpu_synid_lds>` is specified, this operand is ignored by H/W and data are stored directly into LDS.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+    Note that :ref:`tfe<amdgpu_synid_tfe>` and :ref:`lds<amdgpu_synid_lds>` cannot be used together.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_flat_atomic32:
+
+vdst
+===========================
+
+Data returned by a 32-bit atomic flat instruction.
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_flat_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_flat_atomic64:
+
+vdst
+===========================
+
+Data returned by a 64-bit atomic flat instruction.
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_gather4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_gather4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_gather4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_gather4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_mimg_gather4:
+
+vdst
+===========================
+
+Image data to load by an *image_gather4* instruction.
+
+*Size:* 4 data elements by default. Each data element occupies 1 dword. :ref:`tfe<amdgpu_synid_tfe>` adds one more dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_regular.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_regular.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_regular.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_dst_mimg_regular.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,20 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_dst_mimg_regular:
+
+vdst
+===========================
+
+Image data to load by an image instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_fimm32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_fimm32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_fimm32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_fimm32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_fimm32:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_hwreg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_hwreg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_hwreg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_hwreg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,60 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_hwreg:
+
+hwreg
+===========================
+
+Bits of a hardware register being accessed.
+
+The bits of this operand have the following meaning:
+
+    ============ ===================================
+    Bits         Description
+    ============ ===================================
+    5:0          Register *id*.
+    10:6         First bit *offset* (0..31).
+    15:11        *Size* in bits (1..32).
+    ============ ===================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
+
+    ==================================== ============================================================================
+    Syntax                               Description
+    ==================================== ============================================================================
+    hwreg({0..63})                       All bits of a register indicated by its *id*.
+    hwreg(<*name*>)                      All bits of a register indicated by its *name*.
+    hwreg({0..63}, {0..31}, {1..32})     Register bits indicated by register *id*, first bit *offset* and *size*.
+    hwreg(<*name*>, {0..31}, {1..32})    Register bits indicated by register *name*, first bit *offset* and *size*.
+    ==================================== ============================================================================
+
+Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+Defined register *names* include:
+
+    =================== ==========================================
+    Name                Description
+    =================== ==========================================
+    HW_REG_MODE         Shader writeable mode bits.
+    HW_REG_STATUS       Shader read-only status.
+    HW_REG_TRAPSTS      Trap status.
+    HW_REG_HW_ID        Id of wave, simd, compute unit, etc.
+    HW_REG_GPR_ALLOC    Per-wave SGPR and VGPR allocation.
+    HW_REG_LDS_ALLOC    Per-wave LDS allocation.
+    HW_REG_IB_STS       Counters of outstanding instructions.
+    =================== ==========================================
+
+Examples:
+
+.. parsed-literal::
+
+    s_getreg_b32 s2, 0x6
+    s_getreg_b32 s2, hwreg(15)
+    s_getreg_b32 s2, hwreg(51, 1, 31)
+    s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_label.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_label.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_label.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_label.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,30 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_label:
+
+label
+===========================
+
+A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
+
+This operand may be specified as:
+
+* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
+* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
+* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
+
+Examples:
+
+.. parsed-literal::
+
+  offset = 30
+  s_branch loop_end
+  s_branch 2 + offset
+  s_branch 32
+  loop_end:
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_mod.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_mod.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_mod.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_mod.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_mod:
+
+m
+===========================
+
+This operand may be used with floating point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_msg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_msg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_msg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_msg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,72 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_msg:
+
+msg
+===========================
+
+A 16-bit message code. The bits of this operand have the following meaning:
+
+    ============ ======================================================
+    Bits         Description
+    ============ ======================================================
+    3:0          Message *type*.
+    6:4          Optional *operation*.
+    9:7          Optional *parameters*.
+    15:10        Unused.
+    ============ ======================================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
+
+    ======================================== ========================================================================
+    Syntax                                   Description
+    ======================================== ========================================================================
+    sendmsg(<*type*>)                        A message identified by its *type*.
+    sendmsg(<*type*>, <*op*>)                A message identified by its *type* and *operation*.
+    sendmsg(<*type*>, <*op*>, <*stream*>)    A message identified by its *type* and *operation* with a stream *id*.
+    ======================================== ========================================================================
+
+*Type* may be specified using message *name* or message *id*.
+
+*Op* may be specified using operation *name* or operation *id*.
+
+Stream *id* is an integer in the range 0..3.
+
+Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+Each message type supports specific operations:
+
+    ================= ========== ============================== ============ ==========
+    Message name      Message Id Supported Operations           Operation Id Stream Id
+    ================= ========== ============================== ============ ==========
+    MSG_INTERRUPT     1          \-                             \-           \-
+    MSG_GS            2          GS_OP_CUT                      1            Optional
+    \                            GS_OP_EMIT                     2            Optional
+    \                            GS_OP_EMIT_CUT                 3            Optional
+    MSG_GS_DONE       3          GS_OP_NOP                      0            \-
+    \                            GS_OP_CUT                      1            Optional
+    \                            GS_OP_EMIT                     2            Optional
+    \                            GS_OP_EMIT_CUT                 3            Optional
+    MSG_SYSMSG        15         SYSMSG_OP_ECC_ERR_INTERRUPT    1            \-
+    \                            SYSMSG_OP_REG_RD               2            \-
+    \                            SYSMSG_OP_HOST_TRAP_ACK        3            \-
+    \                            SYSMSG_OP_TTRACE_PC            4            \-
+    ================= ========== ============================== ============ ==========
+
+Examples:
+
+.. parsed-literal::
+
+    s_sendmsg 0x12
+    s_sendmsg sendmsg(MSG_INTERRUPT)
+    s_sendmsg sendmsg(2, GS_OP_CUT)
+    s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
+    s_sendmsg sendmsg(MSG_GS, 2)
+    s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
+    s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_offset_buf:
+
+soffset
+===========================
+
+An unsigned byte offset.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_smem.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_smem.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_smem.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_offset_smem.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_offset_smem:
+
+soffset
+===========================
+
+An unsigned offset added to the base address to get memory address.
+
+* If offset is specified as a register, it supplies an unsigned byte offset but 2 lsb's are ignored.
+* If offset is specified as an :ref:`uimm32<amdgpu_synid_uimm32>`, it supplies a 32-bit unsigned byte offset but 2 lsb's are ignored.
+* If offset is specified as an :ref:`uimm8<amdgpu_synid_uimm8>`, it supplies an 8-bit unsigned dword offset.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`uimm8<amdgpu_synid_uimm8>`, :ref:`uimm32<amdgpu_synid_uimm32>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_opt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_opt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_opt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_opt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_opt:
+
+opt
+===========================
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_param.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_param.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_param.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_param.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_param:
+
+param
+===========================
+
+Interpolation parameter to read:
+
+    ============ ===================================
+    Syntax       Description
+    ============ ===================================
+    p0           Parameter *P0*.
+    p10          Parameter *P10*.
+    p20          Parameter *P20*.
+    ============ ===================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ret.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ret.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ret.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ret.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ret:
+
+dst
+===========================
+
+This is an input operand. It may optionally serve as a destination if :ref:`glc<amdgpu_synid_glc>` is specified.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_rsrc_buf:
+
+srsrc
+===========================
+
+Buffer resource constant which defines the address and characteristics of the buffer in memory.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_rsrc_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_rsrc_mimg:
+
+srsrc
+===========================
+
+Image resource constant which defines the location of the image buffer in memory, its dimensions, tiling, and data format.
+
+*Size:* 8 dwords by default, 4 dwords if :ref:`r128<amdgpu_synid_r128>` is specified.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_samp_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_samp_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_samp_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_samp_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_samp_mimg:
+
+ssamp
+===========================
+
+Sampler constant used to specify filtering options applied to the image data after it is read.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst128_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst256_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst256_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst256_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst256_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst256_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 8 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst32_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst32_1:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst32_2:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst512_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst512_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst512_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst512_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst512_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 16 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst64_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_sdst64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_sdst64_1:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_simm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_simm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_simm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_simm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_simm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_0:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_1:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`iconst<amdgpu_synid_iconst>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_2:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_3:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_4:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_5.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_5.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_5.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_5.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_5:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_6.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_6.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_6.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src32_6.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src32_6:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`iconst<amdgpu_synid_iconst>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src64_0:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src64_1:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src64_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src64_2:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`iconst<amdgpu_synid_iconst>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src_exp.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src_exp.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src_exp.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_src_exp.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,28 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_src_exp:
+
+vsrc
+===========================
+
+Data to copy to export buffers. This is an optional operand. Must be specified as :ref:`off<amdgpu_synid_off>` if not used.
+
+:ref:`compr<amdgpu_synid_compr>` modifier indicates use of compressed (16-bit) data. This limits number of source operands from 4 to 2:
+
+* src0 and src1 must specify the first register (or :ref:`off<amdgpu_synid_off>`).
+* src2 and src3 must specify the second register (or :ref:`off<amdgpu_synid_off>`).
+
+An example:
+
+.. parsed-literal::
+
+  exp mrtz v3, v3, off, off compr
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_0:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_1:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_2:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_3:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`iconst<amdgpu_synid_iconst>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_4:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_5.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_5.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_5.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_5.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_5:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`iconst<amdgpu_synid_iconst>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_6.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_6.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_6.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc32_6.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc32_6:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc64_0:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc64_1:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc64_2:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_ssrc64_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_ssrc64_3:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_tgt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_tgt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_tgt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_tgt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,24 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_tgt:
+
+tgt
+===========================
+
+An export target:
+
+    ============== ===================================
+    Syntax         Description
+    ============== ===================================
+    pos{0..3}      Copy vertex position 0..3.
+    param{0..31}   Copy vertex parameter 0..31.
+    mrt{0..7}      Copy pixel color to the MRTs 0..7.
+    mrtz           Copy pixel depth (Z) data.
+    null           Copy nothing.
+    ============== ===================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_type_dev.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_type_dev.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_type_dev.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_type_dev.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_type_dev:
+
+Type deviation
+===========================
+
+*Type* of this operand differs from *type* :ref:`implied by the opcode<amdgpu_syn_instruction_type>`. This tag specifies actual operand *type*.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_uimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_uimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_uimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_uimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_uimm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vcc_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vcc_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vcc_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vcc_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vcc_64:
+
+vcc
+===========================
+
+Vector condition code.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`vcc<amdgpu_synid_vcc>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdata128_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdata32_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdata64_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata96_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata96_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata96_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdata96_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdata96_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 3 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdst128_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdst32_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdst64_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst96_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst96_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst96_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vdst96_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vdst96_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 3 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vsrc128_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vsrc32_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vsrc32_1:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_vsrc64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_vsrc64_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_waitcnt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_waitcnt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_waitcnt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx7_waitcnt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,55 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid7_waitcnt:
+
+waitcnt
+===========================
+
+Counts of outstanding instructions to wait for.
+
+The bits of this operand have the following meaning:
+
+    ============ ======================================================
+    Bits         Description
+    ============ ======================================================
+    3:0          VM_CNT: vector memory operations count.
+    6:4          EXP_CNT: export count.
+    12:8         LGKM_CNT: LDS, GDS, Constant and Message count.
+    ============ ======================================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
+or as a combination of the following symbolic helpers:
+
+    ====================== ======================================================================
+    Syntax                 Description
+    ====================== ======================================================================
+    vmcnt(<*N*>)           VM_CNT value. *N* must not exceed the largest VM_CNT value.
+    expcnt(<*N*>)          EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
+    lgkmcnt(<*N*>)         LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
+    vmcnt_sat(<*N*>)       VM_CNT value computed as min(*N*, the largest VM_CNT value).
+    expcnt_sat(<*N*>)      EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
+    lgkmcnt_sat(<*N*>)     LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
+    ====================== ======================================================================
+
+These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
+
+*N* is either an
+:ref:`integer number<amdgpu_synid_integer_number>` or an
+:ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+    s_waitcnt 0
+    s_waitcnt vmcnt(1)
+    s_waitcnt expcnt(2) lgkmcnt(3)
+    s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
+    s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
+    s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_addr_buf:
+
+vaddr
+===========================
+
+This is an optional operand which may specify offset and/or index.
+
+*Size:* 0, 1 or 2 dwords. Size is controlled by modifiers :ref:`offen<amdgpu_synid_offen>` and :ref:`idxen<amdgpu_synid_idxen>`:
+
+* If only :ref:`idxen<amdgpu_synid_idxen>` is specified, this operand supplies an index. Size is 1 dword.
+* If only :ref:`offen<amdgpu_synid_offen>` is specified, this operand supplies an offset. Size is 1 dword.
+* If both modifiers are specified, index is in the first register and offset is in the second. Size is 2 dwords.
+* If none of these modifiers are specified, this operand must be set to :ref:`off<amdgpu_synid_off>`.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_ds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_ds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_ds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_ds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_addr_ds:
+
+vaddr
+===========================
+
+An offset from the start of GDS/LDS memory.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_flat.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_flat.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_flat.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_flat.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_addr_flat:
+
+vaddr
+===========================
+
+A 64-bit flat address.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_addr_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_addr_mimg:
+
+vaddr
+===========================
+
+Image address which includes from one to four dimensional coordinates and other data used to locate a position in the image.
+
+*Size:* 1, 2, 3, 4, 8 or 16 dwords. Actual size depends on opcode and specific image being handled.
+
+    Note 1. Image format and dimensions are encoded in the image resource constant but not in the instruction.
+
+    Note 2. Actually image address size may vary from 1 to 13 dwords, but assembler currently supports a limited range of register sequences.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_attr.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_attr.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_attr.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_attr.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,30 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_attr:
+
+attr
+===========================
+
+Interpolation attribute and channel:
+
+    ============== ===================================
+    Syntax         Description
+    ============== ===================================
+    attr{0..32}.x  Attribute 0..32 with *x* channel.
+    attr{0..32}.y  Attribute 0..32 with *y* channel.
+    attr{0..32}.z  Attribute 0..32 with *z* channel.
+    attr{0..32}.w  Attribute 0..32 with *w* channel.
+    ============== ===================================
+
+Examples:
+
+.. parsed-literal::
+
+    v_interp_p1_f32 v1, v0, attr0.x
+    v_interp_p1_f32 v1, v0, attr32.w
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_addr.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_addr.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_addr.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_addr.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_base_smem_addr:
+
+sbase
+===========================
+
+A 64-bit base address for scalar memory operations.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_base_smem_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_base_smem_buf:
+
+sbase
+===========================
+
+A 128-bit buffer resource constant for scalar memory operations which provides a base address, a size and a stride.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_bimm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_bimm32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_bimm32:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_atomic128:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_atomic32:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_atomic64:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,20 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_d16_128:
+
+vdata
+===========================
+
+16-bit data to store by a buffer instruction.
+
+*Size:* depends on GFX8 GPU revision:
+
+* 4 dwords for GFX8.0. This H/W supports no packing.
+* 2 dwords for GFX8.1+. This H/W supports data packing.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_d16_32:
+
+vdata
+===========================
+
+16-bit data to store by a buffer instruction.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,20 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_d16_64:
+
+vdata
+===========================
+
+16-bit data to store by a buffer instruction.
+
+*Size:* depends on GFX8 GPU revision:
+
+* 2 dwords for GFX8.0. This H/W supports no packing.
+* 1 dword for GFX8.1+. This H/W supports data packing.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_96.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_96.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_96.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_buf_d16_96.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,20 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_buf_d16_96:
+
+vdata
+===========================
+
+16-bit data to store by a buffer instruction.
+
+*Size:* depends on GFX8 GPU revision:
+
+* 3 dwords for GFX8.0. This H/W supports no packing.
+* 2 dwords for GFX8.1+. This H/W supports data packing.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_cmp.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_cmp.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_cmp.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_cmp.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,27 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_mimg_atomic_cmp:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+  Note. The surface data format is indicated in the image resource constant but not in the instruction.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_reg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_reg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_reg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_atomic_reg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,26 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_mimg_atomic_reg:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+  Note. The surface data format is indicated in the image resource constant but not in the instruction.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,18 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_mimg_store:
+
+vdata
+===========================
+
+Image data to store by an *image_store* instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` which may specify from 1 to 4 data elements. Each data element occupies 1 dword.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store_d16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store_d16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store_d16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_data_mimg_store_d16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,24 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_data_mimg_store_d16:
+
+vdata
+===========================
+
+Image data to store by an *image_store* instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`d16<amdgpu_synid_d16>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
+* :ref:`d16<amdgpu_synid_d16>` has different meaning for GFX8.0 and GFX8.1:
+
+  * For GFX8.0 this modifier does not affect size of data elements in registers. Data in registers are stored in low 16 bits, high 16 bits are unused. There is no packing.
+  * Starting from GFX8.1 this modifier specifies that data elements in registers are packed; each value occupies 16 bits.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_128:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_64:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_96.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_96.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_96.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_96.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_96:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 3 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_d16_128:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer and converted to a 16-bit format.
+
+*Size:* depends on GFX8 GPU revision and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* 4 dwords for GFX8.0. This H/W supports no packing.
+* 2 dwords for GFX8.1+. This H/W supports data packing.
+* :ref:`tfe<amdgpu_synid_tfe>` adds one dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_d16_32:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer and converted to a 16-bit format.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_d16_64:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer and converted to a 16-bit format.
+
+*Size:* depends on GFX8 GPU revision and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* 2 dwords for GFX8.0. This H/W supports no packing.
+* 1 dword for GFX8.1+. This H/W supports data packing.
+* :ref:`tfe<amdgpu_synid_tfe>` adds one dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_96.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_96.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_96.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_d16_96.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_d16_96:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer and converted to a 16-bit format.
+
+*Size:* depends on GFX8 GPU revision and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* 3 dwords for GFX8.0. This H/W supports no packing.
+* 2 dwords for GFX8.1+. This H/W supports data packing.
+* :ref:`tfe<amdgpu_synid_tfe>` adds one dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_lds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_lds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_lds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_buf_lds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_buf_lds:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+If :ref:`lds<amdgpu_synid_lds>` is specified, this operand is ignored by H/W and data are stored directly into LDS.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+    Note that :ref:`tfe<amdgpu_synid_tfe>` and :ref:`lds<amdgpu_synid_lds>` cannot be used together.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_flat_atomic32:
+
+vdst
+===========================
+
+Data returned by a 32-bit atomic flat instruction.
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_flat_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_flat_atomic64:
+
+vdst
+===========================
+
+Data returned by a 64-bit atomic flat instruction.
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_gather4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_gather4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_gather4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_gather4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,26 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_mimg_gather4:
+
+vdst
+===========================
+
+Image data to load by an *image_gather4* instruction.
+
+*Size:* 4 data elements by default. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
+
+:ref:`d16<amdgpu_synid_d16>` and :ref:`tfe<amdgpu_synid_tfe>` affect operand size as follows:
+
+* :ref:`d16<amdgpu_synid_d16>` has different meaning for GFX8.0 and GFX8.1:
+
+  * For GFX8.0 this modifier does not affect size of data elements in registers. Data in registers are stored in low 16 bits, high 16 bits are unused. There is no packing.
+  * Starting from GFX8.1 this modifier specifies that data elements in registers are packed; each value occupies 16 bits.
+
+* :ref:`tfe<amdgpu_synid_tfe>` adds one dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,20 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_mimg_regular:
+
+vdst
+===========================
+
+Image data to load by an image instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular_d16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular_d16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular_d16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_dst_mimg_regular_d16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,26 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_dst_mimg_regular_d16:
+
+vdst
+===========================
+
+Image data to load by an image instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>`, :ref:`tfe<amdgpu_synid_tfe>` and :ref:`d16<amdgpu_synid_d16>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
+* :ref:`d16<amdgpu_synid_d16>` has different meaning for GFX8.0 and GFX8.1:
+
+  * For GFX8.0 this modifier does not affect size of data elements in registers. Data in registers are stored in low 16 bits, high 16 bits are unused. There is no packing.
+  * Starting from GFX8.1 this modifier specifies that data elements in registers are packed; each value occupies 16 bits.
+
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_fimm16:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_fimm32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_fimm32:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_hwreg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_hwreg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_hwreg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_hwreg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,60 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_hwreg:
+
+hwreg
+===========================
+
+Bits of a hardware register being accessed.
+
+The bits of this operand have the following meaning:
+
+    ============ ===================================
+    Bits         Description
+    ============ ===================================
+    5:0          Register *id*.
+    10:6         First bit *offset* (0..31).
+    15:11        *Size* in bits (1..32).
+    ============ ===================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
+
+    ==================================== ============================================================================
+    Syntax                               Description
+    ==================================== ============================================================================
+    hwreg({0..63})                       All bits of a register indicated by its *id*.
+    hwreg(<*name*>)                      All bits of a register indicated by its *name*.
+    hwreg({0..63}, {0..31}, {1..32})     Register bits indicated by register *id*, first bit *offset* and *size*.
+    hwreg(<*name*>, {0..31}, {1..32})    Register bits indicated by register *name*, first bit *offset* and *size*.
+    ==================================== ============================================================================
+
+Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+Defined register *names* include:
+
+    =================== ==========================================
+    Name                Description
+    =================== ==========================================
+    HW_REG_MODE         Shader writeable mode bits.
+    HW_REG_STATUS       Shader read-only status.
+    HW_REG_TRAPSTS      Trap status.
+    HW_REG_HW_ID        Id of wave, simd, compute unit, etc.
+    HW_REG_GPR_ALLOC    Per-wave SGPR and VGPR allocation.
+    HW_REG_LDS_ALLOC    Per-wave LDS allocation.
+    HW_REG_IB_STS       Counters of outstanding instructions.
+    =================== ==========================================
+
+Examples:
+
+.. parsed-literal::
+
+    s_getreg_b32 s2, 0x6
+    s_getreg_b32 s2, hwreg(15)
+    s_getreg_b32 s2, hwreg(51, 1, 31)
+    s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_imm4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_imm4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_imm4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_imm4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,25 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_imm4:
+
+imm4
+===========================
+
+A positive :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 4 bits.
+
+This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it.
+
+    ============ ========================================
+    Bit          Meaning
+    ============ ========================================
+    0            Enables or disables *src0* indexing.
+    1            Enables or disables *src1* indexing.
+    2            Enables or disables *src2* indexing.
+    3            Enables or disables *dst* indexing.
+    ============ ========================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_label.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_label.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_label.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_label.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,30 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_label:
+
+label
+===========================
+
+A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
+
+This operand may be specified as:
+
+* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
+* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
+* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
+
+Examples:
+
+.. parsed-literal::
+
+  offset = 30
+  s_branch loop_end
+  s_branch 2 + offset
+  s_branch 32
+  loop_end:
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_dpp_sdwa_abs_neg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_dpp_sdwa_abs_neg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_dpp_sdwa_abs_neg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_dpp_sdwa_abs_neg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_mod_dpp_sdwa_abs_neg:
+
+m
+===========================
+
+This operand may be used with floating point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_sdwa_sext.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_sdwa_sext.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_sdwa_sext.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_sdwa_sext.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_mod_sdwa_sext:
+
+m
+===========================
+
+This operand may be used with integer operand modifier :ref:`sext<amdgpu_synid_sext>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_vop3_abs_neg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_vop3_abs_neg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_vop3_abs_neg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_mod_vop3_abs_neg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_mod_vop3_abs_neg:
+
+m
+===========================
+
+This operand may be used with floating point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_msg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_msg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_msg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_msg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,72 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_msg:
+
+msg
+===========================
+
+A 16-bit message code. The bits of this operand have the following meaning:
+
+    ============ ======================================================
+    Bits         Description
+    ============ ======================================================
+    3:0          Message *type*.
+    6:4          Optional *operation*.
+    9:7          Optional *parameters*.
+    15:10        Unused.
+    ============ ======================================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
+
+    ======================================== ========================================================================
+    Syntax                                   Description
+    ======================================== ========================================================================
+    sendmsg(<*type*>)                        A message identified by its *type*.
+    sendmsg(<*type*>, <*op*>)                A message identified by its *type* and *operation*.
+    sendmsg(<*type*>, <*op*>, <*stream*>)    A message identified by its *type* and *operation* with a stream *id*.
+    ======================================== ========================================================================
+
+*Type* may be specified using message *name* or message *id*.
+
+*Op* may be specified using operation *name* or operation *id*.
+
+Stream *id* is an integer in the range 0..3.
+
+Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+Each message type supports specific operations:
+
+    ================= ========== ============================== ============ ==========
+    Message name      Message Id Supported Operations           Operation Id Stream Id
+    ================= ========== ============================== ============ ==========
+    MSG_INTERRUPT     1          \-                             \-           \-
+    MSG_GS            2          GS_OP_CUT                      1            Optional
+    \                            GS_OP_EMIT                     2            Optional
+    \                            GS_OP_EMIT_CUT                 3            Optional
+    MSG_GS_DONE       3          GS_OP_NOP                      0            \-
+    \                            GS_OP_CUT                      1            Optional
+    \                            GS_OP_EMIT                     2            Optional
+    \                            GS_OP_EMIT_CUT                 3            Optional
+    MSG_SYSMSG        15         SYSMSG_OP_ECC_ERR_INTERRUPT    1            \-
+    \                            SYSMSG_OP_REG_RD               2            \-
+    \                            SYSMSG_OP_HOST_TRAP_ACK        3            \-
+    \                            SYSMSG_OP_TTRACE_PC            4            \-
+    ================= ========== ============================== ============ ==========
+
+Examples:
+
+.. parsed-literal::
+
+    s_sendmsg 0x12
+    s_sendmsg sendmsg(MSG_INTERRUPT)
+    s_sendmsg sendmsg(2, GS_OP_CUT)
+    s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
+    s_sendmsg sendmsg(MSG_GS, 2)
+    s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
+    s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_offset_buf:
+
+soffset
+===========================
+
+An unsigned byte offset.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_load.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_load.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_load.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_load.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_offset_smem_load:
+
+soffset
+===========================
+
+An unsigned byte offset added to the base address to get memory address.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`uimm20<amdgpu_synid_uimm20>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_store.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_store.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_store.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_offset_smem_store.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_offset_smem_store:
+
+soffset
+===========================
+
+An unsigned byte offset added to the base address to get memory address.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`m0<amdgpu_synid_m0>`, :ref:`uimm20<amdgpu_synid_uimm20>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_opt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_opt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_opt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_opt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_opt:
+
+opt
+===========================
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_param.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_param.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_param.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_param.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_param:
+
+param
+===========================
+
+Interpolation parameter to read:
+
+    ============ ===================================
+    Syntax       Description
+    ============ ===================================
+    p0           Parameter *P0*.
+    p10          Parameter *P10*.
+    p20          Parameter *P20*.
+    ============ ===================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_perm_smem.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_perm_smem.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_perm_smem.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_perm_smem.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,24 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_perm_smem:
+
+imm3
+===========================
+
+A bit mask which indicates request permissions.
+
+This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant.
+
+    ============ ==============================
+    Bit Number   Description
+    ============ ==============================
+    0            Request *read* permission.
+    1            Request *write* permission.
+    2            Request *execute* permission.
+    ============ ==============================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ret.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ret.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ret.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ret.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ret:
+
+dst
+===========================
+
+This is an input operand. It may optionally serve as a destination if :ref:`glc<amdgpu_synid_glc>` is specified.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_rsrc_buf:
+
+srsrc
+===========================
+
+Buffer resource constant which defines the address and characteristics of the buffer in memory.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_rsrc_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_rsrc_mimg:
+
+srsrc
+===========================
+
+Image resource constant which defines the location of the image buffer in memory, its dimensions, tiling, and data format.
+
+*Size:* 8 dwords by default, 4 dwords if :ref:`r128<amdgpu_synid_r128>` is specified.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_samp_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_samp_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_samp_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_samp_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_samp_mimg:
+
+ssamp
+===========================
+
+Sampler constant used to specify filtering options applied to the image data after it is read.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdata128_0:
+
+sdata
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdata32_0:
+
+sdata
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdata64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdata64_0:
+
+sdata
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst128_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst256_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst256_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst256_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst256_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst256_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 8 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst32_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst32_1:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst32_2:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst512_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst512_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst512_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst512_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst512_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 16 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst64_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_sdst64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_sdst64_1:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_simm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_simm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_simm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_simm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_simm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src32_0:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src32_1:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src32_2:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src32_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src32_3:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src64_0:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src64_1:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src_exp.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src_exp.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src_exp.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_src_exp.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,28 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_src_exp:
+
+vsrc
+===========================
+
+Data to copy to export buffers. This is an optional operand. Must be specified as :ref:`off<amdgpu_synid_off>` if not used.
+
+:ref:`compr<amdgpu_synid_compr>` modifier indicates use of compressed (16-bit) data. This limits number of source operands from 4 to 2:
+
+* src0 and src1 must specify the first register (or :ref:`off<amdgpu_synid_off>`).
+* src2 and src3 must specify the second register (or :ref:`off<amdgpu_synid_off>`).
+
+An example:
+
+.. parsed-literal::
+
+  exp mrtz v3, v3, off, off compr
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc32_0:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc32_1:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc32_2:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc32_3:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`iconst<amdgpu_synid_iconst>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc32_4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc32_4:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc64_0:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc64_1:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc64_2:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_ssrc64_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_ssrc64_3:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`trap<amdgpu_synid_trap>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_tgt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_tgt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_tgt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_tgt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,24 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_tgt:
+
+tgt
+===========================
+
+An export target:
+
+    ============== ===================================
+    Syntax         Description
+    ============== ===================================
+    pos{0..3}      Copy vertex position 0..3.
+    param{0..31}   Copy vertex parameter 0..31.
+    mrt{0..7}      Copy pixel color to the MRTs 0..7.
+    mrtz           Copy pixel depth (Z) data.
+    null           Copy nothing.
+    ============== ===================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_type_dev.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_type_dev.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_type_dev.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_type_dev.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_type_dev:
+
+Type deviation
+===========================
+
+*Type* of this operand differs from *type* :ref:`implied by the opcode<amdgpu_syn_instruction_type>`. This tag specifies actual operand *type*.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_uimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_uimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_uimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_uimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_uimm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vcc_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vcc_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vcc_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vcc_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vcc_64:
+
+vcc
+===========================
+
+Vector condition code.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`vcc<amdgpu_synid_vcc>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdata128_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdata32_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdata64_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata96_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata96_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata96_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdata96_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdata96_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 3 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdst128_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdst32_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdst64_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst96_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst96_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst96_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vdst96_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vdst96_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 3 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vsrc128_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vsrc32_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vsrc32_1:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_vsrc64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_vsrc64_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_waitcnt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_waitcnt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_waitcnt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx8_waitcnt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,55 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid8_waitcnt:
+
+waitcnt
+===========================
+
+Counts of outstanding instructions to wait for.
+
+The bits of this operand have the following meaning:
+
+    ============ ======================================================
+    Bits         Description
+    ============ ======================================================
+    3:0          VM_CNT: vector memory operations count.
+    6:4          EXP_CNT: export count.
+    11:8         LGKM_CNT: LDS, GDS, Constant and Message count.
+    ============ ======================================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
+or as a combination of the following symbolic helpers:
+
+    ====================== ======================================================================
+    Syntax                 Description
+    ====================== ======================================================================
+    vmcnt(<*N*>)           VM_CNT value. *N* must not exceed the largest VM_CNT value.
+    expcnt(<*N*>)          EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
+    lgkmcnt(<*N*>)         LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
+    vmcnt_sat(<*N*>)       VM_CNT value computed as min(*N*, the largest VM_CNT value).
+    expcnt_sat(<*N*>)      EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
+    lgkmcnt_sat(<*N*>)     LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
+    ====================== ======================================================================
+
+These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
+
+*N* is either an
+:ref:`integer number<amdgpu_synid_integer_number>` or an
+:ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+    s_waitcnt 0
+    s_waitcnt vmcnt(1)
+    s_waitcnt expcnt(2) lgkmcnt(3)
+    s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
+    s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
+    s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_addr_buf:
+
+vaddr
+===========================
+
+This is an optional operand which may specify offset and/or index.
+
+*Size:* 0, 1 or 2 dwords. Size is controlled by modifiers :ref:`offen<amdgpu_synid_offen>` and :ref:`idxen<amdgpu_synid_idxen>`:
+
+* If only :ref:`idxen<amdgpu_synid_idxen>` is specified, this operand supplies an index. Size is 1 dword.
+* If only :ref:`offen<amdgpu_synid_offen>` is specified, this operand supplies an offset. Size is 1 dword.
+* If both modifiers are specified, index is in the first register and offset is in the second. Size is 2 dwords.
+* If none of these modifiers are specified, this operand must be set to :ref:`off<amdgpu_synid_off>`.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_ds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_ds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_ds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_ds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_addr_ds:
+
+vaddr
+===========================
+
+An offset from the start of GDS/LDS memory.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_flat.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_flat.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_flat.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_flat.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_addr_flat:
+
+vaddr
+===========================
+
+A 64-bit flat address.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_addr_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_addr_mimg:
+
+vaddr
+===========================
+
+Image address which includes from one to four dimensional coordinates and other data used to locate a position in the image.
+
+*Size:* 1, 2, 3, 4, 8 or 16 dwords. Actual size depends on opcode, specific image being handled and :ref:`a16<amdgpu_synid_a16>`.
+
+  Note 1. Image format and dimensions are encoded in the image resource constant but not in the instruction.
+
+  Note 2. Actually image address size may vary from 1 to 13 dwords, but assembler currently supports a limited range of register sequences.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_attr.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_attr.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_attr.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_attr.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,30 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_attr:
+
+attr
+===========================
+
+Interpolation attribute and channel:
+
+    ============== ===================================
+    Syntax         Description
+    ============== ===================================
+    attr{0..32}.x  Attribute 0..32 with *x* channel.
+    attr{0..32}.y  Attribute 0..32 with *y* channel.
+    attr{0..32}.z  Attribute 0..32 with *z* channel.
+    attr{0..32}.w  Attribute 0..32 with *w* channel.
+    ============== ===================================
+
+Examples:
+
+.. parsed-literal::
+
+    v_interp_p1_f32 v1, v0, attr0.x
+    v_interp_p1_f32 v1, v0, attr32.w
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_addr.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_addr.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_addr.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_addr.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_base_smem_addr:
+
+sbase
+===========================
+
+A 64-bit base address for scalar memory operations.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_base_smem_buf:
+
+sbase
+===========================
+
+A 128-bit buffer resource constant for scalar memory operations which provides a base address, a size and a stride.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_scratch.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_scratch.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_scratch.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_base_smem_scratch.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_base_smem_scratch:
+
+sbase
+===========================
+
+This operand is ignored by H/W and :ref:`flat_scratch<amdgpu_synid_flat_scratch>` is supplied instead.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_bimm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_bimm32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_bimm32:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_buf_atomic128:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_buf_atomic32:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_buf_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_buf_atomic64:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_cmp.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_cmp.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_cmp.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_cmp.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,27 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_mimg_atomic_cmp:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+  Note. The surface data format is indicated in the image resource constant but not in the instruction.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_reg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_reg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_reg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_atomic_reg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,26 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_mimg_atomic_reg:
+
+vdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+  Note. The surface data format is indicated in the image resource constant but not in the instruction.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,18 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_mimg_store:
+
+vdata
+===========================
+
+Image data to store by an *image_store* instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` which may specify from 1 to 4 data elements. Each data element occupies 1 dword.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store_d16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store_d16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store_d16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_mimg_store_d16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_mimg_store_d16:
+
+vdata
+===========================
+
+Image data to store by an *image_store* instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`d16<amdgpu_synid_d16>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
+* :ref:`d16<amdgpu_synid_d16>` specifies that data in registers are packed; each value occupies 16 bits.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_smem_atomic128:
+
+sdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_smem_atomic32:
+
+sdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_data_smem_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_data_smem_atomic64:
+
+sdata
+===========================
+
+Input data for an atomic instruction.
+
+Optionally may serve as an output data:
+
+* If :ref:`glc<amdgpu_synid_glc>` is specified, gets the memory value before the operation.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_128.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_128.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_128.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_128.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_buf_128:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 4 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_buf_32:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_buf_64:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 2 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_96.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_96.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_96.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_96.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_buf_96:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+*Size:* 3 dwords by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_lds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_lds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_lds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_buf_lds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,21 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_buf_lds:
+
+vdst
+===========================
+
+Instruction output: data read from a memory buffer.
+
+If :ref:`lds<amdgpu_synid_lds>` is specified, this operand is ignored by H/W and data are stored directly into LDS.
+
+*Size:* 1 dword by default. :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+    Note that :ref:`tfe<amdgpu_synid_tfe>` and :ref:`lds<amdgpu_synid_lds>` cannot be used together.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_flat_atomic32:
+
+vdst
+===========================
+
+Data returned by a 32-bit atomic flat instruction.
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_flat_atomic64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_flat_atomic64:
+
+vdst
+===========================
+
+Data returned by a 64-bit atomic flat instruction.
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_gather4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_gather4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_gather4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_gather4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_mimg_gather4:
+
+vdst
+===========================
+
+Image data to load by an *image_gather4* instruction.
+
+*Size:* 4 data elements by default. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
+
+:ref:`d16<amdgpu_synid_d16>` and :ref:`tfe<amdgpu_synid_tfe>` affect operand size as follows:
+
+* :ref:`d16<amdgpu_synid_d16>` specifies that data elements in registers are packed; each value occupies 16 bits.
+* :ref:`tfe<amdgpu_synid_tfe>` adds one dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,20 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_mimg_regular:
+
+vdst
+===========================
+
+Image data to load by an image instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>` and :ref:`tfe<amdgpu_synid_tfe>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies 1 dword.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular_d16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular_d16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular_d16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_dst_mimg_regular_d16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_dst_mimg_regular_d16:
+
+vdst
+===========================
+
+Image data to load by an image instruction.
+
+*Size:* depends on :ref:`dmask<amdgpu_synid_dmask>`, :ref:`tfe<amdgpu_synid_tfe>` and :ref:`d16<amdgpu_synid_d16>`:
+
+* :ref:`dmask<amdgpu_synid_dmask>` may specify from 1 to 4 data elements. Each data element occupies either 32 bits or 16 bits depending on :ref:`d16<amdgpu_synid_d16>`.
+* :ref:`d16<amdgpu_synid_d16>` specifies that data elements in registers are packed; each value occupies 16 bits.
+* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
+
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_fimm16:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm32.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm32.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm32.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_fimm32.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_fimm32:
+
+imm32
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_hwreg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_hwreg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_hwreg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_hwreg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,61 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_hwreg:
+
+hwreg
+===========================
+
+Bits of a hardware register being accessed.
+
+The bits of this operand have the following meaning:
+
+    ============ ===================================
+    Bits         Description
+    ============ ===================================
+    5:0          Register *id*.
+    10:6         First bit *offset* (0..31).
+    15:11        *Size* in bits (1..32).
+    ============ ===================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
+
+    ==================================== ============================================================================
+    Syntax                               Description
+    ==================================== ============================================================================
+    hwreg({0..63})                       All bits of a register indicated by its *id*.
+    hwreg(<*name*>)                      All bits of a register indicated by its *name*.
+    hwreg({0..63}, {0..31}, {1..32})     Register bits indicated by register *id*, first bit *offset* and *size*.
+    hwreg(<*name*>, {0..31}, {1..32})    Register bits indicated by register *name*, first bit *offset* and *size*.
+    ==================================== ============================================================================
+
+Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+Defined register *names* include:
+
+    =================== ==========================================
+    Name                Description
+    =================== ==========================================
+    HW_REG_MODE         Shader writeable mode bits.
+    HW_REG_STATUS       Shader read-only status.
+    HW_REG_TRAPSTS      Trap status.
+    HW_REG_HW_ID        Id of wave, simd, compute unit, etc.
+    HW_REG_GPR_ALLOC    Per-wave SGPR and VGPR allocation.
+    HW_REG_LDS_ALLOC    Per-wave LDS allocation.
+    HW_REG_IB_STS       Counters of outstanding instructions.
+    HW_REG_SH_MEM_BASES Memory aperture.
+    =================== ==========================================
+
+Examples:
+
+.. parsed-literal::
+
+    s_getreg_b32 s2, 0x6
+    s_getreg_b32 s2, hwreg(15)
+    s_getreg_b32 s2, hwreg(51, 1, 31)
+    s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_imm4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_imm4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_imm4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_imm4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,25 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_imm4:
+
+imm4
+===========================
+
+A positive :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 4 bits.
+
+This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it.
+
+    ============ ========================================
+    Bit          Meaning
+    ============ ========================================
+    0            Enables or disables *src0* indexing.
+    1            Enables or disables *src1* indexing.
+    2            Enables or disables *src2* indexing.
+    3            Enables or disables *dst* indexing.
+    ============ ========================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_label.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_label.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_label.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_label.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,30 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_label:
+
+label
+===========================
+
+A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
+
+This operand may be specified as:
+
+* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
+* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
+* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
+
+Examples:
+
+.. parsed-literal::
+
+  offset = 30
+  s_branch loop_end
+  s_branch 2 + offset
+  s_branch 32
+  loop_end:
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mad_type_dev.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mad_type_dev.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mad_type_dev.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mad_type_dev.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_mad_type_dev:
+
+fx
+===========================
+
+This is an *f32* or *f16* operand depending on instruction modifiers:
+
+* Operand size is controlled by :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
+* Location of 16-bit operand is controlled by :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_dpp_sdwa_abs_neg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_dpp_sdwa_abs_neg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_dpp_sdwa_abs_neg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_dpp_sdwa_abs_neg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_mod_dpp_sdwa_abs_neg:
+
+m
+===========================
+
+This operand may be used with floating point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_sdwa_sext.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_sdwa_sext.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_sdwa_sext.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_sdwa_sext.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_mod_sdwa_sext:
+
+m
+===========================
+
+This operand may be used with integer operand modifier :ref:`sext<amdgpu_synid_sext>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_vop3_abs_neg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_vop3_abs_neg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_vop3_abs_neg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_mod_vop3_abs_neg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_mod_vop3_abs_neg:
+
+m
+===========================
+
+This operand may be used with floating point operand modifiers :ref:`abs<amdgpu_synid_abs>` and :ref:`neg<amdgpu_synid_neg>`.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_msg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_msg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_msg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_msg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,75 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_msg:
+
+msg
+===========================
+
+A 16-bit message code. The bits of this operand have the following meaning:
+
+    ============ ======================================================
+    Bits         Description
+    ============ ======================================================
+    3:0          Message *type*.
+    6:4          Optional *operation*.
+    9:7          Optional *parameters*.
+    15:10        Unused.
+    ============ ======================================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
+
+    ======================================== ========================================================================
+    Syntax                                   Description
+    ======================================== ========================================================================
+    sendmsg(<*type*>)                        A message identified by its *type*.
+    sendmsg(<*type*>, <*op*>)                A message identified by its *type* and *operation*.
+    sendmsg(<*type*>, <*op*>, <*stream*>)    A message identified by its *type* and *operation* with a stream *id*.
+    ======================================== ========================================================================
+
+*Type* may be specified using message *name* or message *id*.
+
+*Op* may be specified using operation *name* or operation *id*.
+
+Stream *id* is an integer in the range 0..3.
+
+Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+Each message type supports specific operations:
+
+    ================= ========== ============================== ============ ==========
+    Message name      Message Id Supported Operations           Operation Id Stream Id
+    ================= ========== ============================== ============ ==========
+    MSG_INTERRUPT     1          \-                             \-           \-
+    MSG_GS            2          GS_OP_CUT                      1            Optional
+    \                            GS_OP_EMIT                     2            Optional
+    \                            GS_OP_EMIT_CUT                 3            Optional
+    MSG_GS_DONE       3          GS_OP_NOP                      0            \-
+    \                            GS_OP_CUT                      1            Optional
+    \                            GS_OP_EMIT                     2            Optional
+    \                            GS_OP_EMIT_CUT                 3            Optional
+    MSG_GS_ALLOC_REQ  9          \-                             \-           \-
+    MSG_GET_DOORBELL  10         \-                             \-           \-
+    MSG_SYSMSG        15         SYSMSG_OP_ECC_ERR_INTERRUPT    1            \-
+    \                            SYSMSG_OP_REG_RD               2            \-
+    \                            SYSMSG_OP_HOST_TRAP_ACK        3            \-
+    \                            SYSMSG_OP_TTRACE_PC            4            \-
+    ================= ========== ============================== ============ ==========
+
+Examples:
+
+.. parsed-literal::
+
+    s_sendmsg 0x12
+    s_sendmsg sendmsg(MSG_INTERRUPT)
+    s_sendmsg sendmsg(MSG_GET_DOORBELL)
+    s_sendmsg sendmsg(2, GS_OP_CUT)
+    s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
+    s_sendmsg sendmsg(MSG_GS, 2)
+    s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
+    s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_offset_buf:
+
+soffset
+===========================
+
+An unsigned byte offset.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_offset_smem_buf:
+
+soffset
+===========================
+
+An unsigned byte offset added to the base address to get memory address.
+
+.. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` instead of :ref:`uimm21<amdgpu_synid_uimm21>`.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`uimm21<amdgpu_synid_uimm21>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_plain.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_plain.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_plain.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_offset_smem_plain.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_offset_smem_plain:
+
+soffset
+===========================
+
+An offset added to the base address to get memory address.
+
+* If offset is specified as a register, it supplies an unsigned byte offset.
+* If offset is specified as a 21-bit immediate, it supplies a signed byte offset.
+
+.. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` instead of :ref:`simm21<amdgpu_synid_simm21>`.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`simm21<amdgpu_synid_simm21>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_opt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_opt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_opt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_opt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_opt:
+
+opt
+===========================
+
+This is an optional operand. It must be used if and only if :ref:`glc<amdgpu_synid_glc>` is specified.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_param.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_param.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_param.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_param.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_param:
+
+param
+===========================
+
+Interpolation parameter to read:
+
+    ============ ===================================
+    Syntax       Description
+    ============ ===================================
+    p0           Parameter *P0*.
+    p10          Parameter *P10*.
+    p20          Parameter *P20*.
+    ============ ===================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_perm_smem.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_perm_smem.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_perm_smem.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_perm_smem.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,24 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_perm_smem:
+
+imm3
+===========================
+
+A bit mask which indicates request permissions.
+
+This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant.
+
+    ============ ==============================
+    Bit Number   Description
+    ============ ==============================
+    0            Request *read* permission.
+    1            Request *write* permission.
+    2            Request *execute* permission.
+    ============ ==============================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ret.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ret.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ret.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ret.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ret:
+
+dst
+===========================
+
+This is an input operand. It may optionally serve as a destination if :ref:`glc<amdgpu_synid_glc>` is specified.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_buf.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_buf.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_buf.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_buf.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_rsrc_buf:
+
+srsrc
+===========================
+
+Buffer resource constant which defines the address and characteristics of the buffer in memory.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_rsrc_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_rsrc_mimg:
+
+srsrc
+===========================
+
+Image resource constant which defines the location of the image buffer in memory, its dimensions, tiling, and data format.
+
+*Size:* 8 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_global.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_global.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_global.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_global.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_saddr_flat_global:
+
+saddr
+===========================
+
+An optional 64-bit flat global address. Must be specified as :ref:`off<amdgpu_synid_off>` if not used.
+
+See :ref:`vaddr<amdgpu_synid9_vaddr_flat_global>` for description of available addressing modes.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_scratch.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_scratch.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_scratch.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_saddr_flat_scratch.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_saddr_flat_scratch:
+
+saddr
+===========================
+
+An optional 32-bit flat scratch offset. Must be specified as :ref:`off<amdgpu_synid_off>` if not used.
+
+Either this operand or :ref:`vaddr<amdgpu_synid9_vaddr_flat_scratch>` must be set to :ref:`off<amdgpu_synid_off>`.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_samp_mimg.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_samp_mimg.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_samp_mimg.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_samp_mimg.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_samp_mimg:
+
+ssamp
+===========================
+
+Sampler constant used to specify filtering options applied to the image data after it is read.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdata128_0:
+
+sdata
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdata32_0:
+
+sdata
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdata64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdata64_0:
+
+sdata
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst128_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst256_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst256_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst256_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst256_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst256_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 8 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst32_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst32_1:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst32_2:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst512_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst512_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst512_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst512_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst512_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 16 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst64_0:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_sdst64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_sdst64_1:
+
+sdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_simm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_simm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_simm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_simm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_simm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src32_0:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src32_1:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src32_2:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src32_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src32_3:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src64_0:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src64_1:
+
+src
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src_exp.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src_exp.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src_exp.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_src_exp.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,28 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_src_exp:
+
+vsrc
+===========================
+
+Data to copy to export buffers. This is an optional operand. Must be specified as :ref:`off<amdgpu_synid_off>` if not used.
+
+:ref:`compr<amdgpu_synid_compr>` modifier indicates use of compressed (16-bit) data. This limits number of source operands from 4 to 2:
+
+* src0 and src1 must specify the first register (or :ref:`off<amdgpu_synid_off>`).
+* src2 and src3 must specify the second register (or :ref:`off<amdgpu_synid_off>`).
+
+An example:
+
+.. parsed-literal::
+
+  exp mrtz v3, v3, off, off compr
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc32_0:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc32_1:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc32_2:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc32_3:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`iconst<amdgpu_synid_iconst>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc32_4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc32_4:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`m0<amdgpu_synid_m0>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc64_0:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`, :ref:`literal<amdgpu_synid_literal>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc64_1:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc64_2:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`, :ref:`vccz<amdgpu_synid_vccz>`, :ref:`execz<amdgpu_synid_execz>`, :ref:`scc<amdgpu_synid_scc>`, :ref:`constant<amdgpu_synid_constant>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_ssrc64_3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_ssrc64_3:
+
+ssrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`s<amdgpu_synid_s>`, :ref:`flat_scratch<amdgpu_synid_flat_scratch>`, :ref:`xnack<amdgpu_synid_xnack>`, :ref:`vcc<amdgpu_synid_vcc>`, :ref:`ttmp<amdgpu_synid_ttmp>`, :ref:`exec<amdgpu_synid_exec>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_tgt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_tgt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_tgt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_tgt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,24 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_tgt:
+
+tgt
+===========================
+
+An export target:
+
+    ============== ===================================
+    Syntax         Description
+    ============== ===================================
+    pos{0..3}      Copy vertex position 0..3.
+    param{0..31}   Copy vertex parameter 0..31.
+    mrt{0..7}      Copy pixel color to the MRTs 0..7.
+    mrtz           Copy pixel depth (Z) data.
+    null           Copy nothing.
+    ============== ===================================
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_type_dev.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_type_dev.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_type_dev.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_type_dev.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_type_dev:
+
+Type deviation
+===========================
+
+*Type* of this operand differs from *type* :ref:`implied by the opcode<amdgpu_syn_instruction_type>`. This tag specifies actual operand *type*.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_uimm16.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_uimm16.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_uimm16.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_uimm16.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,14 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_uimm16:
+
+imm16
+===========================
+
+An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_global.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_global.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_global.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_global.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,22 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vaddr_flat_global:
+
+vaddr
+===========================
+
+A 64-bit flat global address or a 32-bit offset depending on addressing mode:
+
+* Address = :ref:`vaddr<amdgpu_synid9_vaddr_flat_global>` + :ref:`offset13s<amdgpu_synid_flat_offset13s>`. :ref:`vaddr<amdgpu_synid9_vaddr_flat_global>` is a 64-bit address. This mode is indicated by :ref:`saddr<amdgpu_synid9_saddr_flat_global>` set to :ref:`off<amdgpu_synid_off>`.
+* Address = :ref:`saddr<amdgpu_synid9_saddr_flat_global>` + :ref:`vaddr<amdgpu_synid9_vaddr_flat_global>` + :ref:`offset13s<amdgpu_synid_flat_offset13s>`. :ref:`vaddr<amdgpu_synid9_vaddr_flat_global>` is a 32-bit offset. This mode is used when :ref:`saddr<amdgpu_synid9_saddr_flat_global>` is not :ref:`off<amdgpu_synid_off>`.
+
+.. WARNING:: Assembler currently expects a 64-bit *vaddr* regardless of addressing mode. This have to be fixed.
+
+*Size:* 1 or 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_scratch.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_scratch.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_scratch.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vaddr_flat_scratch.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,19 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vaddr_flat_scratch:
+
+vaddr
+===========================
+
+An optional 32-bit flat scratch offset. Must be specified as :ref:`off<amdgpu_synid_off>` if not used.
+
+Either this operand or :ref:`saddr<amdgpu_synid9_saddr_flat_scratch>` must be set to :ref:`off<amdgpu_synid_off>`.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`off<amdgpu_synid_off>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vcc_64.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vcc_64.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vcc_64.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vcc_64.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vcc_64:
+
+vcc
+===========================
+
+Vector condition code.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`vcc<amdgpu_synid_vcc>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdata128_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdata32_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdata64_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata96_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata96_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata96_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdata96_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdata96_0:
+
+vdata
+===========================
+
+Instruction input.
+
+*Size:* 3 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdst128_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdst32_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdst64_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst96_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst96_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst96_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vdst96_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vdst96_0:
+
+vdst
+===========================
+
+Instruction output.
+
+*Size:* 3 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc128_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc128_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc128_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc128_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vsrc128_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 4 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vsrc32_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc32_1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vsrc32_1:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 1 dword.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`, :ref:`lds_direct<amdgpu_synid_lds_direct>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc64_0.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc64_0.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc64_0.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_vsrc64_0.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,17 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_vsrc64_0:
+
+vsrc
+===========================
+
+Instruction input.
+
+*Size:* 2 dwords.
+
+*Operands:* :ref:`v<amdgpu_synid_v>`

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_waitcnt.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_waitcnt.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_waitcnt.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPU/gfx9_waitcnt.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,56 @@
+..
+    **************************************************
+    *                                                *
+    *   Automatically generated file, do not edit!   *
+    *                                                *
+    **************************************************
+
+.. _amdgpu_synid9_waitcnt:
+
+waitcnt
+===========================
+
+Counts of outstanding instructions to wait for.
+
+The bits of this operand have the following meaning:
+
+    ============ ======================================================
+    Bits         Description
+    ============ ======================================================
+    3:0          VM_CNT: vector memory operations count, lower bits.
+    6:4          EXP_CNT: export count.
+    11:8         LGKM_CNT: LDS, GDS, Constant and Message count.
+    15:14        VM_CNT: vector memory operations count, upper bits.
+    ============ ======================================================
+
+This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
+or as a combination of the following symbolic helpers:
+
+    ====================== ======================================================================
+    Syntax                 Description
+    ====================== ======================================================================
+    vmcnt(<*N*>)           VM_CNT value. *N* must not exceed the largest VM_CNT value.
+    expcnt(<*N*>)          EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
+    lgkmcnt(<*N*>)         LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
+    vmcnt_sat(<*N*>)       VM_CNT value computed as min(*N*, the largest VM_CNT value).
+    expcnt_sat(<*N*>)      EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
+    lgkmcnt_sat(<*N*>)     LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
+    ====================== ======================================================================
+
+These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
+
+*N* is either an
+:ref:`integer number<amdgpu_synid_integer_number>` or an
+:ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+    s_waitcnt 0
+    s_waitcnt vmcnt(1)
+    s_waitcnt expcnt(2) lgkmcnt(3)
+    s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
+    s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
+    s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionNotation.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionNotation.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionNotation.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionNotation.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,110 @@
+============================
+AMDGPU Instructions Notation
+============================
+
+.. contents::
+   :local:
+
+.. _amdgpu_syn_instruction_notation:
+
+Introduction
+============
+
+This is an overview of notation used to describe the syntax of AMDGPU assembler instructions.
+
+This notation mimics the :ref:`syntax of assembler instructions<amdgpu_syn_instructions>`
+except that instead of real operands and modifiers it provides references to their description.
+
+Instructions
+============
+
+Notation
+~~~~~~~~
+
+This is the notation used to describe AMDGPU instructions:
+
+    ``<``\ :ref:`opcode description<amdgpu_syn_opcode_notation>`\ ``>  <``\ :ref:`operands description<amdgpu_syn_instruction_operands_notation>`\ ``>  <``\ :ref:`modifiers description<amdgpu_syn_instruction_modifiers_notation>`\ ``>``
+
+.. _amdgpu_syn_opcode_notation:
+
+Opcode
+======
+
+Notation
+~~~~~~~~
+
+TBD
+
+.. _amdgpu_syn_instruction_operands_notation:
+
+Operands
+========
+
+An instruction may have zero or more *operands*. They are comma-separated in the description:
+
+    ``<``\ :ref:`description of operand 0<amdgpu_syn_instruction_operand_notation>`\ ``>, <``\ :ref:`description of operand 1<amdgpu_syn_instruction_operand_notation>`\ ``>, ...``
+
+The order of *operands* is fixed. *Operands* cannot be omitted
+except for special cases described below.
+
+.. _amdgpu_syn_instruction_operand_notation:
+
+Notation
+~~~~~~~~
+
+An operand is described using the following notation:
+
+    *<name><tag0><tag1>...*
+
+Where:
+
+* *name* is a link to a description of the operand.
+* *tags* are optional. They are used to indicate special operand properties:
+
+.. _amdgpu_syn_instruction_operand_tags:
+
+    ============== =================================================================================
+    Operand tag    Meaning
+    ============== =================================================================================
+    :opt           An optional operand.
+    :m             An operand which may be used with
+                   :ref:`VOP3 operand modifiers<amdgpu_synid_vop3_operand_modifiers>` or
+                   :ref:`SDWA operand modifiers<amdgpu_synid_sdwa_operand_modifiers>`.
+    :dst           An input operand which may also serve as a destination
+                   if :ref:`glc<amdgpu_synid_glc>` modifier is specified.
+    :fx            This is an *f32* or *f16* operand depending on
+                   :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>` modifier.
+    :<type>        Operand *type* differs from *type*
+                   :ref:`implied by the opcode name<amdgpu_syn_instruction_type>`.
+                   This tag specifies actual operand *type*.
+    ============== =================================================================================
+
+Examples:
+
+.. parsed-literal::
+
+    src1:m             // src1 operand may be used with operand modifiers
+    vdata:dst          // vdata operand may be used as both source and destination
+    vdst:u32           // vdst operand has u32 type
+
+.. _amdgpu_syn_instruction_modifiers_notation:
+
+Modifiers
+=========
+
+An instruction may have zero or more optional *modifiers*. They are space-separated in the description:
+
+    ``<``\ :ref:`description of modifier 0<amdgpu_syn_instruction_modifier_notation>`\ ``> <``\ :ref:`description of modifier 1<amdgpu_syn_instruction_modifier_notation>`\ ``> ...``
+
+The order of *modifiers* is fixed.
+
+.. _amdgpu_syn_instruction_modifier_notation:
+
+Notation
+~~~~~~~~
+
+A *modifier* is described using the following notation:
+
+    *<name>*
+
+Where *name* is a link to a description of the *modifier*.

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionSyntax.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionSyntax.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionSyntax.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPUInstructionSyntax.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,172 @@
+=========================
+AMDGPU Instruction Syntax
+=========================
+
+.. contents::
+   :local:
+
+.. _amdgpu_syn_instructions:
+
+Instructions
+============
+
+Syntax
+~~~~~~
+
+An instruction has the following syntax:
+
+    ``<``\ *opcode mnemonic*\ ``>    <``\ *operand0*\ ``>, <``\ *operand1*\ ``>,...    <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...``
+
+:doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated while
+:doc:`modifiers<AMDGPUModifierSyntax>` are space-separated.
+
+The order of *operands* and *modifiers* is fixed.
+Most *modifiers* are optional and may be omitted.
+
+.. _amdgpu_syn_instruction_mnemo:
+
+Opcode Mnemonic
+~~~~~~~~~~~~~~~
+
+Opcode mnemonic describes opcode semantics and may include one or more suffices in this order:
+
+* :ref:`Destination operand type suffix<amdgpu_syn_instruction_type>`.
+* :ref:`Source operand type suffix<amdgpu_syn_instruction_type>`.
+* :ref:`Encoding suffix<amdgpu_syn_instruction_enc>`.
+
+.. _amdgpu_syn_instruction_type:
+
+Type and Size Suffices
+~~~~~~~~~~~~~~~~~~~~~~
+
+Instructions which operate with data have an implied type of *data* operands.
+This data type is specified as a suffix of instruction mnemonic.
+
+There are instructions which have 2 type suffices:
+the first is the data type of the destination operand,
+the second is the data type of source *data* operand(s).
+
+Note that data type specified by an instruction does not apply
+to other kinds of operands such as *addresses*, *offsets* and so on.
+
+The following table enumerates the most frequently used type suffices.
+
+    ============================================ ======================= =================
+    Type Suffices                                Packed instruction?     Data Type
+    ============================================ ======================= =================
+    _b512, _b256, _b128, _b64, _b32, _b16, _b8   No                      Bits.
+    _u64, _u32, _u16, _u8                        No                      Unsigned integer.
+    _i64, _i32, _i16, _i8                        No                      Signed integer.
+    _f64, _f32, _f16                             No                      Floating-point.
+    _b16, _u16, _i16, _f16                       Yes                     Packed.
+    ============================================ ======================= =================
+
+Instructions which have no type suffices are assumed to operate with typeless data.
+The size of data is specified by size suffices:
+
+    ================= =================== =====================================
+    Size Suffix       Implied data type   Required register size in dwords
+    ================= =================== =====================================
+    \-                b32                 1
+    x2                b64                 2
+    x3                b96                 3
+    x4                b128                4
+    x8                b256                8
+    x16               b512                16
+    x                 b32                 1
+    xy                b64                 2
+    xyz               b96                 3
+    xyzw              b128                4
+    d16_x             b16                 1
+    d16_xy            b16x2               2 for GFX8.0, 1 for GFX8.1 and GFX9
+    d16_xyz           b16x3               3 for GFX8.0, 2 for GFX8.1 and GFX9
+    d16_xyzw          b16x4               4 for GFX8.0, 2 for GFX8.1 and GFX9
+    ================= =================== =====================================
+
+.. WARNING::
+    There are exceptions from rules described above.
+    Operands which have type different from type specified by the opcode are
+    :ref:`tagged<amdgpu_syn_instruction_operand_tags>` in the description.
+
+Examples of instructions with different types of source and destination operands:
+
+.. parsed-literal::
+
+    s_bcnt0_i32_b64
+    v_cvt_f32_u32
+
+Examples of instructions with one data type:
+
+.. parsed-literal::
+
+    v_max3_f32
+    v_max3_i16
+
+Examples of instructions which operate with packed data:
+
+.. parsed-literal::
+
+    v_pk_add_u16
+    v_pk_add_i16
+    v_pk_add_f16
+
+Examples of typeless instructions which operate on b128 data:
+
+.. parsed-literal::
+
+    buffer_store_dwordx4
+    flat_load_dwordx4
+
+.. _amdgpu_syn_instruction_enc:
+
+Encoding Suffices
+~~~~~~~~~~~~~~~~~
+
+Most *VOP1*, *VOP2* and *VOPC* instructions have several variants:
+they may also be encoded in *VOP3*, *DPP* and *SDWA* formats.
+
+The assembler will automatically use optimal encoding based on instruction operands.
+To force specific encoding, one can add a suffix to the opcode of the instruction:
+
+    =================================================== =================
+    Encoding                                            Encoding Suffix
+    =================================================== =================
+    Native 32-bit encoding (*VOP1*, *VOP2* or *VOPC*)   _e32
+    *VOP3* (64-bit) encoding                            _e64
+    *DPP* encoding                                      _dpp
+    *SDWA* encoding                                     _sdwa
+    =================================================== =================
+
+These suffices are used in this reference to indicate the assumed encoding.
+When no suffix is specified, a native encoding is implied.
+
+Operands
+========
+
+Syntax
+~~~~~~
+
+Syntax of most operands is described :doc:`in this document<AMDGPUOperandSyntax>`.
+
+For detailed information about operands follow *operand links* in GPU-specific documents:
+
+* :doc:`GFX7<AMDGPU/AMDGPUAsmGFX7>`
+* :doc:`GFX8<AMDGPU/AMDGPUAsmGFX8>`
+* :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>`
+* :doc:`GFX10<AMDGPU/AMDGPUAsmGFX10>`
+
+Modifiers
+=========
+
+Syntax
+~~~~~~
+
+Syntax of modifiers is described :doc:`in this document<AMDGPUModifierSyntax>`.
+
+Information about modifiers supported for individual instructions may be found in GPU-specific documents:
+
+* :doc:`GFX7<AMDGPU/AMDGPUAsmGFX7>`
+* :doc:`GFX8<AMDGPU/AMDGPUAsmGFX8>`
+* :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>`
+* :doc:`GFX10<AMDGPU/AMDGPUAsmGFX10>`
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPUModifierSyntax.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPUModifierSyntax.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPUModifierSyntax.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPUModifierSyntax.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1499 @@
+======================================
+Syntax of AMDGPU Instruction Modifiers
+======================================
+
+.. contents::
+   :local:
+
+Conventions
+===========
+
+The following notation is used throughout this document:
+
+    =================== =============================================================
+    Notation            Description
+    =================== =============================================================
+    {0..N}              Any integer value in the range from 0 to N (inclusive).
+    <x>                 Syntax and meaning of *x* is explained elsewhere.
+    =================== =============================================================
+
+.. _amdgpu_syn_modifiers:
+
+Modifiers
+=========
+
+DS Modifiers
+------------
+
+.. _amdgpu_synid_ds_offset8:
+
+offset8
+~~~~~~~
+
+Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
+
+Used with DS instructions which have 2 addresses.
+
+    =================== =====================================================
+    Syntax              Description
+    =================== =====================================================
+    offset:{0..0xFF}    Specifies an unsigned 8-bit offset as a positive
+                        :ref:`integer number <amdgpu_synid_integer_number>`.
+    =================== =====================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:255
+  offset:0xff
+
+.. _amdgpu_synid_ds_offset16:
+
+offset16
+~~~~~~~~
+
+Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
+
+Used with DS instructions which have 1 address.
+
+    ==================== ======================================================
+    Syntax               Description
+    ==================== ======================================================
+    offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a positive
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+    ==================== ======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:65535
+  offset:0xffff
+
+.. _amdgpu_synid_sw_offset16:
+
+swizzle pattern
+~~~~~~~~~~~~~~~
+
+This is a special modifier which may be used with *ds_swizzle_b32* instruction only.
+It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
+
+See AMD documentation for more information.
+
+    ======================================================= ===========================================================
+    Syntax                                                  Description
+    ======================================================= ===========================================================
+    offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
+    offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern
+
+                                                            Each number is a lane *id*.
+    offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.
+
+                                                            The pattern converts a 5-bit lane *id* to another
+                                                            lane *id* with which the lane interacts.
+
+                                                            *mask* is a 5 character sequence which
+                                                            specifies how to transform the bits of the
+                                                            lane *id*. 
+
+                                                            The following characters are allowed:
+
+                                                            * "0" - set bit to 0.
+
+                                                            * "1" - set bit to 1.
+
+                                                            * "p" - preserve bit.
+
+                                                            * "i" - inverse bit.
+
+    offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.
+
+                                                            Broadcasts the value of any particular lane to
+                                                            all lanes in its group.
+
+                                                            The first numeric parameter is a group
+                                                            size and must be equal to 2, 4, 8, 16 or 32.
+
+                                                            The second numeric parameter is an index of the
+                                                            lane being broadcasted. 
+
+                                                            The index must not exceed group size.
+    offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.
+
+                                                            Swaps the neighboring groups of
+                                                            1, 2, 4, 8 or 16 lanes.
+    offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.
+
+                                                            Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
+    ======================================================= ===========================================================
+
+Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+  offset:255
+  offset:0xffff
+  offset:swizzle(QUAD_PERM, 0, 1, 2 ,3)
+  offset:swizzle(BITMASK_PERM, "01pi0")
+  offset:swizzle(BROADCAST, 2, 0)
+  offset:swizzle(SWAP, 8)
+  offset:swizzle(REVERSE, 30 + 2)
+
+.. _amdgpu_synid_gds:
+
+gds
+~~~
+
+Specifies whether to use GDS or LDS memory (LDS is the default).
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    gds                                      Use GDS memory.
+    ======================================== ================================================
+
+
+EXP Modifiers
+-------------
+
+.. _amdgpu_synid_done:
+
+done
+~~~~
+
+Specifies if this is the last export from the shader to the target. By default,
+*exp* instruction does not finish an export sequence.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    done                                     Indicates the last export operation.
+    ======================================== ================================================
+
+.. _amdgpu_synid_compr:
+
+compr
+~~~~~
+
+Indicates if the data are compressed (data are not compressed by default).
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    compr                                    Data are compressed.
+    ======================================== ================================================
+
+.. _amdgpu_synid_vm:
+
+vm
+~~
+
+Specifies valid mask flag state (off by default).
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    vm                                       Set valid mask flag.
+    ======================================== ================================================
+
+FLAT Modifiers
+--------------
+
+.. _amdgpu_synid_flat_offset12:
+
+offset12
+~~~~~~~~
+
+Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
+
+Cannot be used with *global/scratch* opcodes. GFX9 only.
+
+    ================= ======================================================
+    Syntax            Description
+    ================= ======================================================
+    offset:{0..4095}  Specifies a 12-bit unsigned offset as a positive
+                      :ref:`integer number <amdgpu_synid_integer_number>`.
+    ================= ======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:4095
+  offset:0xff
+
+.. _amdgpu_synid_flat_offset13s:
+
+offset13s
+~~~~~~~~~
+
+Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
+
+Can be used with *global/scratch* opcodes only. GFX9 only.
+
+    ============================ =======================================================
+    Syntax                       Description
+    ============================ =======================================================
+    offset:{-4096..4095}         Specifies a 13-bit signed offset as an
+                                 :ref:`integer number <amdgpu_synid_integer_number>`.
+    ============================ =======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:-4000
+  offset:0x10
+
+.. _amdgpu_synid_flat_offset12s:
+
+offset12s
+~~~~~~~~~
+
+Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.
+
+Can be used with *global/scratch* opcodes only.
+
+GFX10 only.
+
+    ============================ =======================================================
+    Syntax                       Description
+    ============================ =======================================================
+    offset:{-2048..2047}         Specifies a 12-bit signed offset as an
+                                 :ref:`integer number <amdgpu_synid_integer_number>`.
+    ============================ =======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:-2000
+  offset:0x10
+
+.. _amdgpu_synid_flat_offset11:
+
+offset11
+~~~~~~~~
+
+Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.
+
+Cannot be used with *global/scratch* opcodes.
+
+GFX10 only.
+
+    ================= ======================================================
+    Syntax            Description
+    ================= ======================================================
+    offset:{0..2047}  Specifies an 11-bit unsigned offset as a positive
+                      :ref:`integer number <amdgpu_synid_integer_number>`.
+    ================= ======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:2047
+  offset:0xff
+
+dlc
+~~~
+
+See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+lds
+~~~
+
+See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only.
+
+slc
+~~~
+
+See a description :ref:`here<amdgpu_synid_slc>`.
+
+tfe
+~~~
+
+See a description :ref:`here<amdgpu_synid_tfe>`.
+
+nv
+~~
+
+See a description :ref:`here<amdgpu_synid_nv>`.
+
+MIMG Modifiers
+--------------
+
+.. _amdgpu_synid_dmask:
+
+dmask
+~~~~~
+
+Specifies which channels (image components) are used by the operation. By default, no channels
+are used.
+
+    =============== =====================================================
+    Syntax          Description
+    =============== =====================================================
+    dmask:{0..15}   Specifies image channels as a positive
+                    :ref:`integer number <amdgpu_synid_integer_number>`.
+
+                    Each bit corresponds to one of 4 image
+                    components (RGBA).
+
+                    If the specified bit value
+                    is 0, the component is not used, value 1 means
+                    that the component is used.
+    =============== =====================================================
+
+This modifier has some limitations depending on instruction kind:
+
+    =================================================== ========================
+    Instruction Kind                                    Valid dmask Values
+    =================================================== ========================
+    32-bit atomic *cmpswap*                             0x3
+    32-bit atomic instructions except for *cmpswap*     0x1
+    64-bit atomic *cmpswap*                             0xF
+    64-bit atomic instructions except for *cmpswap*     0x3
+    *gather4*                                           0x1, 0x2, 0x4, 0x8
+    Other instructions                                  any value
+    =================================================== ========================
+
+Examples:
+
+.. parsed-literal::
+
+  dmask:0xf
+  dmask:0b1111
+  dmask:3
+
+.. _amdgpu_synid_unorm:
+
+unorm
+~~~~~
+
+Specifies whether the address is normalized or not (the address is normalized by default).
+
+    ======================== ========================================
+    Syntax                   Description
+    ======================== ========================================
+    unorm                    Force the address to be unnormalized.
+    ======================== ========================================
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+slc
+~~~
+
+See a description :ref:`here<amdgpu_synid_slc>`.
+
+.. _amdgpu_synid_r128:
+
+r128
+~~~~
+
+Specifies texture resource size. The default size is 256 bits.
+
+GFX7, GFX8 and GFX10 only.
+
+    =================== ================================================
+    Syntax              Description
+    =================== ================================================
+    r128                Specifies 128 bits texture resource size.
+    =================== ================================================
+
+.. WARNING:: Using this modifier should descrease *rsrc* operand size from 8 to 4 dwords, but assembler does not currently support this feature.
+
+tfe
+~~~
+
+See a description :ref:`here<amdgpu_synid_tfe>`.
+
+.. _amdgpu_synid_lwe:
+
+lwe
+~~~
+
+Specifies LOD warning status (LOD warning is disabled by default).
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    lwe                                      Enables LOD warning.
+    ======================================== ================================================
+
+.. _amdgpu_synid_da:
+
+da
+~~
+
+Specifies if an array index must be sent to TA. By default, array index is not sent.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    da                                       Send an array-index to TA.
+    ======================================== ================================================
+
+.. _amdgpu_synid_d16:
+
+d16
+~~~
+
+Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    d16                                      Enables 16-bits data mode.
+
+                                             On loads, convert data in memory to 16-bit
+                                             format before storing it in VGPRs.
+
+                                             For stores, convert 16-bit data in VGPRs to
+                                             32 bits before going to memory.
+
+                                             Note that GFX8.0 does not support data packing.
+                                             Each 16-bit data element occupies 1 VGPR.
+
+                                             GFX8.1, GFX9 and GFX10 support data packing.
+                                             Each pair of 16-bit data elements 
+                                             occupies 1 VGPR.
+    ======================================== ================================================
+
+.. _amdgpu_synid_a16:
+
+a16
+~~~
+
+Specifies size of image address components: 16 or 32 bits (32 bits by default).
+GFX9 and GFX10 only.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    a16                                      Enables 16-bits image address components.
+    ======================================== ================================================
+
+.. _amdgpu_synid_dim:
+
+dim
+~~~
+
+Specifies surface dimension. This is a mandatory modifier. There is no default value.
+
+GFX10 only.
+
+    =============================== =========================================================
+    Syntax                          Description
+    =============================== =========================================================
+    dim:1D                          One-dimensional image.
+    dim:2D                          Two-dimensional image.
+    dim:3D                          Three-dimensional image.
+    dim:CUBE                        Cubemap array.
+    dim:1D_ARRAY                    One-dimensional image array.
+    dim:2D_ARRAY                    Two-dimensional image array.
+    dim:2D_MSAA                     Two-dimensional multi-sample auto-aliasing image.
+    dim:2D_MSAA_ARRAY               Two-dimensional multi-sample auto-aliasing image array.
+    =============================== =========================================================
+
+The following table defines an alternative syntax which is supported
+for compatibility with SP3 assembler:
+
+    =============================== =========================================================
+    Syntax                          Description
+    =============================== =========================================================
+    dim:SQ_RSRC_IMG_1D              One-dimensional image.
+    dim:SQ_RSRC_IMG_2D              Two-dimensional image.
+    dim:SQ_RSRC_IMG_3D              Three-dimensional image.
+    dim:SQ_RSRC_IMG_CUBE            Cubemap array.
+    dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array.
+    dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array.
+    dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample auto-aliasing image.
+    dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample auto-aliasing image array.
+    =============================== =========================================================
+
+dlc
+~~~
+
+See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+
+Miscellaneous Modifiers
+-----------------------
+
+.. _amdgpu_synid_dlc:
+
+dlc
+~~~
+
+Controls device level cache policy for memory operations. Used for synchronization.
+When specified, forces operation to bypass device level cache making the operation device
+level coherent. By default, instructions use device level cache.
+
+GFX10 only.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    dlc                                      Bypass device level cache.
+    ======================================== ================================================
+
+.. _amdgpu_synid_glc:
+
+glc
+~~~
+
+This modifier has different meaning for loads, stores, and atomic operations.
+The default value is off (0).
+
+See AMD documentation for details.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    glc                                      Set glc bit to 1.
+    ======================================== ================================================
+
+.. _amdgpu_synid_lds:
+
+lds
+~~~
+
+Specifies where to store the result: VGPRs or LDS (VGPRs by default).
+
+    ======================================== ===========================
+    Syntax                                   Description
+    ======================================== ===========================
+    lds                                      Store result in LDS.
+    ======================================== ===========================
+
+.. _amdgpu_synid_nv:
+
+nv
+~~
+
+Specifies if instruction is operating on non-volatile memory. By default, memory is volatile.
+
+GFX9 only.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    nv                                       Indicates that instruction operates on
+                                             non-volatile memory.
+    ======================================== ================================================
+
+.. _amdgpu_synid_slc:
+
+slc
+~~~
+
+Specifies cache policy. The default value is off (0).
+
+See AMD documentation for details.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    slc                                      Set slc bit to 1.
+    ======================================== ================================================
+
+.. _amdgpu_synid_tfe:
+
+tfe
+~~~
+
+Controls access to partially resident textures. The default value is off (0).
+
+See AMD documentation for details.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    tfe                                      Set tfe bit to 1.
+    ======================================== ================================================
+
+MUBUF/MTBUF Modifiers
+---------------------
+
+.. _amdgpu_synid_idxen:
+
+idxen
+~~~~~
+
+Specifies whether address components include an index. By default, no components are used.
+
+Can be used together with :ref:`offen<amdgpu_synid_offen>`.
+
+Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    idxen                                    Address components include an index.
+    ======================================== ================================================
+
+.. _amdgpu_synid_offen:
+
+offen
+~~~~~
+
+Specifies whether address components include an offset. By default, no components are used.
+
+Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
+
+Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    offen                                    Address components include an offset.
+    ======================================== ================================================
+
+.. _amdgpu_synid_addr64:
+
+addr64
+~~~~~~
+
+Specifies whether a 64-bit address is used. By default, no address is used.
+
+GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
+:ref:`idxen<amdgpu_synid_idxen>` modifiers.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    addr64                                   A 64-bit address is used.
+    ======================================== ================================================
+
+.. _amdgpu_synid_buf_offset12:
+
+offset12
+~~~~~~~~
+
+Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
+
+    =============================== ======================================================
+    Syntax                          Description
+    =============================== ======================================================
+    offset:{0..0xFFF}               Specifies a 12-bit unsigned offset as a positive
+                                    :ref:`integer number <amdgpu_synid_integer_number>`.
+    =============================== ======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  offset:0
+  offset:0x10
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+slc
+~~~
+
+See a description :ref:`here<amdgpu_synid_slc>`.
+
+lds
+~~~
+
+See a description :ref:`here<amdgpu_synid_lds>`.
+
+dlc
+~~~
+
+See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+
+tfe
+~~~
+
+See a description :ref:`here<amdgpu_synid_tfe>`.
+
+.. _amdgpu_synid_dfmt:
+
+dfmt
+~~~~
+
+TBD
+
+.. _amdgpu_synid_nfmt:
+
+nfmt
+~~~~
+
+TBD
+
+SMRD/SMEM Modifiers
+-------------------
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+nv
+~~
+
+See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only.
+
+dlc
+~~~
+
+See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+
+VINTRP Modifiers
+----------------
+
+.. _amdgpu_synid_high:
+
+high
+~~~~
+
+Specifies which half of the LDS word to use. Low half of LDS word is used by default.
+GFX9 and GFX10 only.
+
+    ======================================== ================================
+    Syntax                                   Description
+    ======================================== ================================
+    high                                     Use high half of LDS word.
+    ======================================== ================================
+
+DPP8 Modifiers
+--------------
+
+GFX10 only.
+
+.. _amdgpu_synid_dpp8_sel:
+
+dpp8_sel
+~~~~~~~~
+
+Selects which lane to pull data from, within a group of 8 lanes. This is a mandatory modifier.
+There is no default value.
+
+GFX10 only.
+
+The *dpp8_sel* modifier must specify exactly 8 values, each ranging from 0 to 7.
+First value selects which lane to read from to supply data into lane 0.
+Second value controls value for lane 1 and so on.
+
+    =============================================================== ===========================
+    Syntax                                                          Description
+    =============================================================== ===========================
+    dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from.
+    =============================================================== ===========================
+
+Examples:
+
+.. parsed-literal::
+
+  dpp8:[7,6,5,4,3,2,1,0]
+  dpp8:[0,1,0,1,0,1,0,1]
+
+.. _amdgpu_synid_fi8:
+
+fi
+~~
+
+Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
+
+Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
+
+GFX10 only.
+
+    ==================================== =====================================================
+    Syntax                               Description
+    ==================================== =====================================================
+    fi:0                                 Fetch zero when accessing data from inactive lanes.
+    fi:1                                 Fetch pre-exist values from inactive lanes.
+    ==================================== =====================================================
+
+DPP/DPP16 Modifiers
+-------------------
+
+GFX8, GFX9 and GFX10 only.
+
+.. _amdgpu_synid_dpp_ctrl:
+
+dpp_ctrl
+~~~~~~~~
+
+Specifies how data are shared between threads. This is a mandatory modifier.
+There is no default value.
+
+GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
+
+Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
+    row_mirror                               Mirror threads within row.
+    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
+    row_bcast:15                             Broadcast 15th thread of each row to next row.
+    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
+    wave_shl:1                               Wavefront left shift by 1 thread.
+    wave_rol:1                               Wavefront left rotate by 1 thread.
+    wave_shr:1                               Wavefront right shift by 1 thread.
+    wave_ror:1                               Wavefront right rotate by 1 thread.
+    row_shl:{1..15}                          Row shift left by 1-15 threads.
+    row_shr:{1..15}                          Row shift right by 1-15 threads.
+    row_ror:{1..15}                          Row rotate right by 1-15 threads.
+    ======================================== ================================================
+
+Note: Numeric parameters may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+  quad_perm:[0, 1, 2, 3]
+  row_shl:3
+
+.. _amdgpu_synid_dpp16_ctrl:
+
+dpp16_ctrl
+~~~~~~~~~~
+
+Specifies how data are shared between threads. This is a mandatory modifier.
+There is no default value.
+
+GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
+
+Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+(There are only two rows in *wave32* mode.)
+
+    ======================================== ====================================================
+    Syntax                                   Description
+    ======================================== ====================================================
+    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
+    row_mirror                               Mirror threads within row.
+    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
+    row_share:{0..15}                        Share the value from the specified lane with other
+                                             lanes in the row.
+    row_xmask:{0..15}                        Fetch from XOR(current lane id, specified lane id).
+    row_shl:{1..15}                          Row shift left by 1-15 threads.
+    row_shr:{1..15}                          Row shift right by 1-15 threads.
+    row_ror:{1..15}                          Row rotate right by 1-15 threads.
+    ======================================== ====================================================
+
+Note: Numeric parameters may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+  quad_perm:[0, 1, 2, 3]
+  row_shl:3
+
+.. _amdgpu_synid_row_mask:
+
+row_mask
+~~~~~~~~
+
+Controls which rows are enabled for data sharing. By default, all rows are enabled.
+
+Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+(There are only two rows in *wave32* mode.)
+
+    ======================================== =====================================================
+    Syntax                                   Description
+    ======================================== =====================================================
+    row_mask:{0..15}                         Specifies a *row mask* as a positive
+                                             :ref:`integer number <amdgpu_synid_integer_number>`.
+
+                                             Each of 4 bits in the mask controls one
+                                             row (0 - disabled, 1 - enabled).
+
+                                             In *wave32* mode the values should be limited to
+                                             {0..7}.
+    ======================================== =====================================================
+
+Examples:
+
+.. parsed-literal::
+
+  row_mask:0xf
+  row_mask:0b1010
+  row_mask:0b1111
+
+.. _amdgpu_synid_bank_mask:
+
+bank_mask
+~~~~~~~~~
+
+Controls which banks are enabled for data sharing. By default, all banks are enabled.
+
+Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+(There are only two rows in *wave32* mode.)
+
+    ======================================== =======================================================
+    Syntax                                   Description
+    ======================================== =======================================================
+    bank_mask:{0..15}                        Specifies a *bank mask* as a positive
+                                             :ref:`integer number <amdgpu_synid_integer_number>`.
+
+                                             Each of 4 bits in the mask controls one
+                                             bank (0 - disabled, 1 - enabled).
+    ======================================== =======================================================
+
+Examples:
+
+.. parsed-literal::
+
+  bank_mask:0x3
+  bank_mask:0b0011
+  bank_mask:0b1111
+
+.. _amdgpu_synid_bound_ctrl:
+
+bound_ctrl
+~~~~~~~~~~
+
+Controls data sharing when accessing an invalid lane. By default, data sharing with
+invalid lanes is disabled.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    bound_ctrl:0                             Enables data sharing with invalid lanes.
+
+                                             Accessing data from an invalid lane will
+                                             return zero.
+    ======================================== ================================================
+
+.. _amdgpu_synid_fi16:
+
+fi
+~~
+
+Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
+
+Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
+
+GFX10 only.
+
+    ======================================== ==================================================
+    Syntax                                   Description
+    ======================================== ==================================================
+    fi:0                                     Interaction with inactive lanes is controlled by
+                                             :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
+
+    fi:1                                     Fetch pre-exist values from inactive lanes.
+    ======================================== ==================================================
+
+SDWA Modifiers
+--------------
+
+GFX8, GFX9 and GFX10 only.
+
+clamp
+~~~~~
+
+See a description :ref:`here<amdgpu_synid_clamp>`.
+
+omod
+~~~~
+
+See a description :ref:`here<amdgpu_synid_omod>`.
+
+GFX9 and GFX10 only.
+
+.. _amdgpu_synid_dst_sel:
+
+dst_sel
+~~~~~~~
+
+Selects which bits in the destination are affected. By default, all bits are affected.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    dst_sel:DWORD                            Use bits 31:0.
+    dst_sel:BYTE_0                           Use bits 7:0.
+    dst_sel:BYTE_1                           Use bits 15:8.
+    dst_sel:BYTE_2                           Use bits 23:16.
+    dst_sel:BYTE_3                           Use bits 31:24.
+    dst_sel:WORD_0                           Use bits 15:0.
+    dst_sel:WORD_1                           Use bits 31:16.
+    ======================================== ================================================
+
+
+.. _amdgpu_synid_dst_unused:
+
+dst_unused
+~~~~~~~~~~
+
+Controls what to do with the bits in the destination which are not selected
+by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
+By default, unused bits are preserved.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    dst_unused:UNUSED_PAD                    Pad with zeros.
+    dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
+    dst_unused:UNUSED_PRESERVE               Preserve bits.
+    ======================================== ================================================
+
+.. _amdgpu_synid_src0_sel:
+
+src0_sel
+~~~~~~~~
+
+Controls which bits in the src0 are used. By default, all bits are used.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    src0_sel:DWORD                           Use bits 31:0.
+    src0_sel:BYTE_0                          Use bits 7:0.
+    src0_sel:BYTE_1                          Use bits 15:8.
+    src0_sel:BYTE_2                          Use bits 23:16.
+    src0_sel:BYTE_3                          Use bits 31:24.
+    src0_sel:WORD_0                          Use bits 15:0.
+    src0_sel:WORD_1                          Use bits 31:16.
+    ======================================== ================================================
+
+.. _amdgpu_synid_src1_sel:
+
+src1_sel
+~~~~~~~~
+
+Controls which bits in the src1 are used. By default, all bits are used.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    src1_sel:DWORD                           Use bits 31:0.
+    src1_sel:BYTE_0                          Use bits 7:0.
+    src1_sel:BYTE_1                          Use bits 15:8.
+    src1_sel:BYTE_2                          Use bits 23:16.
+    src1_sel:BYTE_3                          Use bits 31:24.
+    src1_sel:WORD_0                          Use bits 15:0.
+    src1_sel:WORD_1                          Use bits 31:16.
+    ======================================== ================================================
+
+.. _amdgpu_synid_sdwa_operand_modifiers:
+
+SDWA Operand Modifiers
+----------------------
+
+Operand modifiers are not used separately. They are applied to source operands.
+
+GFX8, GFX9 and GFX10 only.
+
+abs
+~~~
+
+See a description :ref:`here<amdgpu_synid_abs>`.
+
+neg
+~~~
+
+See a description :ref:`here<amdgpu_synid_neg>`.
+
+.. _amdgpu_synid_sext:
+
+sext
+~~~~
+
+Sign-extends value of a (sub-dword) operand to fill all 32 bits.
+Has no effect for 32-bit operands.
+
+Valid for integer operands only.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    sext(<operand>)                          Sign-extend operand value.
+    ======================================== ================================================
+
+Examples:
+
+.. parsed-literal::
+
+  sext(v4)
+  sext(v255)
+
+VOP3 Modifiers
+--------------
+
+.. _amdgpu_synid_vop3_op_sel:
+
+op_sel
+~~~~~~
+
+Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
+By default, low bits are used for all operands.
+
+The number of values specified with the op_sel modifier must match the number of instruction
+operands (both source and destination). First value controls src0, second value controls src1
+and so on, except that the last value controls destination.
+The value 0 selects the low bits, while 1 selects the high bits.
+
+Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
+by op_sel must be 0.
+
+GFX9 and GFX10 only.
+
+    ======================================== ============================================================
+    Syntax                                   Description
+    ======================================== ============================================================
+    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
+    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
+    op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
+    ======================================== ============================================================
+
+Examples:
+
+.. parsed-literal::
+
+  op_sel:[0,0]
+  op_sel:[0,1]
+
+.. _amdgpu_synid_clamp:
+
+clamp
+~~~~~
+
+Clamp meaning depends on instruction.
+
+For *v_cmp* instructions, clamp modifier indicates that the compare signals
+if a floating point exception occurs. By default, signaling is disabled.
+Not supported by GFX7.
+
+For integer operations, clamp modifier indicates that the result must be clamped
+to the largest and smallest representable value. By default, there is no clamping.
+Integer clamping is not supported by GFX7.
+
+For floating point operations, clamp modifier indicates that the result must be clamped
+to the range [0.0, 1.0]. By default, there is no clamping.
+
+Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    clamp                                    Enables clamping (or signaling).
+    ======================================== ================================================
+
+.. _amdgpu_synid_omod:
+
+omod
+~~~~
+
+Specifies if an output modifier must be applied to the result.
+By default, no output modifiers are applied.
+
+Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
+
+Output modifiers are valid for f32 and f64 floating point results only.
+They must not be used with f16.
+
+Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result
+but accepts output modifiers.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    mul:2                                    Multiply the result by 2.
+    mul:4                                    Multiply the result by 4.
+    div:2                                    Multiply the result by 0.5.
+    ======================================== ================================================
+
+.. _amdgpu_synid_vop3_operand_modifiers:
+
+VOP3 Operand Modifiers
+----------------------
+
+Operand modifiers are not used separately. They are applied to source operands.
+
+.. _amdgpu_synid_abs:
+
+abs
+~~~
+
+Computes absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
+Valid for floating point operands only.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    abs(<operand>)                           Get absolute value of operand.
+    \|<operand>|                             The same as above.
+    ======================================== ================================================
+
+Examples:
+
+.. parsed-literal::
+
+  abs(v36)
+  \|v36|
+
+.. _amdgpu_synid_neg:
+
+neg
+~~~
+
+Computes negative value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
+Valid for floating point operands only.
+
+    ======================================== ================================================
+    Syntax                                   Description
+    ======================================== ================================================
+    neg(<operand>)                           Get negative value of operand.
+    -<operand>                               The same as above.
+    ======================================== ================================================
+
+Examples:
+
+.. parsed-literal::
+
+  neg(v[0])
+  -v4
+
+VOP3P Modifiers
+---------------
+
+This section describes modifiers of *regular* VOP3P instructions.
+
+*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16*
+instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
+
+GFX9 and GFX10 only.
+
+.. _amdgpu_synid_op_sel:
+
+op_sel
+~~~~~~
+
+Selects the low [15:0] or high [31:16] operand bits as input to the operation
+which results in the lower-half of the destination.
+By default, low bits are used for all operands.
+
+The number of values specified by the *op_sel* modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 selects the low bits, while 1 selects the high bits.
+
+    ================================= =============================================================
+    Syntax                            Description
+    ================================= =============================================================
+    op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand.
+    op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
+    op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
+    ================================= =============================================================
+
+Examples:
+
+.. parsed-literal::
+
+  op_sel:[0,0]
+  op_sel:[0,1,0]
+
+.. _amdgpu_synid_op_sel_hi:
+
+op_sel_hi
+~~~~~~~~~
+
+Selects the low [15:0] or high [31:16] operand bits as input to the operation
+which results in the upper-half of the destination.
+By default, high bits are used for all operands.
+
+The number of values specified by the *op_sel_hi* modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 selects the low bits, while 1 selects the high bits.
+
+    =================================== =============================================================
+    Syntax                              Description
+    =================================== =============================================================
+    op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
+    op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
+    op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
+    =================================== =============================================================
+
+Examples:
+
+.. parsed-literal::
+
+  op_sel_hi:[0,0]
+  op_sel_hi:[0,0,1]
+
+.. _amdgpu_synid_neg_lo:
+
+neg_lo
+~~~~~~
+
+Specifies whether to change sign of operand values selected by
+:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
+as input to the operation which results in the upper-half of the destination.
+
+The number of values specified by this modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 indicates that the corresponding operand value is used unmodified,
+the value 1 indicates that negative value of the operand must be used.
+
+By default, operand values are used unmodified.
+
+This modifier is valid for floating point operands only.
+
+    ================================ ==================================================================
+    Syntax                           Description
+    ================================ ==================================================================
+    neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
+    neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
+    neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
+    ================================ ==================================================================
+
+Examples:
+
+.. parsed-literal::
+
+  neg_lo:[0]
+  neg_lo:[0,1]
+
+.. _amdgpu_synid_neg_hi:
+
+neg_hi
+~~~~~~
+
+Specifies whether to change sign of operand values selected by
+:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
+as input to the operation which results in the upper-half of the destination.
+
+The number of values specified by this modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 indicates that the corresponding operand value is used unmodified,
+the value 1 indicates that negative value of the operand must be used.
+
+By default, operand values are used unmodified.
+
+This modifier is valid for floating point operands only.
+
+    =============================== ==================================================================
+    Syntax                          Description
+    =============================== ==================================================================
+    neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
+    neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
+    neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
+    =============================== ==================================================================
+
+Examples:
+
+.. parsed-literal::
+
+  neg_hi:[1,0]
+  neg_hi:[0,1,1]
+
+clamp
+~~~~~
+
+See a description :ref:`here<amdgpu_synid_clamp>`.
+
+.. _amdgpu_synid_mad_mix:
+
+VOP3P V_MAD_MIX Modifiers
+-------------------------
+
+*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
+use *op_sel* and *op_sel_hi* modifiers 
+in a manner different from *regular* VOP3P instructions.
+
+See a description below.
+
+GFX9 and GFX10 only.
+
+.. _amdgpu_synid_mad_mix_op_sel:
+
+m_op_sel
+~~~~~~~~
+
+This operand has meaning only for 16-bit source operands as indicated by
+:ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
+It specifies to select either the low [15:0] or high [31:16] operand bits
+as input to the operation.
+
+The number of values specified by the *op_sel* modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 indicates the low bits, the value 1 indicates the high 16 bits.
+
+By default, low bits are used for all operands.
+
+    =============================== ================================================
+    Syntax                          Description
+    =============================== ================================================
+    op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
+    =============================== ================================================
+
+Examples:
+
+.. parsed-literal::
+
+  op_sel:[0,1]
+
+.. _amdgpu_synid_mad_mix_op_sel_hi:
+
+m_op_sel_hi
+~~~~~~~~~~~
+
+Selects the size of source operands: either 32 bits or 16 bits.
+By default, 32 bits are used for all source operands.
+
+The number of values specified by the *op_sel_hi* modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 indicates 32 bits, the value 1 indicates 16 bits.
+
+The location of 16 bits in the operand may be specified by
+:ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
+
+    ======================================== ====================================
+    Syntax                                   Description
+    ======================================== ====================================
+    op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
+    ======================================== ====================================
+
+Examples:
+
+.. parsed-literal::
+
+  op_sel_hi:[1,1,1]
+
+abs
+~~~
+
+See a description :ref:`here<amdgpu_synid_abs>`.
+
+neg
+~~~
+
+See a description :ref:`here<amdgpu_synid_neg>`.
+
+clamp
+~~~~~
+
+See a description :ref:`here<amdgpu_synid_clamp>`.

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPUOperandSyntax.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPUOperandSyntax.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPUOperandSyntax.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPUOperandSyntax.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1111 @@
+=====================================
+Syntax of AMDGPU Instruction Operands
+=====================================
+
+.. contents::
+   :local:
+
+Conventions
+===========
+
+The following notation is used throughout this document:
+
+    =================== =============================================================================
+    Notation            Description
+    =================== =============================================================================
+    {0..N}              Any integer value in the range from 0 to N (inclusive).
+    <x>                 Syntax and meaning of *x* is explained elsewhere.
+    =================== =============================================================================
+
+.. _amdgpu_syn_operands:
+
+Operands
+========
+
+.. _amdgpu_synid_v:
+
+v
+-
+
+Vector registers. There are 256 32-bit vector registers.
+
+A sequence of *vector* registers may be used to operate with more than 32 bits of data.
+
+Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* registers.
+
+    =================================================== ====================================================================
+    Syntax                                              Description
+    =================================================== ====================================================================
+    **v**\<N>                                           A single 32-bit *vector* register.
+
+                                                        *N* must be a decimal integer number.
+    **v[**\ <N>\ **]**                                  A single 32-bit *vector* register.
+
+                                                        *N* may be specified as an
+                                                        :ref:`integer number<amdgpu_synid_integer_number>`
+                                                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+    **v[**\ <N>:<K>\ **]**                              A sequence of (\ *K-N+1*\ ) *vector* registers.
+
+                                                        *N* and *K* may be specified as
+                                                        :ref:`integer numbers<amdgpu_synid_integer_number>`
+                                                        or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+    **[v**\ <N>, \ **v**\ <N+1>, ... **v**\ <K>\ **]**  A sequence of (\ *K-N+1*\ ) *vector* registers.
+
+                                                        Register indices must be specified as decimal integer numbers.
+    =================================================== ====================================================================
+
+Note. *N* and *K* must satisfy the following conditions:
+
+* *N* <= *K*.
+* 0 <= *N* <= 255.
+* 0 <= *K* <= 255.
+* *K-N+1* must be equal to 1, 2, 3, 4, 8 or 16.
+
+Examples:
+
+.. parsed-literal::
+
+  v255
+  v[0]
+  v[0:1]
+  v[1:1]
+  v[0:3]
+  v[2*2]
+  v[1-1:2-1]
+  [v252]
+  [v252,v253,v254,v255]
+
+.. _amdgpu_synid_nsa:
+
+*Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*:
+
+    =================================================== ====================================================================
+    Syntax                                              Description
+    =================================================== ====================================================================
+    **[v**\ <A>, \ **v**\ <B>, ... **v**\ <X>\ **]**    A sequence of *vector* registers. At least one register
+                                                        must be specified.
+
+                                                        In contrast with standard syntax described above, registers in
+                                                        this sequence are not required to have consecutive indices.
+                                                        Moreover, the same register may appear in the list more than once.
+    =================================================== ====================================================================
+
+Note. Reqister indices must be in the range 0..255. They must be specified as decimal integer numbers.
+
+Examples:
+
+.. parsed-literal::
+
+  [v32,v1,v2]
+  [v4,v4,v4,v4]
+
+.. _amdgpu_synid_s:
+
+s
+-
+
+Scalar 32-bit registers. The number of available *scalar* registers depends on GPU:
+
+    ======= ============================
+    GPU     Number of *scalar* registers
+    ======= ============================
+    GFX7    104
+    GFX8    102
+    GFX9    102
+    GFX10   106
+    ======= ============================
+
+A sequence of *scalar* registers may be used to operate with more than 32 bits of data.
+Assembler currently supports sequences of 1, 2, 4, 8 and 16 *scalar* registers.
+
+Pairs of *scalar* registers must be even-aligned (the first register must be even).
+Sequences of 4 and more *scalar* registers must be quad-aligned.
+
+    ======================================================== ====================================================================
+    Syntax                                                   Description
+    ======================================================== ====================================================================
+    **s**\ <N>                                               A single 32-bit *scalar* register.
+
+                                                             *N* must be a decimal integer number.
+    **s[**\ <N>\ **]**                                       A single 32-bit *scalar* register.
+
+                                                             *N* may be specified as an
+                                                             :ref:`integer number<amdgpu_synid_integer_number>`
+                                                             or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+    **s[**\ <N>:<K>\ **]**                                   A sequence of (\ *K-N+1*\ ) *scalar* registers.
+
+                                                             *N* and *K* may be specified as
+                                                             :ref:`integer numbers<amdgpu_synid_integer_number>`
+                                                             or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+    **[s**\ <N>, \ **s**\ <N+1>, ... **s**\ <K>\ **]**       A sequence of (\ *K-N+1*\ ) *scalar* registers.
+
+                                                             Register indices must be specified as decimal integer numbers.
+    ======================================================== ====================================================================
+
+Note. *N* and *K* must satisfy the following conditions:
+
+* *N* must be properly aligned based on sequence size.
+* *N* <= *K*.
+* 0 <= *N* < *SMAX*\ , where *SMAX* is the number of available *scalar* registers.
+* 0 <= *K* < *SMAX*\ , where *SMAX* is the number of available *scalar* registers.
+* *K-N+1* must be equal to 1, 2, 4, 8 or 16.
+
+Examples:
+
+.. parsed-literal::
+
+  s0
+  s[0]
+  s[0:1]
+  s[1:1]
+  s[0:3]
+  s[2*2]
+  s[1-1:2-1]
+  [s4]
+  [s4,s5,s6,s7]
+
+Examples of *scalar* registers with an invalid alignment:
+
+.. parsed-literal::
+
+  s[1:2]
+  s[2:5]
+
+.. _amdgpu_synid_trap:
+
+trap
+----
+
+A set of trap handler registers:
+
+* :ref:`ttmp<amdgpu_synid_ttmp>`
+* :ref:`tba<amdgpu_synid_tba>`
+* :ref:`tma<amdgpu_synid_tma>`
+
+.. _amdgpu_synid_ttmp:
+
+ttmp
+----
+
+Trap handler temporary scalar registers, 32-bits wide.
+The number of available *ttmp* registers depends on GPU:
+
+    ======= ===========================
+    GPU     Number of *ttmp* registers
+    ======= ===========================
+    GFX7    12
+    GFX8    12
+    GFX9    16
+    GFX10   16
+    ======= ===========================
+
+A sequence of *ttmp* registers may be used to operate with more than 32 bits of data.
+Assembler currently supports sequences of 1, 2, 4, 8 and 16 *ttmp* registers.
+
+Pairs of *ttmp* registers must be even-aligned (the first register must be even).
+Sequences of 4 and more *ttmp* registers must be quad-aligned.
+
+    ============================================================= ====================================================================
+    Syntax                                                        Description
+    ============================================================= ====================================================================
+    **ttmp**\ <N>                                                 A single 32-bit *ttmp* register.
+
+                                                                  *N* must be a decimal integer number.
+    **ttmp[**\ <N>\ **]**                                         A single 32-bit *ttmp* register.
+
+                                                                  *N* may be specified as an
+                                                                  :ref:`integer number<amdgpu_synid_integer_number>`
+                                                                  or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+    **ttmp[**\ <N>:<K>\ **]**                                     A sequence of (\ *K-N+1*\ ) *ttmp* registers.
+
+                                                                  *N* and *K* may be specified as
+                                                                  :ref:`integer numbers<amdgpu_synid_integer_number>`
+                                                                  or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+    **[ttmp**\ <N>, \ **ttmp**\ <N+1>, ... **ttmp**\ <K>\ **]**   A sequence of (\ *K-N+1*\ ) *ttmp* registers.
+
+                                                                  Register indices must be specified as decimal integer numbers.
+    ============================================================= ====================================================================
+
+Note. *N* and *K* must satisfy the following conditions:
+
+* *N* must be properly aligned based on sequence size.
+* *N* <= *K*.
+* 0 <= *N* < *TMAX*, where *TMAX* is the number of available *ttmp* registers.
+* 0 <= *K* < *TMAX*, where *TMAX* is the number of available *ttmp* registers.
+* *K-N+1* must be equal to 1, 2, 4, 8 or 16.
+
+Examples:
+
+.. parsed-literal::
+
+  ttmp0
+  ttmp[0]
+  ttmp[0:1]
+  ttmp[1:1]
+  ttmp[0:3]
+  ttmp[2*2]
+  ttmp[1-1:2-1]
+  [ttmp4]
+  [ttmp4,ttmp5,ttmp6,ttmp7]
+
+Examples of *ttmp* registers with an invalid alignment:
+
+.. parsed-literal::
+
+  ttmp[1:2]
+  ttmp[2:5]
+
+.. _amdgpu_synid_tba:
+
+tba
+---
+
+Trap base address, 64-bits wide. Holds the pointer to the current trap handler program.
+
+    ================== ======================================================================= =============
+    Syntax             Description                                                             Availability
+    ================== ======================================================================= =============
+    tba                64-bit *trap base address* register.                                    GFX7, GFX8
+    [tba]              64-bit *trap base address* register (an alternative syntax).            GFX7, GFX8
+    [tba_lo,tba_hi]    64-bit *trap base address* register (an alternative syntax).            GFX7, GFX8
+    ================== ======================================================================= =============
+
+High and low 32 bits of *trap base address* may be accessed as separate registers:
+
+    ================== ======================================================================= =============
+    Syntax             Description                                                             Availability
+    ================== ======================================================================= =============
+    tba_lo             Low 32 bits of *trap base address* register.                            GFX7, GFX8
+    tba_hi             High 32 bits of *trap base address* register.                           GFX7, GFX8
+    [tba_lo]           Low 32 bits of *trap base address* register (an alternative syntax).    GFX7, GFX8
+    [tba_hi]           High 32 bits of *trap base address* register (an alternative syntax).   GFX7, GFX8
+    ================== ======================================================================= =============
+
+Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10,
+but *tba* is readable/writable with the help of *s_get_reg* and *s_set_reg* instructions.
+
+.. _amdgpu_synid_tma:
+
+tma
+---
+
+Trap memory address, 64-bits wide.
+
+    ================= ======================================================================= ==================
+    Syntax            Description                                                             Availability
+    ================= ======================================================================= ==================
+    tma               64-bit *trap memory address* register.                                  GFX7, GFX8
+    [tma]             64-bit *trap memory address* register (an alternative syntax).          GFX7, GFX8
+    [tma_lo,tma_hi]   64-bit *trap memory address* register (an alternative syntax).          GFX7, GFX8
+    ================= ======================================================================= ==================
+
+High and low 32 bits of *trap memory address* may be accessed as separate registers:
+
+    ================= ======================================================================= ==================
+    Syntax            Description                                                             Availability
+    ================= ======================================================================= ==================
+    tma_lo            Low 32 bits of *trap memory address* register.                          GFX7, GFX8
+    tma_hi            High 32 bits of *trap memory address* register.                         GFX7, GFX8
+    [tma_lo]          Low 32 bits of *trap memory address* register (an alternative syntax).  GFX7, GFX8
+    [tma_hi]          High 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8
+    ================= ======================================================================= ==================
+
+Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10,
+but *tma* is readable/writable with the help of *s_get_reg* and *s_set_reg* instructions.
+
+.. _amdgpu_synid_flat_scratch:
+
+flat_scratch
+------------
+
+Flat scratch address, 64-bits wide. Holds the base address of scratch memory.
+
+    ================================== ================================================================
+    Syntax                             Description
+    ================================== ================================================================
+    flat_scratch                       64-bit *flat scratch* address register.
+    [flat_scratch]                     64-bit *flat scratch* address register (an alternative syntax).
+    [flat_scratch_lo,flat_scratch_hi]  64-bit *flat scratch* address register (an alternative syntax).
+    ================================== ================================================================
+
+High and low 32 bits of *flat scratch* address may be accessed as separate registers:
+
+    ========================= =========================================================================
+    Syntax                    Description
+    ========================= =========================================================================
+    flat_scratch_lo           Low 32 bits of *flat scratch* address register.
+    flat_scratch_hi           High 32 bits of *flat scratch* address register.
+    [flat_scratch_lo]         Low 32 bits of *flat scratch* address register (an alternative syntax).
+    [flat_scratch_hi]         High 32 bits of *flat scratch* address register (an alternative syntax).
+    ========================= =========================================================================
+
+.. _amdgpu_synid_xnack:
+
+xnack
+-----
+
+Xnack mask, 64-bits wide. Holds a 64-bit mask of which threads
+received an *XNACK* due to a vector memory operation.
+
+.. WARNING:: GFX7 does not support *xnack* feature. For availability of this feature in other GPUs, refer :ref:`this table<amdgpu-processors>`.
+
+\
+
+    ============================== =====================================================
+    Syntax                         Description
+    ============================== =====================================================
+    xnack_mask                     64-bit *xnack mask* register.
+    [xnack_mask]                   64-bit *xnack mask* register (an alternative syntax).
+    [xnack_mask_lo,xnack_mask_hi]  64-bit *xnack mask* register (an alternative syntax).
+    ============================== =====================================================
+
+High and low 32 bits of *xnack mask* may be accessed as separate registers:
+
+    ===================== ==============================================================
+    Syntax                Description
+    ===================== ==============================================================
+    xnack_mask_lo         Low 32 bits of *xnack mask* register.
+    xnack_mask_hi         High 32 bits of *xnack mask* register.
+    [xnack_mask_lo]       Low 32 bits of *xnack mask* register (an alternative syntax).
+    [xnack_mask_hi]       High 32 bits of *xnack mask* register (an alternative syntax).
+    ===================== ==============================================================
+
+.. _amdgpu_synid_vcc:
+.. _amdgpu_synid_vcc_lo:
+
+vcc
+---
+
+Vector condition code, 64-bits wide. A bit mask with one bit per thread;
+it holds the result of a vector compare operation.
+
+Note that GFX10 H/W does not use high 32 bits of *vcc* in *wave32* mode.
+
+    ================ =========================================================================
+    Syntax           Description
+    ================ =========================================================================
+    vcc              64-bit *vector condition code* register.
+    [vcc]            64-bit *vector condition code* register (an alternative syntax).
+    [vcc_lo,vcc_hi]  64-bit *vector condition code* register (an alternative syntax).
+    ================ =========================================================================
+
+High and low 32 bits of *vector condition code* may be accessed as separate registers:
+
+    ================ =========================================================================
+    Syntax           Description
+    ================ =========================================================================
+    vcc_lo           Low 32 bits of *vector condition code* register.
+    vcc_hi           High 32 bits of *vector condition code* register.
+    [vcc_lo]         Low 32 bits of *vector condition code* register (an alternative syntax).
+    [vcc_hi]         High 32 bits of *vector condition code* register (an alternative syntax).
+    ================ =========================================================================
+
+.. _amdgpu_synid_m0:
+
+m0
+--
+
+A 32-bit memory register. It has various uses,
+including register indexing and bounds checking.
+
+    =========== ===================================================
+    Syntax      Description
+    =========== ===================================================
+    m0          A 32-bit *memory* register.
+    [m0]        A 32-bit *memory* register (an alternative syntax).
+    =========== ===================================================
+
+.. _amdgpu_synid_exec:
+
+exec
+----
+
+Execute mask, 64-bits wide. A bit mask with one bit per thread,
+which is applied to vector instructions and controls which threads execute
+and which ignore the instruction.
+
+Note that GFX10 H/W does not use high 32 bits of *exec* in *wave32* mode.
+
+    ===================== =================================================================
+    Syntax                Description
+    ===================== =================================================================
+    exec                  64-bit *execute mask* register.
+    [exec]                64-bit *execute mask* register (an alternative syntax).
+    [exec_lo,exec_hi]     64-bit *execute mask* register (an alternative syntax).
+    ===================== =================================================================
+
+High and low 32 bits of *execute mask* may be accessed as separate registers:
+
+    ===================== =================================================================
+    Syntax                Description
+    ===================== =================================================================
+    exec_lo               Low 32 bits of *execute mask* register.
+    exec_hi               High 32 bits of *execute mask* register.
+    [exec_lo]             Low 32 bits of *execute mask* register (an alternative syntax).
+    [exec_hi]             High 32 bits of *execute mask* register (an alternative syntax).
+    ===================== =================================================================
+
+.. _amdgpu_synid_vccz:
+
+vccz
+----
+
+A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>` is all zeros.
+
+Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`.
+
+.. _amdgpu_synid_execz:
+
+execz
+-----
+
+A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>` is all zeros.
+
+Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo<amdgpu_synid_exec>`.
+
+.. _amdgpu_synid_scc:
+
+scc
+---
+
+A single bit flag indicating the result of a scalar compare operation.
+
+.. _amdgpu_synid_lds_direct:
+
+lds_direct
+----------
+
+A special operand which supplies a 32-bit value
+fetched from *LDS* memory using :ref:`m0<amdgpu_synid_m0>` as an address.
+
+.. _amdgpu_synid_null:
+
+null
+----
+
+This is a special operand which may be used as a source or a destination.
+
+When used as a destination, the result of the operation is discarded.
+
+When used as a source, it supplies zero value.
+
+GFX10 only.
+
+.. WARNING:: Due to a H/W bug, this operand cannot be used with VALU instructions in first generation of GFX10.
+
+.. _amdgpu_synid_constant:
+
+constant
+--------
+
+A set of integer and floating-point *inline* constants and values:
+
+* :ref:`iconst<amdgpu_synid_iconst>`
+* :ref:`fconst<amdgpu_synid_fconst>`
+* :ref:`ival<amdgpu_synid_ival>`
+
+In contrast with :ref:`literals<amdgpu_synid_literal>`, these operands are encoded as a part of instruction.
+
+If a number may be encoded as either
+a :ref:`literal<amdgpu_synid_literal>` or 
+a :ref:`constant<amdgpu_synid_constant>`,
+assembler selects the latter encoding as more efficient.
+
+.. _amdgpu_synid_iconst:
+
+iconst
+~~~~~~
+
+An :ref:`integer number<amdgpu_synid_integer_number>`
+encoded as an *inline constant*.
+
+Only a small fraction of integer numbers may be encoded as *inline constants*.
+They are enumerated in the table below.
+Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
+
+Integer *inline constants* are converted to
+:ref:`expected operand type<amdgpu_syn_instruction_type>`
+as described :ref:`here<amdgpu_synid_int_const_conv>`.
+
+    ================================== ====================================
+    Value                              Note
+    ================================== ====================================
+    {0..64}                            Positive integer inline constants.
+    {-16..-1}                          Negative integer inline constants.
+    ================================== ====================================
+
+.. WARNING:: GFX7 does not support inline constants for *f16* operands.
+
+.. _amdgpu_synid_fconst:
+
+fconst
+~~~~~~
+
+A :ref:`floating-point number<amdgpu_synid_floating-point_number>`
+encoded as an *inline constant*.
+
+Only a small fraction of floating-point numbers may be encoded as *inline constants*.
+They are enumerated in the table below.
+Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
+
+Floating-point *inline constants* are converted to
+:ref:`expected operand type<amdgpu_syn_instruction_type>`
+as described :ref:`here<amdgpu_synid_fp_const_conv>`.
+
+    ===================== ===================================================== ==================
+    Value                 Note                                                  Availability
+    ===================== ===================================================== ==================
+    0.0                   The same as integer constant 0.                       All GPUs
+    0.5                   Floating-point constant 0.5                           All GPUs
+    1.0                   Floating-point constant 1.0                           All GPUs
+    2.0                   Floating-point constant 2.0                           All GPUs
+    4.0                   Floating-point constant 4.0                           All GPUs
+    -0.5                  Floating-point constant -0.5                          All GPUs
+    -1.0                  Floating-point constant -1.0                          All GPUs
+    -2.0                  Floating-point constant -2.0                          All GPUs
+    -4.0                  Floating-point constant -4.0                          All GPUs
+    0.1592                1.0/(2.0*pi). Use only for 16-bit operands.           GFX8, GFX9, GFX10
+    0.15915494            1.0/(2.0*pi). Use only for 16- and 32-bit operands.   GFX8, GFX9, GFX10
+    0.15915494309189532   1.0/(2.0*pi).                                         GFX8, GFX9, GFX10
+    ===================== ===================================================== ==================
+
+.. WARNING:: GFX7 does not support inline constants for *f16* operands.
+
+.. _amdgpu_synid_ival:
+
+ival
+~~~~
+
+A symbolic operand encoded as an *inline constant*.
+These operands provide read-only access to H/W registers.
+
+    ======================== ================================================ =============
+    Syntax                   Note                                             Availability
+    ======================== ================================================ =============
+    shared_base              Base address of shared memory region.            GFX9, GFX10
+    shared_limit             Address of the end of shared memory region.      GFX9, GFX10
+    private_base             Base address of private memory region.           GFX9, GFX10
+    private_limit            Address of the end of private memory region.     GFX9, GFX10
+    pops_exiting_wave_id     A dedicated counter for POPS.                    GFX9, GFX10
+    ======================== ================================================ =============
+
+.. _amdgpu_synid_literal:
+
+literal
+-------
+
+A literal is a 64-bit value which is encoded as a separate 32-bit dword in the instruction stream.
+
+If a number may be encoded as either
+a :ref:`literal<amdgpu_synid_literal>` or 
+an :ref:`inline constant<amdgpu_synid_constant>`,
+assembler selects the latter encoding as more efficient.
+
+Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`,
+:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or
+:ref:`expressions<amdgpu_synid_expression>`
+(expressions are currently supported for 32-bit operands only).
+
+A 64-bit literal value is converted by assembler
+to an :ref:`expected operand type<amdgpu_syn_instruction_type>`
+as described :ref:`here<amdgpu_synid_lit_conv>`.
+
+An instruction may use only one literal but several operands may refer the same literal.
+
+.. _amdgpu_synid_uimm8:
+
+uimm8
+-----
+
+A 8-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
+The value is encoded as part of the opcode so it is free to use.
+
+.. _amdgpu_synid_uimm32:
+
+uimm32
+------
+
+A 32-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
+The value is stored as a separate 32-bit dword in the instruction stream.
+
+.. _amdgpu_synid_uimm20:
+
+uimm20
+------
+
+A 20-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
+
+.. _amdgpu_synid_uimm21:
+
+uimm21
+------
+
+A 21-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
+
+.. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
+
+.. _amdgpu_synid_simm21:
+
+simm21
+------
+
+A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`.
+
+.. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
+
+.. _amdgpu_synid_off:
+
+off
+---
+
+A special entity which indicates that the value of this operand is not used.
+
+    ================================== ===================================================
+    Syntax                             Description
+    ================================== ===================================================
+    off                                Indicates an unused operand.
+    ================================== ===================================================
+
+
+.. _amdgpu_synid_number:
+
+Numbers
+=======
+
+.. _amdgpu_synid_integer_number:
+
+Integer Numbers
+---------------
+
+Integer numbers are 64 bits wide.
+They may be specified in binary, octal, hexadecimal and decimal formats:
+
+    ============== ====================================
+    Format         Syntax
+    ============== ====================================
+    Decimal        [-]?[1-9][0-9]*
+    Binary         [-]?0b[01]+
+    Octal          [-]?0[0-7]+
+    Hexadecimal    [-]?0x[0-9a-fA-F]+
+    \              [-]?[0x]?[0-9][0-9a-fA-F]*[hH]
+    ============== ====================================
+
+Examples:
+
+.. parsed-literal::
+
+  -1234
+  0b1010
+  010
+  0xff
+  0ffh
+
+.. _amdgpu_synid_floating-point_number:
+
+Floating-Point Numbers
+----------------------
+
+All floating-point numbers are handled as double (64 bits wide).
+
+Floating-point numbers may be specified in hexadecimal and decimal formats:
+
+    ============== ======================================================== ========================================================
+    Format         Syntax                                                   Note
+    ============== ======================================================== ========================================================
+    Decimal        [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)?                    Must include either a decimal separator or an exponent.
+    Hexadecimal    [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+
+    ============== ======================================================== ========================================================
+
+Examples:
+
+.. parsed-literal::
+
+ -1.234
+ 234e2
+ -0x1afp-10
+ 0x.1afp10
+
+.. _amdgpu_synid_expression:
+
+Expressions
+===========
+
+An expression specifies an address or a numeric value.
+There are two kinds of expressions:
+
+* :ref:`Absolute<amdgpu_synid_absolute_expression>`.
+* :ref:`Relocatable<amdgpu_synid_relocatable_expression>`.
+
+.. _amdgpu_synid_absolute_expression:
+
+Absolute Expressions
+--------------------
+
+The value of an absolute expression remains the same after program relocation.
+Absolute expressions must not include unassigned and relocatable values
+such as labels.
+
+Examples:
+
+.. parsed-literal::
+
+    x = -1
+    y = x + 10
+
+.. _amdgpu_synid_relocatable_expression:
+
+Relocatable Expressions
+-----------------------
+
+The value of a relocatable expression depends on program relocation.
+
+Note that use of relocatable expressions is limited with branch targets
+and 32-bit :ref:`literals<amdgpu_synid_literal>`.
+
+Addition information about relocation may be found :ref:`here<amdgpu-relocation-records>`.
+
+Examples:
+
+.. parsed-literal::
+
+    y = x + 10 // x is not yet defined. Undefined symbols are assumed to be PC-relative.
+    z = .
+
+Expression Data Type
+--------------------
+
+Expressions and operands of expressions are interpreted as 64-bit integers.
+
+Expressions may include 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>` (double).
+However these operands are also handled as 64-bit integers
+using binary representation of specified floating-point numbers.
+No conversion from floating-point to integer is performed.
+
+Examples:
+
+.. parsed-literal::
+
+    x = 0.1    // x is assigned an integer 4591870180066957722 which is a binary representation of 0.1.
+    y = x + x  // y is a sum of two integer values; it is not equal to 0.2!
+
+Syntax
+------
+
+Expressions are composed of
+:ref:`symbols<amdgpu_synid_symbol>`,
+:ref:`integer numbers<amdgpu_synid_integer_number>`,
+:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
+:ref:`binary operators<amdgpu_synid_expression_bin_op>`,
+:ref:`unary operators<amdgpu_synid_expression_un_op>` and subexpressions.
+
+Expressions may also use "." which is a reference to the current PC (program counter).
+
+The syntax of expressions is shown below::
+
+    expr ::= expr binop expr | primaryexpr ;
+
+    primaryexpr ::= '(' expr ')' | symbol | number | '.' | unop primaryexpr ;
+
+    binop ::= '&&'
+            | '||'
+            | '|'
+            | '^'
+            | '&'
+            | '!'
+            | '=='
+            | '!='
+            | '<>'
+            | '<'
+            | '<='
+            | '>'
+            | '>='
+            | '<<'
+            | '>>'
+            | '+'
+            | '-'
+            | '*'
+            | '/'
+            | '%' ;
+
+    unop ::= '~'
+           | '+'
+           | '-'
+           | '!' ;
+
+.. _amdgpu_synid_expression_bin_op:
+
+Binary Operators
+----------------
+
+Binary operators are described in the following table.
+They operate on and produce 64-bit integers.
+Operators with higher priority are performed first.
+
+    ========== ========= ===============================================
+    Operator   Priority  Meaning
+    ========== ========= ===============================================
+       \*         5      Integer multiplication.
+       /          5      Integer division.
+       %          5      Integer signed remainder.
+       \+         4      Integer addition.
+       \-         4      Integer subtraction.
+       <<         3      Integer shift left.
+       >>         3      Logical shift right.
+       ==         2      Equality comparison.
+       !=         2      Inequality comparison.
+       <>         2      Inequality comparison.
+       <          2      Signed less than comparison.
+       <=         2      Signed less than or equal comparison.
+       >          2      Signed greater than comparison.
+       >=         2      Signed greater than or equal comparison.
+      \|          1      Bitwise or.
+       ^          1      Bitwise xor.
+       &          1      Bitwise and.
+       &&         0      Logical and.
+       ||         0      Logical or.
+    ========== ========= ===============================================
+
+.. _amdgpu_synid_expression_un_op:
+
+Unary Operators
+---------------
+
+Unary operators are described in the following table.
+They operate on and produce 64-bit integers.
+
+    ========== ===============================================
+    Operator   Meaning
+    ========== ===============================================
+       !       Logical negation.
+       ~       Bitwise negation.
+       \+      Integer unary plus.
+       \-      Integer unary minus.
+    ========== ===============================================
+
+.. _amdgpu_synid_symbol:
+
+Symbols
+-------
+
+A symbol is a named 64-bit value, representing a relocatable
+address or an absolute (non-relocatable) number.
+
+Symbol names have the following syntax:
+    ``[a-zA-Z_.][a-zA-Z0-9_$.@]*``
+
+The table below provides several examples of syntax used for symbol definition.
+
+    ================ ==========================================================
+    Syntax           Meaning
+    ================ ==========================================================
+    .globl <S>       Declares a global symbol S without assigning it a value.
+    .set <S>, <E>    Assigns the value of an expression E to a symbol S.
+    <S> = <E>        Assigns the value of an expression E to a symbol S.
+    <S>:             Declares a label S and assigns it the current PC value.
+    ================ ==========================================================
+
+A symbol may be used before it is declared or assigned;
+unassigned symbols are assumed to be PC-relative.
+
+Addition information about symbols may be found :ref:`here<amdgpu-symbols>`.
+
+.. _amdgpu_synid_conv:
+
+Conversions
+===========
+
+This section describes what happens when a 64-bit
+:ref:`integer number<amdgpu_synid_integer_number>`, a
+:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or a
+:ref:`symbol<amdgpu_synid_symbol>`
+is used for an operand which has a different type or size.
+
+Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W:
+
+* Values encoded as :ref:`inline constants<amdgpu_synid_constant>` are handled by H/W.
+* Values encoded as :ref:`literals<amdgpu_synid_literal>` are converted by assembler.
+
+.. _amdgpu_synid_const_conv:
+
+Inline Constants
+----------------
+
+.. _amdgpu_synid_int_const_conv:
+
+Integer Inline Constants
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Integer :ref:`inline constants<amdgpu_synid_constant>`
+may be thought of as 64-bit
+:ref:`integer numbers<amdgpu_synid_integer_number>`;
+when used as operands they are truncated to the size of
+:ref:`expected operand type<amdgpu_syn_instruction_type>`.
+No data type conversions are performed.
+
+Examples:
+
+.. parsed-literal::
+
+    // GFX9
+
+    v_add_u16 v0, -1, 0    // v0 = 0xFFFF
+    v_add_f16 v0, -1, 0    // v0 = 0xFFFF (NaN)
+
+    v_add_u32 v0, -1, 0    // v0 = 0xFFFFFFFF
+    v_add_f32 v0, -1, 0    // v0 = 0xFFFFFFFF (NaN)
+
+.. _amdgpu_synid_fp_const_conv:
+
+Floating-Point Inline Constants
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Floating-point :ref:`inline constants<amdgpu_synid_constant>`
+may be thought of as 64-bit
+:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`;
+when used as operands they are converted to a floating-point number of
+:ref:`expected operand size<amdgpu_syn_instruction_type>`.
+
+Examples:
+
+.. parsed-literal::
+
+    // GFX9
+
+    v_add_f16 v0, 1.0, 0    // v0 = 0x3C00 (1.0)
+    v_add_u16 v0, 1.0, 0    // v0 = 0x3C00
+
+    v_add_f32 v0, 1.0, 0    // v0 = 0x3F800000 (1.0)
+    v_add_u32 v0, 1.0, 0    // v0 = 0x3F800000
+
+
+.. _amdgpu_synid_lit_conv:
+
+Literals
+--------
+
+.. _amdgpu_synid_int_lit_conv:
+
+Integer Literals
+~~~~~~~~~~~~~~~~
+
+Integer :ref:`literals<amdgpu_synid_literal>`
+are specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>`.
+
+When used as operands they are converted to
+:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.
+
+    ============== ============== =============== ====================================================================
+    Expected type  Condition      Result          Note
+    ============== ============== =============== ====================================================================
+    i16, u16, b16  cond(num,16)   num.u16         Truncate to 16 bits.
+    i32, u32, b32  cond(num,32)   num.u32         Truncate to 32 bits.
+    i64            cond(num,32)   {-1,num.i32}    Truncate to 32 bits and then sign-extend the result to 64 bits.
+    u64, b64       cond(num,32)   { 0,num.u32}    Truncate to 32 bits and then zero-extend the result to 64 bits.
+    f16            cond(num,16)   num.u16         Use low 16 bits as an f16 value.
+    f32            cond(num,32)   num.u32         Use low 32 bits as an f32 value.
+    f64            cond(num,32)   {num.u32,0}     Use low 32 bits of the number as high 32 bits
+                                                  of the result; low 32 bits of the result are zeroed.
+    ============== ============== =============== ====================================================================
+
+The condition *cond(X,S)* indicates if a 64-bit number *X*
+can be converted to a smaller size *S* by truncation of upper bits.
+There are two cases when the conversion is possible:
+
+* The truncated bits are all 0.
+* The truncated bits are all 1 and the value after truncation has its MSB bit set.
+
+Examples of valid literals:
+
+.. parsed-literal::
+
+    // GFX9
+                                             // Literal value after conversion:
+    v_add_u16 v0, 0xff00, v0                 //   0xff00
+    v_add_u16 v0, 0xffffffffffffff00, v0     //   0xff00
+    v_add_u16 v0, -256, v0                   //   0xff00
+                                             // Literal value after conversion:
+    s_bfe_i64 s[0:1], 0xffefffff, s3         //   0xffffffffffefffff
+    s_bfe_u64 s[0:1], 0xffefffff, s3         //   0x00000000ffefffff
+    v_ceil_f64_e32 v[0:1], 0xffefffff        //   0xffefffff00000000 (-1.7976922776554302e308)
+
+Examples of invalid literals:
+
+.. parsed-literal::
+
+    // GFX9
+
+    v_add_u16 v0, 0x1ff00, v0               // truncated bits are not all 0 or 1
+    v_add_u16 v0, 0xffffffffffff00ff, v0    // truncated bits do not match MSB of the result
+
+.. _amdgpu_synid_fp_lit_conv:
+
+Floating-Point Literals
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Floating-point :ref:`literals<amdgpu_synid_literal>` are specified as 64-bit
+:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.
+
+When used as operands they are converted to
+:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.
+
+    ============== ============== ================= =================================================================
+    Expected type  Condition      Result            Note
+    ============== ============== ================= =================================================================
+    i16, u16, b16  cond(num,16)   f16(num)          Convert to f16 and use bits of the result as an integer value.
+    i32, u32, b32  cond(num,32)   f32(num)          Convert to f32 and use bits of the result as an integer value.
+    i64, u64, b64  false          \-                Conversion disabled because of an unclear semantics.
+    f16            cond(num,16)   f16(num)          Convert to f16.
+    f32            cond(num,32)   f32(num)          Convert to f32.
+    f64            true           {num.u32.hi,0}    Use high 32 bits of the number as high 32 bits of the result;
+                                                    zero-fill low 32 bits of the result.
+
+                                                    Note that the result may differ from the original number.
+    ============== ============== ================= =================================================================
+
+The condition *cond(X,S)* indicates if an f64 number *X* can be converted
+to a smaller *S*-bit floating-point type without overflow or underflow.
+Precision lost is allowed.
+
+Examples of valid literals:
+
+.. parsed-literal::
+
+    // GFX9
+
+    v_add_f16 v1, 65500.0, v2
+    v_add_f32 v1, 65600.0, v2
+
+    // Literal value before conversion: 1.7976931348623157e308 (0x7fefffffffffffff)
+    // Literal value after conversion:  1.7976922776554302e308 (0x7fefffff00000000)
+    v_ceil_f64 v[0:1], 1.7976931348623157e308
+
+Examples of invalid literals:
+
+.. parsed-literal::
+
+    // GFX9
+
+    v_add_f16 v1, 65600.0, v2    // overflow
+
+.. _amdgpu_synid_exp_conv:
+
+Expressions
+~~~~~~~~~~~
+
+Expressions operate with and result in 64-bit integers.
+
+When used as operands they are truncated to
+:ref:`expected operand size<amdgpu_syn_instruction_type>`.
+No data type conversions are performed.
+
+Examples:
+
+.. parsed-literal::
+
+    // GFX9
+
+    x = 0.1
+    v_sqrt_f32 v0, x           // v0 = [low 32 bits of 0.1 (double)]
+    v_sqrt_f32 v0, (0.1 + 0)   // the same as above
+    v_sqrt_f32 v0, 0.1         // v0 = [0.1 (double) converted to float]
+

Added: www-releases/trunk/9.0.0/docs/_sources/AMDGPUUsage.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AMDGPUUsage.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AMDGPUUsage.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AMDGPUUsage.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,6504 @@
+=============================
+User Guide for AMDGPU Backend
+=============================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+The AMDGPU backend provides ISA code generation for AMD GPUs, starting with the
+R600 family up until the current GCN families. It lives in the
+``lib/Target/AMDGPU`` directory.
+
+LLVM
+====
+
+.. _amdgpu-target-triples:
+
+Target Triples
+--------------
+
+Use the ``clang -target <Architecture>-<Vendor>-<OS>-<Environment>`` option to
+specify the target triple:
+
+  .. table:: AMDGPU Architectures
+     :name: amdgpu-architecture-table
+
+     ============ ==============================================================
+     Architecture Description
+     ============ ==============================================================
+     ``r600``     AMD GPUs HD2XXX-HD6XXX for graphics and compute shaders.
+     ``amdgcn``   AMD GPUs GCN GFX6 onwards for graphics and compute shaders.
+     ============ ==============================================================
+
+  .. table:: AMDGPU Vendors
+     :name: amdgpu-vendor-table
+
+     ============ ==============================================================
+     Vendor       Description
+     ============ ==============================================================
+     ``amd``      Can be used for all AMD GPU usage.
+     ``mesa3d``   Can be used if the OS is ``mesa3d``.
+     ============ ==============================================================
+
+  .. table:: AMDGPU Operating Systems
+     :name: amdgpu-os-table
+
+     ============== ============================================================
+     OS             Description
+     ============== ============================================================
+     *<empty>*      Defaults to the *unknown* OS.
+     ``amdhsa``     Compute kernels executed on HSA [HSA]_ compatible runtimes
+                    such as AMD's ROCm [AMD-ROCm]_.
+     ``amdpal``     Graphic shaders and compute kernels executed on AMD PAL
+                    runtime.
+     ``mesa3d``     Graphic shaders and compute kernels executed on Mesa 3D
+                    runtime.
+     ============== ============================================================
+
+  .. table:: AMDGPU Environments
+     :name: amdgpu-environment-table
+
+     ============ ==============================================================
+     Environment  Description
+     ============ ==============================================================
+     *<empty>*    Default.
+     ============ ==============================================================
+
+.. _amdgpu-processors:
+
+Processors
+----------
+
+Use the ``clang -mcpu <Processor>`` option to specify the AMD GPU processor. The
+names from both the *Processor* and *Alternative Processor* can be used.
+
+  .. table:: AMDGPU Processors
+     :name: amdgpu-processor-table
+
+     =========== =============== ============ ===== ================= ======= ======================
+     Processor   Alternative     Target       dGPU/ Target            ROCm    Example
+                 Processor       Triple       APU   Features          Support Products
+                                 Architecture       Supported
+                                                    [Default]
+     =========== =============== ============ ===== ================= ======= ======================
+     **Radeon HD 2000/3000 Series (R600)** [AMD-RADEON-HD-2000-3000]_
+     -----------------------------------------------------------------------------------------------
+     ``r600``                    ``r600``     dGPU
+     ``r630``                    ``r600``     dGPU
+     ``rs880``                   ``r600``     dGPU
+     ``rv670``                   ``r600``     dGPU
+     **Radeon HD 4000 Series (R700)** [AMD-RADEON-HD-4000]_
+     -----------------------------------------------------------------------------------------------
+     ``rv710``                   ``r600``     dGPU
+     ``rv730``                   ``r600``     dGPU
+     ``rv770``                   ``r600``     dGPU
+     **Radeon HD 5000 Series (Evergreen)** [AMD-RADEON-HD-5000]_
+     -----------------------------------------------------------------------------------------------
+     ``cedar``                   ``r600``     dGPU
+     ``cypress``                 ``r600``     dGPU
+     ``juniper``                 ``r600``     dGPU
+     ``redwood``                 ``r600``     dGPU
+     ``sumo``                    ``r600``     dGPU
+     **Radeon HD 6000 Series (Northern Islands)** [AMD-RADEON-HD-6000]_
+     -----------------------------------------------------------------------------------------------
+     ``barts``                   ``r600``     dGPU
+     ``caicos``                  ``r600``     dGPU
+     ``cayman``                  ``r600``     dGPU
+     ``turks``                   ``r600``     dGPU
+     **GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
+     -----------------------------------------------------------------------------------------------
+     ``gfx600``  - ``tahiti``    ``amdgcn``   dGPU
+     ``gfx601``  - ``hainan``    ``amdgcn``   dGPU
+                 - ``oland``
+                 - ``pitcairn``
+                 - ``verde``
+     **GCN GFX7 (Sea Islands (CI))** [AMD-GCN-GFX7]_
+     -----------------------------------------------------------------------------------------------
+     ``gfx700``  - ``kaveri``    ``amdgcn``   APU                             - A6-7000
+                                                                              - A6 Pro-7050B
+                                                                              - A8-7100
+                                                                              - A8 Pro-7150B
+                                                                              - A10-7300
+                                                                              - A10 Pro-7350B
+                                                                              - FX-7500
+                                                                              - A8-7200P
+                                                                              - A10-7400P
+                                                                              - FX-7600P
+     ``gfx701``  - ``hawaii``    ``amdgcn``   dGPU                    ROCm    - FirePro W8100
+                                                                              - FirePro W9100
+                                                                              - FirePro S9150
+                                                                              - FirePro S9170
+     ``gfx702``                  ``amdgcn``   dGPU                    ROCm    - Radeon R9 290
+                                                                              - Radeon R9 290x
+                                                                              - Radeon R390
+                                                                              - Radeon R390x
+     ``gfx703``  - ``kabini``    ``amdgcn``   APU                             - E1-2100
+                 - ``mullins``                                                - E1-2200
+                                                                              - E1-2500
+                                                                              - E2-3000
+                                                                              - E2-3800
+                                                                              - A4-5000
+                                                                              - A4-5100
+                                                                              - A6-5200
+                                                                              - A4 Pro-3340B
+     ``gfx704``  - ``bonaire``   ``amdgcn``   dGPU                            - Radeon HD 7790
+                                                                              - Radeon HD 8770
+                                                                              - R7 260
+                                                                              - R7 260X
+     **GCN GFX8 (Volcanic Islands (VI))** [AMD-GCN-GFX8]_
+     -----------------------------------------------------------------------------------------------
+     ``gfx801``  - ``carrizo``   ``amdgcn``   APU   - xnack                   - A6-8500P
+                                                      [on]                    - Pro A6-8500B
+                                                                              - A8-8600P
+                                                                              - Pro A8-8600B
+                                                                              - FX-8800P
+                                                                              - Pro A12-8800B
+     \                           ``amdgcn``   APU   - xnack           ROCm    - A10-8700P
+                                                      [on]                    - Pro A10-8700B
+                                                                              - A10-8780P
+     \                           ``amdgcn``   APU   - xnack                   - A10-9600P
+                                                      [on]                    - A10-9630P
+                                                                              - A12-9700P
+                                                                              - A12-9730P
+                                                                              - FX-9800P
+                                                                              - FX-9830P
+     \                           ``amdgcn``   APU   - xnack                   - E2-9010
+                                                      [on]                    - A6-9210
+                                                                              - A9-9410
+     ``gfx802``  - ``iceland``   ``amdgcn``   dGPU  - xnack           ROCm    - FirePro S7150
+                 - ``tonga``                          [off]                   - FirePro S7100
+                                                                              - FirePro W7100
+                                                                              - Radeon R285
+                                                                              - Radeon R9 380
+                                                                              - Radeon R9 385
+                                                                              - Mobile FirePro
+                                                                                M7170
+     ``gfx803``  - ``fiji``      ``amdgcn``   dGPU  - xnack           ROCm    - Radeon R9 Nano
+                                                      [off]                   - Radeon R9 Fury
+                                                                              - Radeon R9 FuryX
+                                                                              - Radeon Pro Duo
+                                                                              - FirePro S9300x2
+                                                                              - Radeon Instinct MI8
+     \           - ``polaris10`` ``amdgcn``   dGPU  - xnack           ROCm    - Radeon RX 470
+                                                      [off]                   - Radeon RX 480
+                                                                              - Radeon Instinct MI6
+     \           - ``polaris11`` ``amdgcn``   dGPU  - xnack           ROCm    - Radeon RX 460
+                                                      [off]
+     ``gfx810``  - ``stoney``    ``amdgcn``   APU   - xnack
+                                                      [on]
+     **GCN GFX9** [AMD-GCN-GFX9]_
+     -----------------------------------------------------------------------------------------------
+     ``gfx900``                  ``amdgcn``   dGPU  - xnack           ROCm    - Radeon Vega
+                                                      [off]                     Frontier Edition
+                                                                              - Radeon RX Vega 56
+                                                                              - Radeon RX Vega 64
+                                                                              - Radeon RX Vega 64
+                                                                                Liquid
+                                                                              - Radeon Instinct MI25
+     ``gfx902``                  ``amdgcn``   APU   - xnack                   - Ryzen 3 2200G
+                                                      [on]                    - Ryzen 5 2400G
+     ``gfx904``                  ``amdgcn``   dGPU  - xnack                   *TBA*
+                                                      [off]
+                                                                              .. TODO
+                                                                                 Add product
+                                                                                 names.
+     ``gfx906``                  ``amdgcn``   dGPU  - xnack                   - Radeon Instinct MI50
+                                                      [off]                   - Radeon Instinct MI60
+     ``gfx908``                  ``amdgcn``   dGPU  - xnack                   *TBA*
+                                                      [off]
+                                                      sram-ecc
+                                                      [on]
+     ``gfx909``                  ``amdgcn``   APU   - xnack                   *TBA* (Raven Ridge 2)
+                                                      [on]
+                                                                              .. TODO
+                                                                                 Add product
+                                                                                 names.
+     **GCN GFX10** [AMD-GCN-GFX10]_
+     -----------------------------------------------------------------------------------------------
+     ``gfx1010``                 ``amdgcn``   dGPU  - xnack                   *TBA*
+                                                      [off]
+                                                    - wavefrontsize64
+                                                      [off]
+                                                    - cumode
+                                                      [off]
+                                                                              .. TODO
+                                                                                 Add product
+                                                                                 names.
+     ``gfx1011``                 ``amdgcn``   dGPU  - xnack                   *TBA*
+                                                      [off]
+                                                    - wavefrontsize64
+                                                      [off]
+                                                    - cumode
+                                                      [off]
+                                                                              .. TODO
+                                                                                 Add product
+                                                                                 names.
+     ``gfx1012``                 ``amdgcn``   dGPU  - xnack                   *TBA*
+                                                      [off]
+                                                    - wavefrontsize64
+                                                      [off]
+                                                    - cumode
+                                                      [off]
+                                                                              .. TODO
+                                                                                 Add product
+                                                                                 names.
+     =========== =============== ============ ===== ================= ======= ======================
+
+.. _amdgpu-target-features:
+
+Target Features
+---------------
+
+Target features control how code is generated to support certain
+processor specific features. Not all target features are supported by
+all processors. The runtime must ensure that the features supported by
+the device used to execute the code match the features enabled when
+generating the code. A mismatch of features may result in incorrect
+execution, or a reduction in performance.
+
+The target features supported by each processor, and the default value
+used if not specified explicitly, is listed in
+:ref:`amdgpu-processor-table`.
+
+Use the ``clang -m[no-]<TargetFeature>`` option to specify the AMD GPU
+target features.
+
+For example:
+
+``-mxnack``
+  Enable the ``xnack`` feature.
+``-mno-xnack``
+  Disable the ``xnack`` feature.
+
+  .. table:: AMDGPU Target Features
+     :name: amdgpu-target-feature-table
+
+     ====================== ==================================================
+     Target Feature         Description
+     ====================== ==================================================
+     -m[no-]xnack           Enable/disable generating code that has
+                            memory clauses that are compatible with
+                            having XNACK replay enabled.
+
+                            This is used for demand paging and page
+                            migration. If XNACK replay is enabled in
+                            the device, then if a page fault occurs
+                            the code may execute incorrectly if the
+                            ``xnack`` feature is not enabled. Executing
+                            code that has the feature enabled on a
+                            device that does not have XNACK replay
+                            enabled will execute correctly, but may
+                            be less performant than code with the
+                            feature disabled.
+
+     -m[no-]sram-ecc        Enable/disable generating code that assumes SRAM
+                            ECC is enabled/disabled.
+
+     -m[no-]wavefrontsize64 Control the default wavefront size used when
+                            generating code for kernels. When disabled
+                            native wavefront size 32 is used, when enabled
+                            wavefront size 64 is used.
+
+     -m[no-]cumode          Control the default wavefront execution mode used
+                            when generating code for kernels. When disabled
+                            native WGP wavefront execution mode is used,
+                            when enabled CU wavefront execution mode is used
+                            (see :ref:`amdgpu-amdhsa-memory-model`).
+     ====================== ==================================================
+
+.. _amdgpu-address-spaces:
+
+Address Spaces
+--------------
+
+The AMDGPU backend uses the following address space mappings.
+
+The memory space names used in the table, aside from the region memory space, is
+from the OpenCL standard.
+
+LLVM Address Space number is used throughout LLVM (for example, in LLVM IR).
+
+  .. table:: Address Space Mapping
+     :name: amdgpu-address-space-mapping-table
+
+     ================== =================================
+     LLVM Address Space Memory Space
+     ================== =================================
+     0                  Generic (Flat)
+     1                  Global
+     2                  Region (GDS)
+     3                  Local (group/LDS)
+     4                  Constant
+     5                  Private (Scratch)
+     6                  Constant 32-bit
+     7                  Buffer Fat Pointer (experimental)
+     ================== =================================
+
+The buffer fat pointer is an experimental address space that is currently
+unsupported in the backend. It exposes a non-integral pointer that is in future
+intended to support the modelling of 128-bit buffer descriptors + a 32-bit
+offset into the buffer descriptor (in total encapsulating a 160-bit 'pointer'),
+allowing us to use normal LLVM load/store/atomic operations to model the buffer
+descriptors used heavily in graphics workloads targeting the backend.
+
+.. _amdgpu-memory-scopes:
+
+Memory Scopes
+-------------
+
+This section provides LLVM memory synchronization scopes supported by the AMDGPU
+backend memory model when the target triple OS is ``amdhsa`` (see
+:ref:`amdgpu-amdhsa-memory-model` and :ref:`amdgpu-target-triples`).
+
+The memory model supported is based on the HSA memory model [HSA]_ which is
+based in turn on HRF-indirect with scope inclusion [HRF]_. The happens-before
+relation is transitive over the synchonizes-with relation independent of scope,
+and synchonizes-with allows the memory scope instances to be inclusive (see
+table :ref:`amdgpu-amdhsa-llvm-sync-scopes-table`).
+
+This is different to the OpenCL [OpenCL]_ memory model which does not have scope
+inclusion and requires the memory scopes to exactly match. However, this
+is conservatively correct for OpenCL.
+
+  .. table:: AMDHSA LLVM Sync Scopes
+     :name: amdgpu-amdhsa-llvm-sync-scopes-table
+
+     ======================= ===================================================
+     LLVM Sync Scope         Description
+     ======================= ===================================================
+     *none*                  The default: ``system``.
+
+                             Synchronizes with, and participates in modification
+                             and seq_cst total orderings with, other operations
+                             (except image operations) for all address spaces
+                             (except private, or generic that accesses private)
+                             provided the other operation's sync scope is:
+
+                             - ``system``.
+                             - ``agent`` and executed by a thread on the same
+                               agent.
+                             - ``workgroup`` and executed by a thread in the
+                               same workgroup.
+                             - ``wavefront`` and executed by a thread in the
+                               same wavefront.
+
+     ``agent``               Synchronizes with, and participates in modification
+                             and seq_cst total orderings with, other operations
+                             (except image operations) for all address spaces
+                             (except private, or generic that accesses private)
+                             provided the other operation's sync scope is:
+
+                             - ``system`` or ``agent`` and executed by a thread
+                               on the same agent.
+                             - ``workgroup`` and executed by a thread in the
+                               same workgroup.
+                             - ``wavefront`` and executed by a thread in the
+                               same wavefront.
+
+     ``workgroup``           Synchronizes with, and participates in modification
+                             and seq_cst total orderings with, other operations
+                             (except image operations) for all address spaces
+                             (except private, or generic that accesses private)
+                             provided the other operation's sync scope is:
+
+                             - ``system``, ``agent`` or ``workgroup`` and
+                               executed by a thread in the same workgroup.
+                             - ``wavefront`` and executed by a thread in the
+                               same wavefront.
+
+     ``wavefront``           Synchronizes with, and participates in modification
+                             and seq_cst total orderings with, other operations
+                             (except image operations) for all address spaces
+                             (except private, or generic that accesses private)
+                             provided the other operation's sync scope is:
+
+                             - ``system``, ``agent``, ``workgroup`` or
+                               ``wavefront`` and executed by a thread in the
+                               same wavefront.
+
+     ``singlethread``        Only synchronizes with, and participates in
+                             modification and seq_cst total orderings with,
+                             other operations (except image operations) running
+                             in the same thread for all address spaces (for
+                             example, in signal handlers).
+
+     ``one-as``              Same as ``system`` but only synchronizes with other
+                             operations within the same address space.
+
+     ``agent-one-as``        Same as ``agent`` but only synchronizes with other
+                             operations within the same address space.
+
+     ``workgroup-one-as``    Same as ``workgroup`` but only synchronizes with
+                             other operations within the same address space.
+
+     ``wavefront-one-as``    Same as ``wavefront`` but only synchronizes with
+                             other operations within the same address space.
+
+     ``singlethread-one-as`` Same as ``singlethread`` but only synchronizes with
+                             other operations within the same address space.
+     ======================= ===================================================
+
+AMDGPU Intrinsics
+-----------------
+
+The AMDGPU backend implements the following LLVM IR intrinsics.
+
+*This section is WIP.*
+
+.. TODO
+   List AMDGPU intrinsics
+
+AMDGPU Attributes
+-----------------
+
+The AMDGPU backend supports the following LLVM IR attributes.
+
+  .. table:: AMDGPU LLVM IR Attributes
+     :name: amdgpu-llvm-ir-attributes-table
+
+     ======================================= ==========================================================
+     LLVM Attribute                          Description
+     ======================================= ==========================================================
+     "amdgpu-flat-work-group-size"="min,max" Specify the minimum and maximum flat work group sizes that
+                                             will be specified when the kernel is dispatched. Generated
+                                             by the ``amdgpu_flat_work_group_size`` CLANG attribute [CLANG-ATTR]_.
+     "amdgpu-implicitarg-num-bytes"="n"      Number of kernel argument bytes to add to the kernel
+                                             argument block size for the implicit arguments. This
+                                             varies by OS and language (for OpenCL see
+                                             :ref:`opencl-kernel-implicit-arguments-appended-for-amdhsa-os-table`).
+     "amdgpu-num-sgpr"="n"                   Specifies the number of SGPRs to use. Generated by
+                                             the ``amdgpu_num_sgpr`` CLANG attribute [CLANG-ATTR]_.
+     "amdgpu-num-vgpr"="n"                   Specifies the number of VGPRs to use. Generated by the
+                                             ``amdgpu_num_vgpr`` CLANG attribute [CLANG-ATTR]_.
+     "amdgpu-waves-per-eu"="m,n"             Specify the minimum and maximum number of waves per
+                                             execution unit. Generated by the ``amdgpu_waves_per_eu``
+                                             CLANG attribute [CLANG-ATTR]_.
+     "amdgpu-ieee" true/false.               Specify whether the function expects the IEEE field of the
+                                             mode register to be set on entry. Overrides the default for
+                                             the calling convention.
+     "amdgpu-dx10-clamp" true/false.         Specify whether the function expects the DX10_CLAMP field of
+                                             the mode register to be set on entry. Overrides the default
+                                             for the calling convention.
+     ======================================= ==========================================================
+
+Code Object
+===========
+
+The AMDGPU backend generates a standard ELF [ELF]_ relocatable code object that
+can be linked by ``lld`` to produce a standard ELF shared code object which can
+be loaded and executed on an AMDGPU target.
+
+Header
+------
+
+The AMDGPU backend uses the following ELF header:
+
+  .. table:: AMDGPU ELF Header
+     :name: amdgpu-elf-header-table
+
+     ========================== ===============================
+     Field                      Value
+     ========================== ===============================
+     ``e_ident[EI_CLASS]``      ``ELFCLASS64``
+     ``e_ident[EI_DATA]``       ``ELFDATA2LSB``
+     ``e_ident[EI_OSABI]``      - ``ELFOSABI_NONE``
+                                - ``ELFOSABI_AMDGPU_HSA``
+                                - ``ELFOSABI_AMDGPU_PAL``
+                                - ``ELFOSABI_AMDGPU_MESA3D``
+     ``e_ident[EI_ABIVERSION]`` - ``ELFABIVERSION_AMDGPU_HSA``
+                                - ``ELFABIVERSION_AMDGPU_PAL``
+                                - ``ELFABIVERSION_AMDGPU_MESA3D``
+     ``e_type``                 - ``ET_REL``
+                                - ``ET_DYN``
+     ``e_machine``              ``EM_AMDGPU``
+     ``e_entry``                0
+     ``e_flags``                See :ref:`amdgpu-elf-header-e_flags-table`
+     ========================== ===============================
+
+..
+
+  .. table:: AMDGPU ELF Header Enumeration Values
+     :name: amdgpu-elf-header-enumeration-values-table
+
+     =============================== =====
+     Name                            Value
+     =============================== =====
+     ``EM_AMDGPU``                   224
+     ``ELFOSABI_NONE``               0
+     ``ELFOSABI_AMDGPU_HSA``         64
+     ``ELFOSABI_AMDGPU_PAL``         65
+     ``ELFOSABI_AMDGPU_MESA3D``      66
+     ``ELFABIVERSION_AMDGPU_HSA``    1
+     ``ELFABIVERSION_AMDGPU_PAL``    0
+     ``ELFABIVERSION_AMDGPU_MESA3D`` 0
+     =============================== =====
+
+``e_ident[EI_CLASS]``
+  The ELF class is:
+
+  * ``ELFCLASS32`` for ``r600`` architecture.
+
+  * ``ELFCLASS64`` for ``amdgcn`` architecture which only supports 64
+    bit applications.
+
+``e_ident[EI_DATA]``
+  All AMDGPU targets use ``ELFDATA2LSB`` for little-endian byte ordering.
+
+``e_ident[EI_OSABI]``
+  One of the following AMD GPU architecture specific OS ABIs
+  (see :ref:`amdgpu-os-table`):
+
+  * ``ELFOSABI_NONE`` for *unknown* OS.
+
+  * ``ELFOSABI_AMDGPU_HSA`` for ``amdhsa`` OS.
+
+  * ``ELFOSABI_AMDGPU_PAL`` for ``amdpal`` OS.
+
+  * ``ELFOSABI_AMDGPU_MESA3D`` for ``mesa3D`` OS.
+
+``e_ident[EI_ABIVERSION]``
+  The ABI version of the AMD GPU architecture specific OS ABI to which the code
+  object conforms:
+
+  * ``ELFABIVERSION_AMDGPU_HSA`` is used to specify the version of AMD HSA
+    runtime ABI.
+
+  * ``ELFABIVERSION_AMDGPU_PAL`` is used to specify the version of AMD PAL
+    runtime ABI.
+
+  * ``ELFABIVERSION_AMDGPU_MESA3D`` is used to specify the version of AMD MESA
+    3D runtime ABI.
+
+``e_type``
+  Can be one of the following values:
+
+
+  ``ET_REL``
+    The type produced by the AMD GPU backend compiler as it is relocatable code
+    object.
+
+  ``ET_DYN``
+    The type produced by the linker as it is a shared code object.
+
+  The AMD HSA runtime loader requires a ``ET_DYN`` code object.
+
+``e_machine``
+  The value ``EM_AMDGPU`` is used for the machine for all processors supported
+  by the ``r600`` and ``amdgcn`` architectures (see
+  :ref:`amdgpu-processor-table`). The specific processor is specified in the
+  ``EF_AMDGPU_MACH`` bit field of the ``e_flags`` (see
+  :ref:`amdgpu-elf-header-e_flags-table`).
+
+``e_entry``
+  The entry point is 0 as the entry points for individual kernels must be
+  selected in order to invoke them through AQL packets.
+
+``e_flags``
+  The AMDGPU backend uses the following ELF header flags:
+
+  .. table:: AMDGPU ELF Header ``e_flags``
+     :name: amdgpu-elf-header-e_flags-table
+
+     ================================= ========== =============================
+     Name                              Value      Description
+     ================================= ========== =============================
+     **AMDGPU Processor Flag**                    See :ref:`amdgpu-processor-table`.
+     -------------------------------------------- -----------------------------
+     ``EF_AMDGPU_MACH``                0x000000ff AMDGPU processor selection
+                                                  mask for
+                                                  ``EF_AMDGPU_MACH_xxx`` values
+                                                  defined in
+                                                  :ref:`amdgpu-ef-amdgpu-mach-table`.
+     ``EF_AMDGPU_XNACK``               0x00000100 Indicates if the ``xnack``
+                                                  target feature is
+                                                  enabled for all code
+                                                  contained in the code object.
+                                                  If the processor
+                                                  does not support the
+                                                  ``xnack`` target
+                                                  feature then must
+                                                  be 0.
+                                                  See
+                                                  :ref:`amdgpu-target-features`.
+     ``EF_AMDGPU_SRAM_ECC``            0x00000200 Indicates if the ``sram-ecc``
+                                                  target feature is
+                                                  enabled for all code
+                                                  contained in the code object.
+                                                  If the processor
+                                                  does not support the
+                                                  ``sram-ecc`` target
+                                                  feature then must
+                                                  be 0.
+                                                  See
+                                                  :ref:`amdgpu-target-features`.
+     ================================= ========== =============================
+
+  .. table:: AMDGPU ``EF_AMDGPU_MACH`` Values
+     :name: amdgpu-ef-amdgpu-mach-table
+
+     ================================= ========== =============================
+     Name                              Value      Description (see
+                                                  :ref:`amdgpu-processor-table`)
+     ================================= ========== =============================
+     ``EF_AMDGPU_MACH_NONE``           0x000      *not specified*
+     ``EF_AMDGPU_MACH_R600_R600``      0x001      ``r600``
+     ``EF_AMDGPU_MACH_R600_R630``      0x002      ``r630``
+     ``EF_AMDGPU_MACH_R600_RS880``     0x003      ``rs880``
+     ``EF_AMDGPU_MACH_R600_RV670``     0x004      ``rv670``
+     ``EF_AMDGPU_MACH_R600_RV710``     0x005      ``rv710``
+     ``EF_AMDGPU_MACH_R600_RV730``     0x006      ``rv730``
+     ``EF_AMDGPU_MACH_R600_RV770``     0x007      ``rv770``
+     ``EF_AMDGPU_MACH_R600_CEDAR``     0x008      ``cedar``
+     ``EF_AMDGPU_MACH_R600_CYPRESS``   0x009      ``cypress``
+     ``EF_AMDGPU_MACH_R600_JUNIPER``   0x00a      ``juniper``
+     ``EF_AMDGPU_MACH_R600_REDWOOD``   0x00b      ``redwood``
+     ``EF_AMDGPU_MACH_R600_SUMO``      0x00c      ``sumo``
+     ``EF_AMDGPU_MACH_R600_BARTS``     0x00d      ``barts``
+     ``EF_AMDGPU_MACH_R600_CAICOS``    0x00e      ``caicos``
+     ``EF_AMDGPU_MACH_R600_CAYMAN``    0x00f      ``cayman``
+     ``EF_AMDGPU_MACH_R600_TURKS``     0x010      ``turks``
+     *reserved*                        0x011 -    Reserved for ``r600``
+                                       0x01f      architecture processors.
+     ``EF_AMDGPU_MACH_AMDGCN_GFX600``  0x020      ``gfx600``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX601``  0x021      ``gfx601``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX700``  0x022      ``gfx700``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX701``  0x023      ``gfx701``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX702``  0x024      ``gfx702``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX703``  0x025      ``gfx703``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX704``  0x026      ``gfx704``
+     *reserved*                        0x027      Reserved.
+     ``EF_AMDGPU_MACH_AMDGCN_GFX801``  0x028      ``gfx801``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX802``  0x029      ``gfx802``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX803``  0x02a      ``gfx803``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX810``  0x02b      ``gfx810``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX900``  0x02c      ``gfx900``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX902``  0x02d      ``gfx902``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX904``  0x02e      ``gfx904``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX906``  0x02f      ``gfx906``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX908``  0x030      ``gfx908``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX909``  0x031      ``gfx909``
+     *reserved*                        0x032      Reserved.
+     ``EF_AMDGPU_MACH_AMDGCN_GFX1010`` 0x033      ``gfx1010``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX1011`` 0x034      ``gfx1011``
+     ``EF_AMDGPU_MACH_AMDGCN_GFX1012`` 0x035      ``gfx1012``
+     ================================= ========== =============================
+
+Sections
+--------
+
+An AMDGPU target ELF code object has the standard ELF sections which include:
+
+  .. table:: AMDGPU ELF Sections
+     :name: amdgpu-elf-sections-table
+
+     ================== ================ =================================
+     Name               Type             Attributes
+     ================== ================ =================================
+     ``.bss``           ``SHT_NOBITS``   ``SHF_ALLOC`` + ``SHF_WRITE``
+     ``.data``          ``SHT_PROGBITS`` ``SHF_ALLOC`` + ``SHF_WRITE``
+     ``.debug_``\ *\**  ``SHT_PROGBITS`` *none*
+     ``.dynamic``       ``SHT_DYNAMIC``  ``SHF_ALLOC``
+     ``.dynstr``        ``SHT_PROGBITS`` ``SHF_ALLOC``
+     ``.dynsym``        ``SHT_PROGBITS`` ``SHF_ALLOC``
+     ``.got``           ``SHT_PROGBITS`` ``SHF_ALLOC`` + ``SHF_WRITE``
+     ``.hash``          ``SHT_HASH``     ``SHF_ALLOC``
+     ``.note``          ``SHT_NOTE``     *none*
+     ``.rela``\ *name*  ``SHT_RELA``     *none*
+     ``.rela.dyn``      ``SHT_RELA``     *none*
+     ``.rodata``        ``SHT_PROGBITS`` ``SHF_ALLOC``
+     ``.shstrtab``      ``SHT_STRTAB``   *none*
+     ``.strtab``        ``SHT_STRTAB``   *none*
+     ``.symtab``        ``SHT_SYMTAB``   *none*
+     ``.text``          ``SHT_PROGBITS`` ``SHF_ALLOC`` + ``SHF_EXECINSTR``
+     ================== ================ =================================
+
+These sections have their standard meanings (see [ELF]_) and are only generated
+if needed.
+
+``.debug``\ *\**
+  The standard DWARF sections. See :ref:`amdgpu-dwarf` for information on the
+  DWARF produced by the AMDGPU backend.
+
+``.dynamic``, ``.dynstr``, ``.dynsym``, ``.hash``
+  The standard sections used by a dynamic loader.
+
+``.note``
+  See :ref:`amdgpu-note-records` for the note records supported by the AMDGPU
+  backend.
+
+``.rela``\ *name*, ``.rela.dyn``
+  For relocatable code objects, *name* is the name of the section that the
+  relocation records apply. For example, ``.rela.text`` is the section name for
+  relocation records associated with the ``.text`` section.
+
+  For linked shared code objects, ``.rela.dyn`` contains all the relocation
+  records from each of the relocatable code object's ``.rela``\ *name* sections.
+
+  See :ref:`amdgpu-relocation-records` for the relocation records supported by
+  the AMDGPU backend.
+
+``.text``
+  The executable machine code for the kernels and functions they call. Generated
+  as position independent code. See :ref:`amdgpu-code-conventions` for
+  information on conventions used in the isa generation.
+
+.. _amdgpu-note-records:
+
+Note Records
+------------
+
+The AMDGPU backend code object contains ELF note records in the ``.note``
+section. The set of generated notes and their semantics depend on the code
+object version; see :ref:`amdgpu-note-records-v2` and
+:ref:`amdgpu-note-records-v3`.
+
+As required by ``ELFCLASS32`` and ``ELFCLASS64``, minimal zero byte padding
+must be generated after the ``name`` field to ensure the ``desc`` field is 4
+byte aligned. In addition, minimal zero byte padding must be generated to
+ensure the ``desc`` field size is a multiple of 4 bytes. The ``sh_addralign``
+field of the ``.note`` section must be at least 4 to indicate at least 8 byte
+alignment.
+
+.. _amdgpu-note-records-v2:
+
+Code Object V2 Note Records (-mattr=-code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. warning:: Code Object V2 is not the default code object version emitted by
+  this version of LLVM. For a description of the notes generated with the
+  default configuration (Code Object V3) see :ref:`amdgpu-note-records-v3`.
+
+The AMDGPU backend code object uses the following ELF note record in the
+``.note`` section when compiling for Code Object V2 (-mattr=-code-object-v3).
+
+Additional note records may be present, but any which are not documented here
+are deprecated and should not be used.
+
+  .. table:: AMDGPU Code Object V2 ELF Note Records
+     :name: amdgpu-elf-note-records-table-v2
+
+     ===== ============================== ======================================
+     Name  Type                           Description
+     ===== ============================== ======================================
+     "AMD" ``NT_AMD_AMDGPU_HSA_METADATA`` <metadata null terminated string>
+     ===== ============================== ======================================
+
+..
+
+  .. table:: AMDGPU Code Object V2 ELF Note Record Enumeration Values
+     :name: amdgpu-elf-note-record-enumeration-values-table-v2
+
+     ============================== =====
+     Name                           Value
+     ============================== =====
+     *reserved*                       0-9
+     ``NT_AMD_AMDGPU_HSA_METADATA``    10
+     *reserved*                        11
+     ============================== =====
+
+``NT_AMD_AMDGPU_HSA_METADATA``
+  Specifies extensible metadata associated with the code objects executed on HSA
+  [HSA]_ compatible runtimes such as AMD's ROCm [AMD-ROCm]_. It is required when
+  the target triple OS is ``amdhsa`` (see :ref:`amdgpu-target-triples`). See
+  :ref:`amdgpu-amdhsa-code-object-metadata-v2` for the syntax of the code
+  object metadata string.
+
+.. _amdgpu-note-records-v3:
+
+Code Object V3 Note Records (-mattr=+code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The AMDGPU backend code object uses the following ELF note record in the
+``.note`` section when compiling for Code Object V3 (-mattr=+code-object-v3).
+
+Additional note records may be present, but any which are not documented here
+are deprecated and should not be used.
+
+  .. table:: AMDGPU Code Object V3 ELF Note Records
+     :name: amdgpu-elf-note-records-table-v3
+
+     ======== ============================== ======================================
+     Name     Type                           Description
+     ======== ============================== ======================================
+     "AMDGPU" ``NT_AMDGPU_METADATA``         Metadata in Message Pack [MsgPack]_
+                                             binary format.
+     ======== ============================== ======================================
+
+..
+
+  .. table:: AMDGPU Code Object V3 ELF Note Record Enumeration Values
+     :name: amdgpu-elf-note-record-enumeration-values-table-v3
+
+     ============================== =====
+     Name                           Value
+     ============================== =====
+     *reserved*                     0-31
+     ``NT_AMDGPU_METADATA``         32
+     ============================== =====
+
+``NT_AMDGPU_METADATA``
+  Specifies extensible metadata associated with an AMDGPU code
+  object. It is encoded as a map in the Message Pack [MsgPack]_ binary
+  data format. See :ref:`amdgpu-amdhsa-code-object-metadata-v3` for the
+  map keys defined for the ``amdhsa`` OS.
+
+.. _amdgpu-symbols:
+
+Symbols
+-------
+
+Symbols include the following:
+
+  .. table:: AMDGPU ELF Symbols
+     :name: amdgpu-elf-symbols-table
+
+     ===================== ================== ================ ==================
+     Name                  Type               Section          Description
+     ===================== ================== ================ ==================
+     *link-name*           ``STT_OBJECT``     - ``.data``      Global variable
+                                              - ``.rodata``
+					      - ``.bss``
+     *link-name*\ ``.kd``  ``STT_OBJECT``     - ``.rodata``    Kernel descriptor
+     *link-name*           ``STT_FUNC``       - ``.text``      Kernel entry point
+     *link-name*           ``STT_OBJECT``     - SHN_AMDGPU_LDS Global variable in LDS
+     ===================== ================== ================ ==================
+
+Global variable
+  Global variables both used and defined by the compilation unit.
+
+  If the symbol is defined in the compilation unit then it is allocated in the
+  appropriate section according to if it has initialized data or is readonly.
+
+  If the symbol is external then its section is ``STN_UNDEF`` and the loader
+  will resolve relocations using the definition provided by another code object
+  or explicitly defined by the runtime.
+
+  If the symbol resides in local/group memory (LDS) then its section is the
+  special processor-specific section name ``SHN_AMDGPU_LDS``, and the
+  ``st_value`` field describes alignment requirements as it does for common
+  symbols.
+
+  .. TODO
+     Add description of linked shared object symbols. Seems undefined symbols
+     are marked as STT_NOTYPE.
+
+Kernel descriptor
+  Every HSA kernel has an associated kernel descriptor. It is the address of the
+  kernel descriptor that is used in the AQL dispatch packet used to invoke the
+  kernel, not the kernel entry point. The layout of the HSA kernel descriptor is
+  defined in :ref:`amdgpu-amdhsa-kernel-descriptor`.
+
+Kernel entry point
+  Every HSA kernel also has a symbol for its machine code entry point.
+
+.. _amdgpu-relocation-records:
+
+Relocation Records
+------------------
+
+AMDGPU backend generates ``Elf64_Rela`` relocation records. Supported
+relocatable fields are:
+
+``word32``
+  This specifies a 32-bit field occupying 4 bytes with arbitrary byte
+  alignment. These values use the same byte order as other word values in the
+  AMD GPU architecture.
+
+``word64``
+  This specifies a 64-bit field occupying 8 bytes with arbitrary byte
+  alignment. These values use the same byte order as other word values in the
+  AMD GPU architecture.
+
+Following notations are used for specifying relocation calculations:
+
+**A**
+  Represents the addend used to compute the value of the relocatable field.
+
+**G**
+  Represents the offset into the global offset table at which the relocation
+  entry's symbol will reside during execution.
+
+**GOT**
+  Represents the address of the global offset table.
+
+**P**
+  Represents the place (section offset for ``et_rel`` or address for ``et_dyn``)
+  of the storage unit being relocated (computed using ``r_offset``).
+
+**S**
+  Represents the value of the symbol whose index resides in the relocation
+  entry. Relocations not using this must specify a symbol index of ``STN_UNDEF``.
+
+**B**
+  Represents the base address of a loaded executable or shared object which is
+  the difference between the ELF address and the actual load address. Relocations
+  using this are only valid in executable or shared objects.
+
+The following relocation types are supported:
+
+  .. table:: AMDGPU ELF Relocation Records
+     :name: amdgpu-elf-relocation-records-table
+
+     ========================== ======= =====  ==========  ==============================
+     Relocation Type            Kind    Value  Field       Calculation
+     ========================== ======= =====  ==========  ==============================
+     ``R_AMDGPU_NONE``                  0      *none*      *none*
+     ``R_AMDGPU_ABS32_LO``      Static, 1      ``word32``  (S + A) & 0xFFFFFFFF
+                                Dynamic
+     ``R_AMDGPU_ABS32_HI``      Static, 2      ``word32``  (S + A) >> 32
+                                Dynamic
+     ``R_AMDGPU_ABS64``         Static, 3      ``word64``  S + A
+                                Dynamic
+     ``R_AMDGPU_REL32``         Static  4      ``word32``  S + A - P
+     ``R_AMDGPU_REL64``         Static  5      ``word64``  S + A - P
+     ``R_AMDGPU_ABS32``         Static, 6      ``word32``  S + A
+                                Dynamic
+     ``R_AMDGPU_GOTPCREL``      Static  7      ``word32``  G + GOT + A - P
+     ``R_AMDGPU_GOTPCREL32_LO`` Static  8      ``word32``  (G + GOT + A - P) & 0xFFFFFFFF
+     ``R_AMDGPU_GOTPCREL32_HI`` Static  9      ``word32``  (G + GOT + A - P) >> 32
+     ``R_AMDGPU_REL32_LO``      Static  10     ``word32``  (S + A - P) & 0xFFFFFFFF
+     ``R_AMDGPU_REL32_HI``      Static  11     ``word32``  (S + A - P) >> 32
+     *reserved*                         12
+     ``R_AMDGPU_RELATIVE64``    Dynamic 13     ``word64``  B + A
+     ========================== ======= =====  ==========  ==============================
+
+``R_AMDGPU_ABS32_LO`` and ``R_AMDGPU_ABS32_HI`` are only supported by
+the ``mesa3d`` OS, which does not support ``R_AMDGPU_ABS64``.
+
+There is no current OS loader support for 32 bit programs and so
+``R_AMDGPU_ABS32`` is not used.
+
+.. _amdgpu-dwarf:
+
+DWARF
+-----
+
+Standard DWARF [DWARF]_ Version 5 sections can be generated. These contain
+information that maps the code object executable code and data to the source
+language constructs. It can be used by tools such as debuggers and profilers.
+
+Address Space Mapping
+~~~~~~~~~~~~~~~~~~~~~
+
+The following address space mapping is used:
+
+  .. table:: AMDGPU DWARF Address Space Mapping
+     :name: amdgpu-dwarf-address-space-mapping-table
+
+     =================== =================
+     DWARF Address Space Memory Space
+     =================== =================
+     1                   Private (Scratch)
+     2                   Local (group/LDS)
+     *omitted*           Global
+     *omitted*           Constant
+     *omitted*           Generic (Flat)
+     *not supported*     Region (GDS)
+     =================== =================
+
+See :ref:`amdgpu-address-spaces` for information on the memory space terminology
+used in the table.
+
+An ``address_class`` attribute is generated on pointer type DIEs to specify the
+DWARF address space of the value of the pointer when it is in the *private* or
+*local* address space. Otherwise the attribute is omitted.
+
+An ``XDEREF`` operation is generated in location list expressions for variables
+that are allocated in the *private* and *local* address space. Otherwise no
+``XDREF`` is omitted.
+
+Register Mapping
+~~~~~~~~~~~~~~~~
+
+*This section is WIP.*
+
+.. TODO
+   Define DWARF register enumeration.
+
+   If want to present a wavefront state then should expose vector registers as
+   64 wide (rather than per work-item view that LLVM uses). Either as separate
+   registers, or a 64x4 byte single register. In either case use a new LANE op
+   (akin to XDREF) to select the current lane usage in a location
+   expression. This would also allow scalar register spilling to vector register
+   lanes to be expressed (currently no debug information is being generated for
+   spilling). If choose a wide single register approach then use LANE in
+   conjunction with PIECE operation to select the dword part of the register for
+   the current lane. If the separate register approach then use LANE to select
+   the register.
+
+Source Text
+~~~~~~~~~~~
+
+Source text for online-compiled programs (e.g. those compiled by the OpenCL
+runtime) may be embedded into the DWARF v5 line table using the ``clang
+-gembed-source`` option, described in table :ref:`amdgpu-debug-options`.
+
+For example:
+
+``-gembed-source``
+  Enable the embedded source DWARF v5 extension.
+``-gno-embed-source``
+  Disable the embedded source DWARF v5 extension.
+
+  .. table:: AMDGPU Debug Options
+     :name: amdgpu-debug-options
+
+     ==================== ==================================================
+     Debug Flag           Description
+     ==================== ==================================================
+     -g[no-]embed-source  Enable/disable embedding source text in DWARF
+                          debug sections. Useful for environments where
+                          source cannot be written to disk, such as
+                          when performing online compilation.
+     ==================== ==================================================
+
+This option enables one extended content types in the DWARF v5 Line Number
+Program Header, which is used to encode embedded source.
+
+  .. table:: AMDGPU DWARF Line Number Program Header Extended Content Types
+     :name: amdgpu-dwarf-extended-content-types
+
+     ============================  ======================
+     Content Type                  Form
+     ============================  ======================
+     ``DW_LNCT_LLVM_source``       ``DW_FORM_line_strp``
+     ============================  ======================
+
+The source field will contain the UTF-8 encoded, null-terminated source text
+with ``'\n'`` line endings. When the source field is present, consumers can use
+the embedded source instead of attempting to discover the source on disk. When
+the source field is absent, consumers can access the file to get the source
+text.
+
+The above content type appears in the ``file_name_entry_format`` field of the
+line table prologue, and its corresponding value appear in the ``file_names``
+field. The current encoding of the content type is documented in table
+:ref:`amdgpu-dwarf-extended-content-types-encoding`
+
+  .. table:: AMDGPU DWARF Line Number Program Header Extended Content Types Encoding
+     :name: amdgpu-dwarf-extended-content-types-encoding
+
+     ============================  ====================
+     Content Type                  Value
+     ============================  ====================
+     ``DW_LNCT_LLVM_source``       0x2001
+     ============================  ====================
+
+.. _amdgpu-code-conventions:
+
+Code Conventions
+================
+
+This section provides code conventions used for each supported target triple OS
+(see :ref:`amdgpu-target-triples`).
+
+AMDHSA
+------
+
+This section provides code conventions used when the target triple OS is
+``amdhsa`` (see :ref:`amdgpu-target-triples`).
+
+.. _amdgpu-amdhsa-code-object-target-identification:
+
+Code Object Target Identification
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The AMDHSA OS uses the following syntax to specify the code object
+target as a single string:
+
+  ``<Architecture>-<Vendor>-<OS>-<Environment>-<Processor><Target Features>``
+
+Where:
+
+  - ``<Architecture>``, ``<Vendor>``, ``<OS>`` and ``<Environment>``
+    are the same as the *Target Triple* (see
+    :ref:`amdgpu-target-triples`).
+
+  - ``<Processor>`` is the same as the *Processor* (see
+    :ref:`amdgpu-processors`).
+
+  - ``<Target Features>`` is a list of the enabled *Target Features*
+    (see :ref:`amdgpu-target-features`), each prefixed by a plus, that
+    apply to *Processor*. The list must be in the same order as listed
+    in the table :ref:`amdgpu-target-feature-table`. Note that *Target
+    Features* must be included in the list if they are enabled even if
+    that is the default for *Processor*.
+
+For example:
+
+  ``"amdgcn-amd-amdhsa--gfx902+xnack"``
+
+.. _amdgpu-amdhsa-code-object-metadata:
+
+Code Object Metadata
+~~~~~~~~~~~~~~~~~~~~
+
+The code object metadata specifies extensible metadata associated with the code
+objects executed on HSA [HSA]_ compatible runtimes such as AMD's ROCm
+[AMD-ROCm]_. The encoding and semantics of this metadata depends on the code
+object version; see :ref:`amdgpu-amdhsa-code-object-metadata-v2` and
+:ref:`amdgpu-amdhsa-code-object-metadata-v3`.
+
+Code object metadata is specified in a note record (see
+:ref:`amdgpu-note-records`) and is required when the target triple OS is
+``amdhsa`` (see :ref:`amdgpu-target-triples`). It must contain the minimum
+information necessary to support the ROCM kernel queries. For example, the
+segment sizes needed in a dispatch packet. In addition, a high level language
+runtime may require other information to be included. For example, the AMD
+OpenCL runtime records kernel argument information.
+
+.. _amdgpu-amdhsa-code-object-metadata-v2:
+
+Code Object V2 Metadata (-mattr=-code-object-v3)
+++++++++++++++++++++++++++++++++++++++++++++++++
+
+.. warning:: Code Object V2 is not the default code object version emitted by
+  this version of LLVM. For a description of the metadata generated with the
+  default configuration (Code Object V3) see
+  :ref:`amdgpu-amdhsa-code-object-metadata-v3`.
+
+Code object V2 metadata is specified by the ``NT_AMD_AMDGPU_METADATA`` note
+record (see :ref:`amdgpu-note-records-v2`).
+
+The metadata is specified as a YAML formatted string (see [YAML]_ and
+:doc:`YamlIO`).
+
+.. TODO
+   Is the string null terminated? It probably should not if YAML allows it to
+   contain null characters, otherwise it should be.
+
+The metadata is represented as a single YAML document comprised of the mapping
+defined in table :ref:`amdgpu-amdhsa-code-object-metadata-map-table-v2` and
+referenced tables.
+
+For boolean values, the string values of ``false`` and ``true`` are used for
+false and true respectively.
+
+Additional information can be added to the mappings. To avoid conflicts, any
+non-AMD key names should be prefixed by "*vendor-name*.".
+
+  .. table:: AMDHSA Code Object V2 Metadata Map
+     :name: amdgpu-amdhsa-code-object-metadata-map-table-v2
+
+     ========== ============== ========= =======================================
+     String Key Value Type     Required? Description
+     ========== ============== ========= =======================================
+     "Version"  sequence of    Required  - The first integer is the major
+                2 integers                 version. Currently 1.
+                                         - The second integer is the minor
+                                           version. Currently 0.
+     "Printf"   sequence of              Each string is encoded information
+                strings                  about a printf function call. The
+                                         encoded information is organized as
+                                         fields separated by colon (':'):
+
+                                         ``ID:N:S[0]:S[1]:...:S[N-1]:FormatString``
+
+                                         where:
+
+                                         ``ID``
+                                           A 32 bit integer as a unique id for
+                                           each printf function call
+
+                                         ``N``
+                                           A 32 bit integer equal to the number
+                                           of arguments of printf function call
+                                           minus 1
+
+                                         ``S[i]`` (where i = 0, 1, ... , N-1)
+                                           32 bit integers for the size in bytes
+                                           of the i-th FormatString argument of
+                                           the printf function call
+
+                                         FormatString
+                                           The format string passed to the
+                                           printf function call.
+     "Kernels"  sequence of    Required  Sequence of the mappings for each
+                mapping                  kernel in the code object. See
+                                         :ref:`amdgpu-amdhsa-code-object-kernel-metadata-map-table-v2`
+                                         for the definition of the mapping.
+     ========== ============== ========= =======================================
+
+..
+
+  .. table:: AMDHSA Code Object V2 Kernel Metadata Map
+     :name: amdgpu-amdhsa-code-object-kernel-metadata-map-table-v2
+
+     ================= ============== ========= ================================
+     String Key        Value Type     Required? Description
+     ================= ============== ========= ================================
+     "Name"            string         Required  Source name of the kernel.
+     "SymbolName"      string         Required  Name of the kernel
+                                                descriptor ELF symbol.
+     "Language"        string                   Source language of the kernel.
+                                                Values include:
+
+                                                - "OpenCL C"
+                                                - "OpenCL C++"
+                                                - "HCC"
+                                                - "OpenMP"
+
+     "LanguageVersion" sequence of              - The first integer is the major
+                       2 integers                 version.
+                                                - The second integer is the
+                                                  minor version.
+     "Attrs"           mapping                  Mapping of kernel attributes.
+                                                See
+                                                :ref:`amdgpu-amdhsa-code-object-kernel-attribute-metadata-map-table-v2`
+                                                for the mapping definition.
+     "Args"            sequence of              Sequence of mappings of the
+                       mapping                  kernel arguments. See
+                                                :ref:`amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table-v2`
+                                                for the definition of the mapping.
+     "CodeProps"       mapping                  Mapping of properties related to
+                                                the kernel code. See
+                                                :ref:`amdgpu-amdhsa-code-object-kernel-code-properties-metadata-map-table-v2`
+                                                for the mapping definition.
+     ================= ============== ========= ================================
+
+..
+
+  .. table:: AMDHSA Code Object V2 Kernel Attribute Metadata Map
+     :name: amdgpu-amdhsa-code-object-kernel-attribute-metadata-map-table-v2
+
+     =================== ============== ========= ==============================
+     String Key          Value Type     Required? Description
+     =================== ============== ========= ==============================
+     "ReqdWorkGroupSize" sequence of              If not 0, 0, 0 then all values
+                         3 integers               must be >=1 and the dispatch
+                                                  work-group size X, Y, Z must
+                                                  correspond to the specified
+                                                  values. Defaults to 0, 0, 0.
+
+                                                  Corresponds to the OpenCL
+                                                  ``reqd_work_group_size``
+                                                  attribute.
+     "WorkGroupSizeHint" sequence of              The dispatch work-group size
+                         3 integers               X, Y, Z is likely to be the
+                                                  specified values.
+
+                                                  Corresponds to the OpenCL
+                                                  ``work_group_size_hint``
+                                                  attribute.
+     "VecTypeHint"       string                   The name of a scalar or vector
+                                                  type.
+
+                                                  Corresponds to the OpenCL
+                                                  ``vec_type_hint`` attribute.
+
+     "RuntimeHandle"     string                   The external symbol name
+                                                  associated with a kernel.
+                                                  OpenCL runtime allocates a
+                                                  global buffer for the symbol
+                                                  and saves the kernel's address
+                                                  to it, which is used for
+                                                  device side enqueueing. Only
+                                                  available for device side
+                                                  enqueued kernels.
+     =================== ============== ========= ==============================
+
+..
+
+  .. table:: AMDHSA Code Object V2 Kernel Argument Metadata Map
+     :name: amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table-v2
+
+     ================= ============== ========= ================================
+     String Key        Value Type     Required? Description
+     ================= ============== ========= ================================
+     "Name"            string                   Kernel argument name.
+     "TypeName"        string                   Kernel argument type name.
+     "Size"            integer        Required  Kernel argument size in bytes.
+     "Align"           integer        Required  Kernel argument alignment in
+                                                bytes. Must be a power of two.
+     "ValueKind"       string         Required  Kernel argument kind that
+                                                specifies how to set up the
+                                                corresponding argument.
+                                                Values include:
+
+                                                "ByValue"
+                                                  The argument is copied
+                                                  directly into the kernarg.
+
+                                                "GlobalBuffer"
+                                                  A global address space pointer
+                                                  to the buffer data is passed
+                                                  in the kernarg.
+
+                                                "DynamicSharedPointer"
+                                                  A group address space pointer
+                                                  to dynamically allocated LDS
+                                                  is passed in the kernarg.
+
+                                                "Sampler"
+                                                  A global address space
+                                                  pointer to a S# is passed in
+                                                  the kernarg.
+
+                                                "Image"
+                                                  A global address space
+                                                  pointer to a T# is passed in
+                                                  the kernarg.
+
+                                                "Pipe"
+                                                  A global address space pointer
+                                                  to an OpenCL pipe is passed in
+                                                  the kernarg.
+
+                                                "Queue"
+                                                  A global address space pointer
+                                                  to an OpenCL device enqueue
+                                                  queue is passed in the
+                                                  kernarg.
+
+                                                "HiddenGlobalOffsetX"
+                                                  The OpenCL grid dispatch
+                                                  global offset for the X
+                                                  dimension is passed in the
+                                                  kernarg.
+
+                                                "HiddenGlobalOffsetY"
+                                                  The OpenCL grid dispatch
+                                                  global offset for the Y
+                                                  dimension is passed in the
+                                                  kernarg.
+
+                                                "HiddenGlobalOffsetZ"
+                                                  The OpenCL grid dispatch
+                                                  global offset for the Z
+                                                  dimension is passed in the
+                                                  kernarg.
+
+                                                "HiddenNone"
+                                                  An argument that is not used
+                                                  by the kernel. Space needs to
+                                                  be left for it, but it does
+                                                  not need to be set up.
+
+                                                "HiddenPrintfBuffer"
+                                                  A global address space pointer
+                                                  to the runtime printf buffer
+                                                  is passed in kernarg.
+
+                                                "HiddenDefaultQueue"
+                                                  A global address space pointer
+                                                  to the OpenCL device enqueue
+                                                  queue that should be used by
+                                                  the kernel by default is
+                                                  passed in the kernarg.
+
+                                                "HiddenCompletionAction"
+                                                  A global address space pointer
+                                                  to help link enqueued kernels into
+                                                  the ancestor tree for determining
+                                                  when the parent kernel has finished.
+
+                                                "HiddenMultiGridSyncArg"
+                                                  A global address space pointer for
+                                                  multi-grid synchronization is
+                                                  passed in the kernarg.
+
+     "ValueType"       string         Required  Kernel argument value type. Only
+                                                present if "ValueKind" is
+                                                "ByValue". For vector data
+                                                types, the value is for the
+                                                element type. Values include:
+
+                                                - "Struct"
+                                                - "I8"
+                                                - "U8"
+                                                - "I16"
+                                                - "U16"
+                                                - "F16"
+                                                - "I32"
+                                                - "U32"
+                                                - "F32"
+                                                - "I64"
+                                                - "U64"
+                                                - "F64"
+
+                                                .. TODO
+                                                   How can it be determined if a
+                                                   vector type, and what size
+                                                   vector?
+     "PointeeAlign"    integer                  Alignment in bytes of pointee
+                                                type for pointer type kernel
+                                                argument. Must be a power
+                                                of 2. Only present if
+                                                "ValueKind" is
+                                                "DynamicSharedPointer".
+     "AddrSpaceQual"   string                   Kernel argument address space
+                                                qualifier. Only present if
+                                                "ValueKind" is "GlobalBuffer" or
+                                                "DynamicSharedPointer". Values
+                                                are:
+
+                                                - "Private"
+                                                - "Global"
+                                                - "Constant"
+                                                - "Local"
+                                                - "Generic"
+                                                - "Region"
+
+                                                .. TODO
+                                                   Is GlobalBuffer only Global
+                                                   or Constant? Is
+                                                   DynamicSharedPointer always
+                                                   Local? Can HCC allow Generic?
+                                                   How can Private or Region
+                                                   ever happen?
+     "AccQual"         string                   Kernel argument access
+                                                qualifier. Only present if
+                                                "ValueKind" is "Image" or
+                                                "Pipe". Values
+                                                are:
+
+                                                - "ReadOnly"
+                                                - "WriteOnly"
+                                                - "ReadWrite"
+
+                                                .. TODO
+                                                   Does this apply to
+                                                   GlobalBuffer?
+     "ActualAccQual"   string                   The actual memory accesses
+                                                performed by the kernel on the
+                                                kernel argument. Only present if
+                                                "ValueKind" is "GlobalBuffer",
+                                                "Image", or "Pipe". This may be
+                                                more restrictive than indicated
+                                                by "AccQual" to reflect what the
+                                                kernel actual does. If not
+                                                present then the runtime must
+                                                assume what is implied by
+                                                "AccQual" and "IsConst". Values
+                                                are:
+
+                                                - "ReadOnly"
+                                                - "WriteOnly"
+                                                - "ReadWrite"
+
+     "IsConst"         boolean                  Indicates if the kernel argument
+                                                is const qualified. Only present
+                                                if "ValueKind" is
+                                                "GlobalBuffer".
+
+     "IsRestrict"      boolean                  Indicates if the kernel argument
+                                                is restrict qualified. Only
+                                                present if "ValueKind" is
+                                                "GlobalBuffer".
+
+     "IsVolatile"      boolean                  Indicates if the kernel argument
+                                                is volatile qualified. Only
+                                                present if "ValueKind" is
+                                                "GlobalBuffer".
+
+     "IsPipe"          boolean                  Indicates if the kernel argument
+                                                is pipe qualified. Only present
+                                                if "ValueKind" is "Pipe".
+
+                                                .. TODO
+                                                   Can GlobalBuffer be pipe
+                                                   qualified?
+     ================= ============== ========= ================================
+
+..
+
+  .. table:: AMDHSA Code Object V2 Kernel Code Properties Metadata Map
+     :name: amdgpu-amdhsa-code-object-kernel-code-properties-metadata-map-table-v2
+
+     ============================ ============== ========= =====================
+     String Key                   Value Type     Required? Description
+     ============================ ============== ========= =====================
+     "KernargSegmentSize"         integer        Required  The size in bytes of
+                                                           the kernarg segment
+                                                           that holds the values
+                                                           of the arguments to
+                                                           the kernel.
+     "GroupSegmentFixedSize"      integer        Required  The amount of group
+                                                           segment memory
+                                                           required by a
+                                                           work-group in
+                                                           bytes. This does not
+                                                           include any
+                                                           dynamically allocated
+                                                           group segment memory
+                                                           that may be added
+                                                           when the kernel is
+                                                           dispatched.
+     "PrivateSegmentFixedSize"    integer        Required  The amount of fixed
+                                                           private address space
+                                                           memory required for a
+                                                           work-item in
+                                                           bytes. If the kernel
+                                                           uses a dynamic call
+                                                           stack then additional
+                                                           space must be added
+                                                           to this value for the
+                                                           call stack.
+     "KernargSegmentAlign"        integer        Required  The maximum byte
+                                                           alignment of
+                                                           arguments in the
+                                                           kernarg segment. Must
+                                                           be a power of 2.
+     "WavefrontSize"              integer        Required  Wavefront size. Must
+                                                           be a power of 2.
+     "NumSGPRs"                   integer        Required  Number of scalar
+                                                           registers used by a
+                                                           wavefront for
+                                                           GFX6-GFX10. This
+                                                           includes the special
+                                                           SGPRs for VCC, Flat
+                                                           Scratch (GFX7-GFX10)
+                                                           and XNACK (for
+                                                           GFX8-GFX10). It does
+                                                           not include the 16
+                                                           SGPR added if a trap
+                                                           handler is
+                                                           enabled. It is not
+                                                           rounded up to the
+                                                           allocation
+                                                           granularity.
+     "NumVGPRs"                   integer        Required  Number of vector
+                                                           registers used by
+                                                           each work-item for
+                                                           GFX6-GFX10
+     "MaxFlatWorkGroupSize"       integer        Required  Maximum flat
+                                                           work-group size
+                                                           supported by the
+                                                           kernel in work-items.
+                                                           Must be >=1 and
+                                                           consistent with
+                                                           ReqdWorkGroupSize if
+                                                           not 0, 0, 0.
+     "NumSpilledSGPRs"            integer                  Number of stores from
+                                                           a scalar register to
+                                                           a register allocator
+                                                           created spill
+                                                           location.
+     "NumSpilledVGPRs"            integer                  Number of stores from
+                                                           a vector register to
+                                                           a register allocator
+                                                           created spill
+                                                           location.
+     ============================ ============== ========= =====================
+
+.. _amdgpu-amdhsa-code-object-metadata-v3:
+
+Code Object V3 Metadata (-mattr=+code-object-v3)
+++++++++++++++++++++++++++++++++++++++++++++++++
+
+Code object V3 metadata is specified by the ``NT_AMDGPU_METADATA`` note record
+(see :ref:`amdgpu-note-records-v3`).
+
+The metadata is represented as Message Pack formatted binary data (see
+[MsgPack]_). The top level is a Message Pack map that includes the
+keys defined in table
+:ref:`amdgpu-amdhsa-code-object-metadata-map-table-v3` and referenced
+tables.
+
+Additional information can be added to the maps. To avoid conflicts,
+any key names should be prefixed by "*vendor-name*." where
+``vendor-name`` can be the the name of the vendor and specific vendor
+tool that generates the information. The prefix is abbreviated to
+simply "." when it appears within a map that has been added by the
+same *vendor-name*.
+
+  .. table:: AMDHSA Code Object V3 Metadata Map
+     :name: amdgpu-amdhsa-code-object-metadata-map-table-v3
+
+     ================= ============== ========= =======================================
+     String Key        Value Type     Required? Description
+     ================= ============== ========= =======================================
+     "amdhsa.version"  sequence of    Required  - The first integer is the major
+                       2 integers                 version. Currently 1.
+                                                - The second integer is the minor
+                                                  version. Currently 0.
+     "amdhsa.printf"   sequence of              Each string is encoded information
+                       strings                  about a printf function call. The
+                                                encoded information is organized as
+                                                fields separated by colon (':'):
+
+                                                ``ID:N:S[0]:S[1]:...:S[N-1]:FormatString``
+
+                                                where:
+
+                                                ``ID``
+                                                  A 32 bit integer as a unique id for
+                                                  each printf function call
+
+                                                ``N``
+                                                  A 32 bit integer equal to the number
+                                                  of arguments of printf function call
+                                                  minus 1
+
+                                                ``S[i]`` (where i = 0, 1, ... , N-1)
+                                                  32 bit integers for the size in bytes
+                                                  of the i-th FormatString argument of
+                                                  the printf function call
+
+                                                FormatString
+                                                  The format string passed to the
+                                                  printf function call.
+     "amdhsa.kernels"  sequence of    Required  Sequence of the maps for each
+                       map                      kernel in the code object. See
+                                                :ref:`amdgpu-amdhsa-code-object-kernel-metadata-map-table-v3`
+                                                for the definition of the keys included
+                                                in that map.
+     ================= ============== ========= =======================================
+
+..
+
+  .. table:: AMDHSA Code Object V3 Kernel Metadata Map
+     :name: amdgpu-amdhsa-code-object-kernel-metadata-map-table-v3
+
+     =================================== ============== ========= ================================
+     String Key                          Value Type     Required? Description
+     =================================== ============== ========= ================================
+     ".name"                             string         Required  Source name of the kernel.
+     ".symbol"                           string         Required  Name of the kernel
+                                                                  descriptor ELF symbol.
+     ".language"                         string                   Source language of the kernel.
+                                                                  Values include:
+
+                                                                  - "OpenCL C"
+                                                                  - "OpenCL C++"
+                                                                  - "HCC"
+                                                                  - "HIP"
+                                                                  - "OpenMP"
+                                                                  - "Assembler"
+
+     ".language_version"                 sequence of              - The first integer is the major
+                                         2 integers                 version.
+                                                                  - The second integer is the
+                                                                    minor version.
+     ".args"                             sequence of              Sequence of maps of the
+                                         map                      kernel arguments. See
+                                                                  :ref:`amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table-v3`
+                                                                  for the definition of the keys
+                                                                  included in that map.
+     ".reqd_workgroup_size"              sequence of              If not 0, 0, 0 then all values
+                                         3 integers               must be >=1 and the dispatch
+                                                                  work-group size X, Y, Z must
+                                                                  correspond to the specified
+                                                                  values. Defaults to 0, 0, 0.
+
+                                                                  Corresponds to the OpenCL
+                                                                  ``reqd_work_group_size``
+                                                                  attribute.
+     ".workgroup_size_hint"              sequence of              The dispatch work-group size
+                                         3 integers               X, Y, Z is likely to be the
+                                                                  specified values.
+
+                                                                  Corresponds to the OpenCL
+                                                                  ``work_group_size_hint``
+                                                                  attribute.
+     ".vec_type_hint"                    string                   The name of a scalar or vector
+                                                                  type.
+
+                                                                  Corresponds to the OpenCL
+                                                                  ``vec_type_hint`` attribute.
+
+     ".device_enqueue_symbol"            string                   The external symbol name
+                                                                  associated with a kernel.
+                                                                  OpenCL runtime allocates a
+                                                                  global buffer for the symbol
+                                                                  and saves the kernel's address
+                                                                  to it, which is used for
+                                                                  device side enqueueing. Only
+                                                                  available for device side
+                                                                  enqueued kernels.
+     ".kernarg_segment_size"             integer        Required  The size in bytes of
+                                                                  the kernarg segment
+                                                                  that holds the values
+                                                                  of the arguments to
+                                                                  the kernel.
+     ".group_segment_fixed_size"         integer        Required  The amount of group
+                                                                  segment memory
+                                                                  required by a
+                                                                  work-group in
+                                                                  bytes. This does not
+                                                                  include any
+                                                                  dynamically allocated
+                                                                  group segment memory
+                                                                  that may be added
+                                                                  when the kernel is
+                                                                  dispatched.
+     ".private_segment_fixed_size"       integer        Required  The amount of fixed
+                                                                  private address space
+                                                                  memory required for a
+                                                                  work-item in
+                                                                  bytes. If the kernel
+                                                                  uses a dynamic call
+                                                                  stack then additional
+                                                                  space must be added
+                                                                  to this value for the
+                                                                  call stack.
+     ".kernarg_segment_align"            integer        Required  The maximum byte
+                                                                  alignment of
+                                                                  arguments in the
+                                                                  kernarg segment. Must
+                                                                  be a power of 2.
+     ".wavefront_size"                   integer        Required  Wavefront size. Must
+                                                                  be a power of 2.
+     ".sgpr_count"                       integer        Required  Number of scalar
+                                                                  registers required by a
+                                                                  wavefront for
+                                                                  GFX6-GFX9. A register
+                                                                  is required if it is
+                                                                  used explicitly, or
+                                                                  if a higher numbered
+                                                                  register is used
+                                                                  explicitly. This
+                                                                  includes the special
+                                                                  SGPRs for VCC, Flat
+                                                                  Scratch (GFX7-GFX9)
+                                                                  and XNACK (for
+                                                                  GFX8-GFX9). It does
+                                                                  not include the 16
+                                                                  SGPR added if a trap
+                                                                  handler is
+                                                                  enabled. It is not
+                                                                  rounded up to the
+                                                                  allocation
+                                                                  granularity.
+     ".vgpr_count"                       integer        Required  Number of vector
+                                                                  registers required by
+                                                                  each work-item for
+                                                                  GFX6-GFX9. A register
+                                                                  is required if it is
+                                                                  used explicitly, or
+                                                                  if a higher numbered
+                                                                  register is used
+                                                                  explicitly.
+     ".max_flat_workgroup_size"          integer        Required  Maximum flat
+                                                                  work-group size
+                                                                  supported by the
+                                                                  kernel in work-items.
+                                                                  Must be >=1 and
+                                                                  consistent with
+                                                                  ReqdWorkGroupSize if
+                                                                  not 0, 0, 0.
+     ".sgpr_spill_count"                 integer                  Number of stores from
+                                                                  a scalar register to
+                                                                  a register allocator
+                                                                  created spill
+                                                                  location.
+     ".vgpr_spill_count"                 integer                  Number of stores from
+                                                                  a vector register to
+                                                                  a register allocator
+                                                                  created spill
+                                                                  location.
+     =================================== ============== ========= ================================
+
+..
+
+  .. table:: AMDHSA Code Object V3 Kernel Argument Metadata Map
+     :name: amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table-v3
+
+     ====================== ============== ========= ================================
+     String Key             Value Type     Required? Description
+     ====================== ============== ========= ================================
+     ".name"                string                   Kernel argument name.
+     ".type_name"           string                   Kernel argument type name.
+     ".size"                integer        Required  Kernel argument size in bytes.
+     ".offset"              integer        Required  Kernel argument offset in
+                                                     bytes. The offset must be a
+                                                     multiple of the alignment
+                                                     required by the argument.
+     ".value_kind"          string         Required  Kernel argument kind that
+                                                     specifies how to set up the
+                                                     corresponding argument.
+                                                     Values include:
+
+                                                     "by_value"
+                                                       The argument is copied
+                                                       directly into the kernarg.
+
+                                                     "global_buffer"
+                                                       A global address space pointer
+                                                       to the buffer data is passed
+                                                       in the kernarg.
+
+                                                     "dynamic_shared_pointer"
+                                                       A group address space pointer
+                                                       to dynamically allocated LDS
+                                                       is passed in the kernarg.
+
+                                                     "sampler"
+                                                       A global address space
+                                                       pointer to a S# is passed in
+                                                       the kernarg.
+
+                                                     "image"
+                                                       A global address space
+                                                       pointer to a T# is passed in
+                                                       the kernarg.
+
+                                                     "pipe"
+                                                       A global address space pointer
+                                                       to an OpenCL pipe is passed in
+                                                       the kernarg.
+
+                                                     "queue"
+                                                       A global address space pointer
+                                                       to an OpenCL device enqueue
+                                                       queue is passed in the
+                                                       kernarg.
+
+                                                     "hidden_global_offset_x"
+                                                       The OpenCL grid dispatch
+                                                       global offset for the X
+                                                       dimension is passed in the
+                                                       kernarg.
+
+                                                     "hidden_global_offset_y"
+                                                       The OpenCL grid dispatch
+                                                       global offset for the Y
+                                                       dimension is passed in the
+                                                       kernarg.
+
+                                                     "hidden_global_offset_z"
+                                                       The OpenCL grid dispatch
+                                                       global offset for the Z
+                                                       dimension is passed in the
+                                                       kernarg.
+
+                                                     "hidden_none"
+                                                       An argument that is not used
+                                                       by the kernel. Space needs to
+                                                       be left for it, but it does
+                                                       not need to be set up.
+
+                                                     "hidden_printf_buffer"
+                                                       A global address space pointer
+                                                       to the runtime printf buffer
+                                                       is passed in kernarg.
+
+                                                     "hidden_default_queue"
+                                                       A global address space pointer
+                                                       to the OpenCL device enqueue
+                                                       queue that should be used by
+                                                       the kernel by default is
+                                                       passed in the kernarg.
+
+                                                     "hidden_completion_action"
+                                                       A global address space pointer
+                                                       to help link enqueued kernels into
+                                                       the ancestor tree for determining
+                                                       when the parent kernel has finished.
+
+                                                     "hidden_multigrid_sync_arg"
+                                                       A global address space pointer for
+                                                       multi-grid synchronization is
+                                                       passed in the kernarg.
+
+     ".value_type"          string         Required  Kernel argument value type. Only
+                                                     present if ".value_kind" is
+                                                     "by_value". For vector data
+                                                     types, the value is for the
+                                                     element type. Values include:
+
+                                                     - "struct"
+                                                     - "i8"
+                                                     - "u8"
+                                                     - "i16"
+                                                     - "u16"
+                                                     - "f16"
+                                                     - "i32"
+                                                     - "u32"
+                                                     - "f32"
+                                                     - "i64"
+                                                     - "u64"
+                                                     - "f64"
+
+                                                     .. TODO
+                                                        How can it be determined if a
+                                                        vector type, and what size
+                                                        vector?
+     ".pointee_align"       integer                  Alignment in bytes of pointee
+                                                     type for pointer type kernel
+                                                     argument. Must be a power
+                                                     of 2. Only present if
+                                                     ".value_kind" is
+                                                     "dynamic_shared_pointer".
+     ".address_space"       string                   Kernel argument address space
+                                                     qualifier. Only present if
+                                                     ".value_kind" is "global_buffer" or
+                                                     "dynamic_shared_pointer". Values
+                                                     are:
+
+                                                     - "private"
+                                                     - "global"
+                                                     - "constant"
+                                                     - "local"
+                                                     - "generic"
+                                                     - "region"
+
+                                                     .. TODO
+                                                        Is "global_buffer" only "global"
+                                                        or "constant"? Is
+                                                        "dynamic_shared_pointer" always
+                                                        "local"? Can HCC allow "generic"?
+                                                        How can "private" or "region"
+                                                        ever happen?
+     ".access"              string                   Kernel argument access
+                                                     qualifier. Only present if
+                                                     ".value_kind" is "image" or
+                                                     "pipe". Values
+                                                     are:
+
+                                                     - "read_only"
+                                                     - "write_only"
+                                                     - "read_write"
+
+                                                     .. TODO
+                                                        Does this apply to
+                                                        "global_buffer"?
+     ".actual_access"       string                   The actual memory accesses
+                                                     performed by the kernel on the
+                                                     kernel argument. Only present if
+                                                     ".value_kind" is "global_buffer",
+                                                     "image", or "pipe". This may be
+                                                     more restrictive than indicated
+                                                     by ".access" to reflect what the
+                                                     kernel actual does. If not
+                                                     present then the runtime must
+                                                     assume what is implied by
+                                                     ".access" and ".is_const"      . Values
+                                                     are:
+
+                                                     - "read_only"
+                                                     - "write_only"
+                                                     - "read_write"
+
+     ".is_const"            boolean                  Indicates if the kernel argument
+                                                     is const qualified. Only present
+                                                     if ".value_kind" is
+                                                     "global_buffer".
+
+     ".is_restrict"         boolean                  Indicates if the kernel argument
+                                                     is restrict qualified. Only
+                                                     present if ".value_kind" is
+                                                     "global_buffer".
+
+     ".is_volatile"         boolean                  Indicates if the kernel argument
+                                                     is volatile qualified. Only
+                                                     present if ".value_kind" is
+                                                     "global_buffer".
+
+     ".is_pipe"             boolean                  Indicates if the kernel argument
+                                                     is pipe qualified. Only present
+                                                     if ".value_kind" is "pipe".
+
+                                                     .. TODO
+                                                        Can "global_buffer" be pipe
+                                                        qualified?
+     ====================== ============== ========= ================================
+
+..
+
+Kernel Dispatch
+~~~~~~~~~~~~~~~
+
+The HSA architected queuing language (AQL) defines a user space memory interface
+that can be used to control the dispatch of kernels, in an agent independent
+way. An agent can have zero or more AQL queues created for it using the ROCm
+runtime, in which AQL packets (all of which are 64 bytes) can be placed. See the
+*HSA Platform System Architecture Specification* [HSA]_ for the AQL queue
+mechanics and packet layouts.
+
+The packet processor of a kernel agent is responsible for detecting and
+dispatching HSA kernels from the AQL queues associated with it. For AMD GPUs the
+packet processor is implemented by the hardware command processor (CP),
+asynchronous dispatch controller (ADC) and shader processor input controller
+(SPI).
+
+The ROCm runtime can be used to allocate an AQL queue object. It uses the kernel
+mode driver to initialize and register the AQL queue with CP.
+
+To dispatch a kernel the following actions are performed. This can occur in the
+CPU host program, or from an HSA kernel executing on a GPU.
+
+1. A pointer to an AQL queue for the kernel agent on which the kernel is to be
+   executed is obtained.
+2. A pointer to the kernel descriptor (see
+   :ref:`amdgpu-amdhsa-kernel-descriptor`) of the kernel to execute is
+   obtained. It must be for a kernel that is contained in a code object that that
+   was loaded by the ROCm runtime on the kernel agent with which the AQL queue is
+   associated.
+3. Space is allocated for the kernel arguments using the ROCm runtime allocator
+   for a memory region with the kernarg property for the kernel agent that will
+   execute the kernel. It must be at least 16 byte aligned.
+4. Kernel argument values are assigned to the kernel argument memory
+   allocation. The layout is defined in the *HSA Programmer's Language Reference*
+   [HSA]_. For AMDGPU the kernel execution directly accesses the kernel argument
+   memory in the same way constant memory is accessed. (Note that the HSA
+   specification allows an implementation to copy the kernel argument contents to
+   another location that is accessed by the kernel.)
+5. An AQL kernel dispatch packet is created on the AQL queue. The ROCm runtime
+   api uses 64 bit atomic operations to reserve space in the AQL queue for the
+   packet. The packet must be set up, and the final write must use an atomic
+   store release to set the packet kind to ensure the packet contents are
+   visible to the kernel agent. AQL defines a doorbell signal mechanism to
+   notify the kernel agent that the AQL queue has been updated. These rules, and
+   the layout of the AQL queue and kernel dispatch packet is defined in the *HSA
+   System Architecture Specification* [HSA]_.
+6. A kernel dispatch packet includes information about the actual dispatch,
+   such as grid and work-group size, together with information from the code
+   object about the kernel, such as segment sizes. The ROCm runtime queries on
+   the kernel symbol can be used to obtain the code object values which are
+   recorded in the :ref:`amdgpu-amdhsa-code-object-metadata`.
+7. CP executes micro-code and is responsible for detecting and setting up the
+   GPU to execute the wavefronts of a kernel dispatch.
+8. CP ensures that when the a wavefront starts executing the kernel machine
+   code, the scalar general purpose registers (SGPR) and vector general purpose
+   registers (VGPR) are set up as required by the machine code. The required
+   setup is defined in the :ref:`amdgpu-amdhsa-kernel-descriptor`. The initial
+   register state is defined in
+   :ref:`amdgpu-amdhsa-initial-kernel-execution-state`.
+9. The prolog of the kernel machine code (see
+   :ref:`amdgpu-amdhsa-kernel-prolog`) sets up the machine state as necessary
+   before continuing executing the machine code that corresponds to the kernel.
+10. When the kernel dispatch has completed execution, CP signals the completion
+    signal specified in the kernel dispatch packet if not 0.
+
+.. _amdgpu-amdhsa-memory-spaces:
+
+Memory Spaces
+~~~~~~~~~~~~~
+
+The memory space properties are:
+
+  .. table:: AMDHSA Memory Spaces
+     :name: amdgpu-amdhsa-memory-spaces-table
+
+     ================= =========== ======== ======= ==================
+     Memory Space Name HSA Segment Hardware Address NULL Value
+                       Name        Name     Size
+     ================= =========== ======== ======= ==================
+     Private           private     scratch  32      0x00000000
+     Local             group       LDS      32      0xFFFFFFFF
+     Global            global      global   64      0x0000000000000000
+     Constant          constant    *same as 64      0x0000000000000000
+                                   global*
+     Generic           flat        flat     64      0x0000000000000000
+     Region            N/A         GDS      32      *not implemented
+                                                    for AMDHSA*
+     ================= =========== ======== ======= ==================
+
+The global and constant memory spaces both use global virtual addresses, which
+are the same virtual address space used by the CPU. However, some virtual
+addresses may only be accessible to the CPU, some only accessible by the GPU,
+and some by both.
+
+Using the constant memory space indicates that the data will not change during
+the execution of the kernel. This allows scalar read instructions to be
+used. The vector and scalar L1 caches are invalidated of volatile data before
+each kernel dispatch execution to allow constant memory to change values between
+kernel dispatches.
+
+The local memory space uses the hardware Local Data Store (LDS) which is
+automatically allocated when the hardware creates work-groups of wavefronts, and
+freed when all the wavefronts of a work-group have terminated. The data store
+(DS) instructions can be used to access it.
+
+The private memory space uses the hardware scratch memory support. If the kernel
+uses scratch, then the hardware allocates memory that is accessed using
+wavefront lane dword (4 byte) interleaving. The mapping used from private
+address to physical address is:
+
+  ``wavefront-scratch-base +
+  (private-address * wavefront-size * 4) +
+  (wavefront-lane-id * 4)``
+
+There are different ways that the wavefront scratch base address is determined
+by a wavefront (see :ref:`amdgpu-amdhsa-initial-kernel-execution-state`). This
+memory can be accessed in an interleaved manner using buffer instruction with
+the scratch buffer descriptor and per wavefront scratch offset, by the scratch
+instructions, or by flat instructions. If each lane of a wavefront accesses the
+same private address, the interleaving results in adjacent dwords being accessed
+and hence requires fewer cache lines to be fetched. Multi-dword access is not
+supported except by flat and scratch instructions in GFX9-GFX10.
+
+The generic address space uses the hardware flat address support available in
+GFX7-GFX10. This uses two fixed ranges of virtual addresses (the private and
+local appertures), that are outside the range of addressible global memory, to
+map from a flat address to a private or local address.
+
+FLAT instructions can take a flat address and access global, private (scratch)
+and group (LDS) memory depending in if the address is within one of the
+apperture ranges. Flat access to scratch requires hardware aperture setup and
+setup in the kernel prologue (see :ref:`amdgpu-amdhsa-flat-scratch`). Flat
+access to LDS requires hardware aperture setup and M0 (GFX7-GFX8) register setup
+(see :ref:`amdgpu-amdhsa-m0`).
+
+To convert between a segment address and a flat address the base address of the
+appertures address can be used. For GFX7-GFX8 these are available in the
+:ref:`amdgpu-amdhsa-hsa-aql-queue` the address of which can be obtained with
+Queue Ptr SGPR (see :ref:`amdgpu-amdhsa-initial-kernel-execution-state`). For
+GFX9-GFX10 the appature base addresses are directly available as inline constant
+registers ``SRC_SHARED_BASE/LIMIT`` and ``SRC_PRIVATE_BASE/LIMIT``. In 64 bit
+address mode the apperture sizes are 2^32 bytes and the base is aligned to 2^32
+which makes it easier to convert from flat to segment or segment to flat.
+
+Image and Samplers
+~~~~~~~~~~~~~~~~~~
+
+Image and sample handles created by the ROCm runtime are 64 bit addresses of a
+hardware 32 byte V# and 48 byte S# object respectively. In order to support the
+HSA ``query_sampler`` operations two extra dwords are used to store the HSA BRIG
+enumeration values for the queries that are not trivially deducible from the S#
+representation.
+
+HSA Signals
+~~~~~~~~~~~
+
+HSA signal handles created by the ROCm runtime are 64 bit addresses of a
+structure allocated in memory accessible from both the CPU and GPU. The
+structure is defined by the ROCm runtime and subject to change between releases
+(see [AMD-ROCm-github]_).
+
+.. _amdgpu-amdhsa-hsa-aql-queue:
+
+HSA AQL Queue
+~~~~~~~~~~~~~
+
+The HSA AQL queue structure is defined by the ROCm runtime and subject to change
+between releases (see [AMD-ROCm-github]_). For some processors it contains
+fields needed to implement certain language features such as the flat address
+aperture bases. It also contains fields used by CP such as managing the
+allocation of scratch memory.
+
+.. _amdgpu-amdhsa-kernel-descriptor:
+
+Kernel Descriptor
+~~~~~~~~~~~~~~~~~
+
+A kernel descriptor consists of the information needed by CP to initiate the
+execution of a kernel, including the entry point address of the machine code
+that implements the kernel.
+
+Kernel Descriptor for GFX6-GFX10
+++++++++++++++++++++++++++++++++
+
+CP microcode requires the Kernel descriptor to be allocated on 64 byte
+alignment.
+
+  .. table:: Kernel Descriptor for GFX6-GFX10
+     :name: amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table
+
+     ======= ======= =============================== ============================
+     Bits    Size    Field Name                      Description
+     ======= ======= =============================== ============================
+     31:0    4 bytes GROUP_SEGMENT_FIXED_SIZE        The amount of fixed local
+                                                     address space memory
+                                                     required for a work-group
+                                                     in bytes. This does not
+                                                     include any dynamically
+                                                     allocated local address
+                                                     space memory that may be
+                                                     added when the kernel is
+                                                     dispatched.
+     63:32   4 bytes PRIVATE_SEGMENT_FIXED_SIZE      The amount of fixed
+                                                     private address space
+                                                     memory required for a
+                                                     work-item in bytes. If
+                                                     is_dynamic_callstack is 1
+                                                     then additional space must
+                                                     be added to this value for
+                                                     the call stack.
+     127:64  8 bytes                                 Reserved, must be 0.
+     191:128 8 bytes KERNEL_CODE_ENTRY_BYTE_OFFSET   Byte offset (possibly
+                                                     negative) from base
+                                                     address of kernel
+                                                     descriptor to kernel's
+                                                     entry point instruction
+                                                     which must be 256 byte
+                                                     aligned.
+     351:272 20                                      Reserved, must be 0.
+             bytes
+     383:352 4 bytes COMPUTE_PGM_RSRC3               GFX6-9
+                                                       Reserved, must be 0.
+                                                     GFX10
+                                                       Compute Shader (CS)
+                                                       program settings used by
+                                                       CP to set up
+                                                       ``COMPUTE_PGM_RSRC3``
+                                                       configuration
+                                                       register. See
+                                                       :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx10-table`.
+     415:384 4 bytes COMPUTE_PGM_RSRC1               Compute Shader (CS)
+                                                     program settings used by
+                                                     CP to set up
+                                                     ``COMPUTE_PGM_RSRC1``
+                                                     configuration
+                                                     register. See
+                                                     :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     447:416 4 bytes COMPUTE_PGM_RSRC2               Compute Shader (CS)
+                                                     program settings used by
+                                                     CP to set up
+                                                     ``COMPUTE_PGM_RSRC2``
+                                                     configuration
+                                                     register. See
+                                                     :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     448     1 bit   ENABLE_SGPR_PRIVATE_SEGMENT     Enable the setup of the
+                     _BUFFER                         SGPR user data registers
+                                                     (see
+                                                     :ref:`amdgpu-amdhsa-initial-kernel-execution-state`).
+
+                                                     The total number of SGPR
+                                                     user data registers
+                                                     requested must not exceed
+                                                     16 and match value in
+                                                     ``compute_pgm_rsrc2.user_sgpr.user_sgpr_count``.
+                                                     Any requests beyond 16
+                                                     will be ignored.
+     449     1 bit   ENABLE_SGPR_DISPATCH_PTR        *see above*
+     450     1 bit   ENABLE_SGPR_QUEUE_PTR           *see above*
+     451     1 bit   ENABLE_SGPR_KERNARG_SEGMENT_PTR *see above*
+     452     1 bit   ENABLE_SGPR_DISPATCH_ID         *see above*
+     453     1 bit   ENABLE_SGPR_FLAT_SCRATCH_INIT   *see above*
+     454     1 bit   ENABLE_SGPR_PRIVATE_SEGMENT     *see above*
+                     _SIZE
+     457:455 3 bits                                  Reserved, must be 0.
+     458     1 bit   ENABLE_WAVEFRONT_SIZE32         GFX6-9
+                                                       Reserved, must be 0.
+                                                     GFX10
+                                                       - If 0 execute in
+                                                         wavefront size 64 mode.
+                                                       - If 1 execute in
+                                                         native wavefront size
+                                                         32 mode.
+     463:459 5 bits                                  Reserved, must be 0.
+     511:464 6 bytes                                 Reserved, must be 0.
+     512     **Total size 64 bytes.**
+     ======= ====================================================================
+
+..
+
+  .. table:: compute_pgm_rsrc1 for GFX6-GFX10
+     :name: amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table
+
+     ======= ======= =============================== ===========================================================================
+     Bits    Size    Field Name                      Description
+     ======= ======= =============================== ===========================================================================
+     5:0     6 bits  GRANULATED_WORKITEM_VGPR_COUNT  Number of vector register
+                                                     blocks used by each work-item;
+                                                     granularity is device
+                                                     specific:
+
+                                                     GFX6-GFX9
+                                                       - vgprs_used 0..256
+                                                       - max(0, ceil(vgprs_used / 4) - 1)
+                                                     GFX10 (wavefront size 64)
+                                                       - max_vgpr 1..256
+                                                       - max(0, ceil(vgprs_used / 4) - 1)
+                                                     GFX10 (wavefront size 32)
+                                                       - max_vgpr 1..256
+                                                       - max(0, ceil(vgprs_used / 8) - 1)
+
+                                                     Where vgprs_used is defined
+                                                     as the highest VGPR number
+                                                     explicitly referenced plus
+                                                     one.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.VGPRS``.
+
+                                                     The
+                                                     :ref:`amdgpu-assembler`
+                                                     calculates this
+                                                     automatically for the
+                                                     selected processor from
+                                                     values provided to the
+                                                     `.amdhsa_kernel` directive
+                                                     by the
+                                                     `.amdhsa_next_free_vgpr`
+                                                     nested directive (see
+                                                     :ref:`amdhsa-kernel-directives-table`).
+     9:6     4 bits  GRANULATED_WAVEFRONT_SGPR_COUNT Number of scalar register
+                                                     blocks used by a wavefront;
+                                                     granularity is device
+                                                     specific:
+
+                                                     GFX6-GFX8
+                                                       - sgprs_used 0..112
+                                                       - max(0, ceil(sgprs_used / 8) - 1)
+                                                     GFX9
+                                                       - sgprs_used 0..112
+                                                       - 2 * max(0, ceil(sgprs_used / 16) - 1)
+                                                     GFX10
+                                                       Reserved, must be 0.
+                                                       (128 SGPRs always
+                                                       allocated.)
+
+                                                     Where sgprs_used is
+                                                     defined as the highest
+                                                     SGPR number explicitly
+                                                     referenced plus one, plus
+                                                     a target-specific number
+                                                     of additional special
+                                                     SGPRs for VCC,
+                                                     FLAT_SCRATCH (GFX7+) and
+                                                     XNACK_MASK (GFX8+), and
+                                                     any additional
+                                                     target-specific
+                                                     limitations. It does not
+                                                     include the 16 SGPRs added
+                                                     if a trap handler is
+                                                     enabled.
+
+                                                     The target-specific
+                                                     limitations and special
+                                                     SGPR layout are defined in
+                                                     the hardware
+                                                     documentation, which can
+                                                     be found in the
+                                                     :ref:`amdgpu-processors`
+                                                     table.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.SGPRS``.
+
+                                                     The
+                                                     :ref:`amdgpu-assembler`
+                                                     calculates this
+                                                     automatically for the
+                                                     selected processor from
+                                                     values provided to the
+                                                     `.amdhsa_kernel` directive
+                                                     by the
+                                                     `.amdhsa_next_free_sgpr`
+                                                     and `.amdhsa_reserve_*`
+                                                     nested directives (see
+                                                     :ref:`amdhsa-kernel-directives-table`).
+     11:10   2 bits  PRIORITY                        Must be 0.
+
+                                                     Start executing wavefront
+                                                     at the specified priority.
+
+                                                     CP is responsible for
+                                                     filling in
+                                                     ``COMPUTE_PGM_RSRC1.PRIORITY``.
+     13:12   2 bits  FLOAT_ROUND_MODE_32             Wavefront starts execution
+                                                     with specified rounding
+                                                     mode for single (32
+                                                     bit) floating point
+                                                     precision floating point
+                                                     operations.
+
+                                                     Floating point rounding
+                                                     mode values are defined in
+                                                     :ref:`amdgpu-amdhsa-floating-point-rounding-mode-enumeration-values-table`.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.FLOAT_MODE``.
+     15:14   2 bits  FLOAT_ROUND_MODE_16_64          Wavefront starts execution
+                                                     with specified rounding
+                                                     denorm mode for half/double (16
+                                                     and 64 bit) floating point
+                                                     precision floating point
+                                                     operations.
+
+                                                     Floating point rounding
+                                                     mode values are defined in
+                                                     :ref:`amdgpu-amdhsa-floating-point-rounding-mode-enumeration-values-table`.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.FLOAT_MODE``.
+     17:16   2 bits  FLOAT_DENORM_MODE_32            Wavefront starts execution
+                                                     with specified denorm mode
+                                                     for single (32
+                                                     bit)  floating point
+                                                     precision floating point
+                                                     operations.
+
+                                                     Floating point denorm mode
+                                                     values are defined in
+                                                     :ref:`amdgpu-amdhsa-floating-point-denorm-mode-enumeration-values-table`.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.FLOAT_MODE``.
+     19:18   2 bits  FLOAT_DENORM_MODE_16_64         Wavefront starts execution
+                                                     with specified denorm mode
+                                                     for half/double (16
+                                                     and 64 bit) floating point
+                                                     precision floating point
+                                                     operations.
+
+                                                     Floating point denorm mode
+                                                     values are defined in
+                                                     :ref:`amdgpu-amdhsa-floating-point-denorm-mode-enumeration-values-table`.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.FLOAT_MODE``.
+     20      1 bit   PRIV                            Must be 0.
+
+                                                     Start executing wavefront
+                                                     in privilege trap handler
+                                                     mode.
+
+                                                     CP is responsible for
+                                                     filling in
+                                                     ``COMPUTE_PGM_RSRC1.PRIV``.
+     21      1 bit   ENABLE_DX10_CLAMP               Wavefront starts execution
+                                                     with DX10 clamp mode
+                                                     enabled. Used by the vector
+                                                     ALU to force DX10 style
+                                                     treatment of NaN's (when
+                                                     set, clamp NaN to zero,
+                                                     otherwise pass NaN
+                                                     through).
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.DX10_CLAMP``.
+     22      1 bit   DEBUG_MODE                      Must be 0.
+
+                                                     Start executing wavefront
+                                                     in single step mode.
+
+                                                     CP is responsible for
+                                                     filling in
+                                                     ``COMPUTE_PGM_RSRC1.DEBUG_MODE``.
+     23      1 bit   ENABLE_IEEE_MODE                Wavefront starts execution
+                                                     with IEEE mode
+                                                     enabled. Floating point
+                                                     opcodes that support
+                                                     exception flag gathering
+                                                     will quiet and propagate
+                                                     signaling-NaN inputs per
+                                                     IEEE 754-2008. Min_dx10 and
+                                                     max_dx10 become IEEE
+                                                     754-2008 compliant due to
+                                                     signaling-NaN propagation
+                                                     and quieting.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC1.IEEE_MODE``.
+     24      1 bit   BULKY                           Must be 0.
+
+                                                     Only one work-group allowed
+                                                     to execute on a compute
+                                                     unit.
+
+                                                     CP is responsible for
+                                                     filling in
+                                                     ``COMPUTE_PGM_RSRC1.BULKY``.
+     25      1 bit   CDBG_USER                       Must be 0.
+
+                                                     Flag that can be used to
+                                                     control debugging code.
+
+                                                     CP is responsible for
+                                                     filling in
+                                                     ``COMPUTE_PGM_RSRC1.CDBG_USER``.
+     26      1 bit   FP16_OVFL                       GFX6-GFX8
+                                                       Reserved, must be 0.
+                                                     GFX9-GFX10
+                                                       Wavefront starts execution
+                                                       with specified fp16 overflow
+                                                       mode.
+
+                                                       - If 0, fp16 overflow generates
+                                                         +/-INF values.
+                                                       - If 1, fp16 overflow that is the
+                                                         result of an +/-INF input value
+                                                         or divide by 0 produces a +/-INF,
+                                                         otherwise clamps computed
+                                                         overflow to +/-MAX_FP16 as
+                                                         appropriate.
+
+                                                       Used by CP to set up
+                                                       ``COMPUTE_PGM_RSRC1.FP16_OVFL``.
+     28:27   2 bits                                  Reserved, must be 0.
+     29      1 bit    WGP_MODE                       GFX6-GFX9
+                                                       Reserved, must be 0.
+                                                     GFX10
+                                                       - If 0 execute work-groups in
+                                                         CU wavefront execution mode.
+                                                       - If 1 execute work-groups on
+                                                         in WGP wavefront execution mode.
+
+                                                       See :ref:`amdgpu-amdhsa-memory-model`.
+
+                                                       Used by CP to set up
+                                                       ``COMPUTE_PGM_RSRC1.WGP_MODE``.
+     30      1 bit    MEM_ORDERED                    GFX6-9
+                                                       Reserved, must be 0.
+                                                     GFX10
+                                                       Controls the behavior of the
+                                                       waitcnt's vmcnt and vscnt
+                                                       counters.
+
+                                                       - If 0 vmcnt reports completion
+                                                         of load and atomic with return
+                                                         out of order with sample
+                                                         instructions, and the vscnt
+                                                         reports the completion of
+                                                         store and atomic without
+                                                         return in order.
+                                                       - If 1 vmcnt reports completion
+                                                         of load, atomic with return
+                                                         and sample instructions in
+                                                         order, and the vscnt reports
+                                                         the completion of store and
+                                                         atomic without return in order.
+
+                                                       Used by CP to set up
+                                                       ``COMPUTE_PGM_RSRC1.MEM_ORDERED``.
+     31      1 bit    FWD_PROGRESS                   GFX6-9
+                                                       Reserved, must be 0.
+                                                     GFX10
+                                                       - If 0 execute SIMD wavefronts
+                                                         using oldest first policy.
+                                                       - If 1 execute SIMD wavefronts to
+                                                         ensure wavefronts will make some
+                                                         forward progress.
+
+                                                       Used by CP to set up
+                                                       ``COMPUTE_PGM_RSRC1.FWD_PROGRESS``.
+     32      **Total size 4 bytes**
+     ======= ===================================================================================================================
+
+..
+
+  .. table:: compute_pgm_rsrc2 for GFX6-GFX10
+     :name: amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table
+
+     ======= ======= =============================== ===========================================================================
+     Bits    Size    Field Name                      Description
+     ======= ======= =============================== ===========================================================================
+     0       1 bit   ENABLE_SGPR_PRIVATE_SEGMENT     Enable the setup of the
+                     _WAVEFRONT_OFFSET               SGPR wavefront scratch offset
+                                                     system register (see
+                                                     :ref:`amdgpu-amdhsa-initial-kernel-execution-state`).
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.SCRATCH_EN``.
+     5:1     5 bits  USER_SGPR_COUNT                 The total number of SGPR
+                                                     user data registers
+                                                     requested. This number must
+                                                     match the number of user
+                                                     data registers enabled.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.USER_SGPR``.
+     6       1 bit   ENABLE_TRAP_HANDLER             Must be 0.
+
+                                                     This bit represents
+                                                     ``COMPUTE_PGM_RSRC2.TRAP_PRESENT``,
+                                                     which is set by the CP if
+                                                     the runtime has installed a
+                                                     trap handler.
+     7       1 bit   ENABLE_SGPR_WORKGROUP_ID_X      Enable the setup of the
+                                                     system SGPR register for
+                                                     the work-group id in the X
+                                                     dimension (see
+                                                     :ref:`amdgpu-amdhsa-initial-kernel-execution-state`).
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.TGID_X_EN``.
+     8       1 bit   ENABLE_SGPR_WORKGROUP_ID_Y      Enable the setup of the
+                                                     system SGPR register for
+                                                     the work-group id in the Y
+                                                     dimension (see
+                                                     :ref:`amdgpu-amdhsa-initial-kernel-execution-state`).
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.TGID_Y_EN``.
+     9       1 bit   ENABLE_SGPR_WORKGROUP_ID_Z      Enable the setup of the
+                                                     system SGPR register for
+                                                     the work-group id in the Z
+                                                     dimension (see
+                                                     :ref:`amdgpu-amdhsa-initial-kernel-execution-state`).
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.TGID_Z_EN``.
+     10      1 bit   ENABLE_SGPR_WORKGROUP_INFO      Enable the setup of the
+                                                     system SGPR register for
+                                                     work-group information (see
+                                                     :ref:`amdgpu-amdhsa-initial-kernel-execution-state`).
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.TGID_SIZE_EN``.
+     12:11   2 bits  ENABLE_VGPR_WORKITEM_ID         Enable the setup of the
+                                                     VGPR system registers used
+                                                     for the work-item ID.
+                                                     :ref:`amdgpu-amdhsa-system-vgpr-work-item-id-enumeration-values-table`
+                                                     defines the values.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.TIDIG_CMP_CNT``.
+     13      1 bit   ENABLE_EXCEPTION_ADDRESS_WATCH  Must be 0.
+
+                                                     Wavefront starts execution
+                                                     with address watch
+                                                     exceptions enabled which
+                                                     are generated when L1 has
+                                                     witnessed a thread access
+                                                     an *address of
+                                                     interest*.
+
+                                                     CP is responsible for
+                                                     filling in the address
+                                                     watch bit in
+                                                     ``COMPUTE_PGM_RSRC2.EXCP_EN_MSB``
+                                                     according to what the
+                                                     runtime requests.
+     14      1 bit   ENABLE_EXCEPTION_MEMORY         Must be 0.
+
+                                                     Wavefront starts execution
+                                                     with memory violation
+                                                     exceptions exceptions
+                                                     enabled which are generated
+                                                     when a memory violation has
+                                                     occurred for this wavefront from
+                                                     L1 or LDS
+                                                     (write-to-read-only-memory,
+                                                     mis-aligned atomic, LDS
+                                                     address out of range,
+                                                     illegal address, etc.).
+
+                                                     CP sets the memory
+                                                     violation bit in
+                                                     ``COMPUTE_PGM_RSRC2.EXCP_EN_MSB``
+                                                     according to what the
+                                                     runtime requests.
+     23:15   9 bits  GRANULATED_LDS_SIZE             Must be 0.
+
+                                                     CP uses the rounded value
+                                                     from the dispatch packet,
+                                                     not this value, as the
+                                                     dispatch may contain
+                                                     dynamically allocated group
+                                                     segment memory. CP writes
+                                                     directly to
+                                                     ``COMPUTE_PGM_RSRC2.LDS_SIZE``.
+
+                                                     Amount of group segment
+                                                     (LDS) to allocate for each
+                                                     work-group. Granularity is
+                                                     device specific:
+
+                                                     GFX6:
+                                                       roundup(lds-size / (64 * 4))
+                                                     GFX7-GFX10:
+                                                       roundup(lds-size / (128 * 4))
+
+     24      1 bit   ENABLE_EXCEPTION_IEEE_754_FP    Wavefront starts execution
+                     _INVALID_OPERATION              with specified exceptions
+                                                     enabled.
+
+                                                     Used by CP to set up
+                                                     ``COMPUTE_PGM_RSRC2.EXCP_EN``
+                                                     (set from bits 0..6).
+
+                                                     IEEE 754 FP Invalid
+                                                     Operation
+     25      1 bit   ENABLE_EXCEPTION_FP_DENORMAL    FP Denormal one or more
+                     _SOURCE                         input operands is a
+                                                     denormal number
+     26      1 bit   ENABLE_EXCEPTION_IEEE_754_FP    IEEE 754 FP Division by
+                     _DIVISION_BY_ZERO               Zero
+     27      1 bit   ENABLE_EXCEPTION_IEEE_754_FP    IEEE 754 FP FP Overflow
+                     _OVERFLOW
+     28      1 bit   ENABLE_EXCEPTION_IEEE_754_FP    IEEE 754 FP Underflow
+                     _UNDERFLOW
+     29      1 bit   ENABLE_EXCEPTION_IEEE_754_FP    IEEE 754 FP Inexact
+                     _INEXACT
+     30      1 bit   ENABLE_EXCEPTION_INT_DIVIDE_BY  Integer Division by Zero
+                     _ZERO                           (rcp_iflag_f32 instruction
+                                                     only)
+     31      1 bit                                   Reserved, must be 0.
+     32      **Total size 4 bytes.**
+     ======= ===================================================================================================================
+
+..
+
+  .. table:: compute_pgm_rsrc3 for GFX10
+     :name: amdgpu-amdhsa-compute_pgm_rsrc3-gfx10-table
+
+     ======= ======= =============================== ===========================================================================
+     Bits    Size    Field Name                      Description
+     ======= ======= =============================== ===========================================================================
+     3:0     4 bits  SHARED_VGPR_COUNT               Number of shared VGPRs for wavefront size 64. Granularity 8. Value 0-120.
+                                                     compute_pgm_rsrc1.vgprs + shared_vgpr_cnt cannot exceed 64.
+     31:4    28                                      Reserved, must be 0.
+             bits
+     32      **Total size 4 bytes.**
+     ======= ===================================================================================================================
+
+..
+
+  .. table:: Floating Point Rounding Mode Enumeration Values
+     :name: amdgpu-amdhsa-floating-point-rounding-mode-enumeration-values-table
+
+     ====================================== ===== ==============================
+     Enumeration Name                       Value Description
+     ====================================== ===== ==============================
+     FLOAT_ROUND_MODE_NEAR_EVEN             0     Round Ties To Even
+     FLOAT_ROUND_MODE_PLUS_INFINITY         1     Round Toward +infinity
+     FLOAT_ROUND_MODE_MINUS_INFINITY        2     Round Toward -infinity
+     FLOAT_ROUND_MODE_ZERO                  3     Round Toward 0
+     ====================================== ===== ==============================
+
+..
+
+  .. table:: Floating Point Denorm Mode Enumeration Values
+     :name: amdgpu-amdhsa-floating-point-denorm-mode-enumeration-values-table
+
+     ====================================== ===== ==============================
+     Enumeration Name                       Value Description
+     ====================================== ===== ==============================
+     FLOAT_DENORM_MODE_FLUSH_SRC_DST        0     Flush Source and Destination
+                                                  Denorms
+     FLOAT_DENORM_MODE_FLUSH_DST            1     Flush Output Denorms
+     FLOAT_DENORM_MODE_FLUSH_SRC            2     Flush Source Denorms
+     FLOAT_DENORM_MODE_FLUSH_NONE           3     No Flush
+     ====================================== ===== ==============================
+
+..
+
+  .. table:: System VGPR Work-Item ID Enumeration Values
+     :name: amdgpu-amdhsa-system-vgpr-work-item-id-enumeration-values-table
+
+     ======================================== ===== ============================
+     Enumeration Name                         Value Description
+     ======================================== ===== ============================
+     SYSTEM_VGPR_WORKITEM_ID_X                0     Set work-item X dimension
+                                                    ID.
+     SYSTEM_VGPR_WORKITEM_ID_X_Y              1     Set work-item X and Y
+                                                    dimensions ID.
+     SYSTEM_VGPR_WORKITEM_ID_X_Y_Z            2     Set work-item X, Y and Z
+                                                    dimensions ID.
+     SYSTEM_VGPR_WORKITEM_ID_UNDEFINED        3     Undefined.
+     ======================================== ===== ============================
+
+.. _amdgpu-amdhsa-initial-kernel-execution-state:
+
+Initial Kernel Execution State
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This section defines the register state that will be set up by the packet
+processor prior to the start of execution of every wavefront. This is limited by
+the constraints of the hardware controllers of CP/ADC/SPI.
+
+The order of the SGPR registers is defined, but the compiler can specify which
+ones are actually setup in the kernel descriptor using the ``enable_sgpr_*`` bit
+fields (see :ref:`amdgpu-amdhsa-kernel-descriptor`). The register numbers used
+for enabled registers are dense starting at SGPR0: the first enabled register is
+SGPR0, the next enabled register is SGPR1 etc.; disabled registers do not have
+an SGPR number.
+
+The initial SGPRs comprise up to 16 User SRGPs that are set by CP and apply to
+all wavefronts of the grid. It is possible to specify more than 16 User SGPRs using
+the ``enable_sgpr_*`` bit fields, in which case only the first 16 are actually
+initialized. These are then immediately followed by the System SGPRs that are
+set up by ADC/SPI and can have different values for each wavefront of the grid
+dispatch.
+
+SGPR register initial state is defined in
+:ref:`amdgpu-amdhsa-sgpr-register-set-up-order-table`.
+
+  .. table:: SGPR Register Set Up Order
+     :name: amdgpu-amdhsa-sgpr-register-set-up-order-table
+
+     ========== ========================== ====== ==============================
+     SGPR Order Name                       Number Description
+                (kernel descriptor enable  of
+                field)                     SGPRs
+     ========== ========================== ====== ==============================
+     First      Private Segment Buffer     4      V# that can be used, together
+                (enable_sgpr_private              with Scratch Wavefront Offset
+                _segment_buffer)                  as an offset, to access the
+                                                  private memory space using a
+                                                  segment address.
+
+                                                  CP uses the value provided by
+                                                  the runtime.
+     then       Dispatch Ptr               2      64 bit address of AQL dispatch
+                (enable_sgpr_dispatch_ptr)        packet for kernel dispatch
+                                                  actually executing.
+     then       Queue Ptr                  2      64 bit address of amd_queue_t
+                (enable_sgpr_queue_ptr)           object for AQL queue on which
+                                                  the dispatch packet was
+                                                  queued.
+     then       Kernarg Segment Ptr        2      64 bit address of Kernarg
+                (enable_sgpr_kernarg              segment. This is directly
+                _segment_ptr)                     copied from the
+                                                  kernarg_address in the kernel
+                                                  dispatch packet.
+
+                                                  Having CP load it once avoids
+                                                  loading it at the beginning of
+                                                  every wavefront.
+     then       Dispatch Id                2      64 bit Dispatch ID of the
+                (enable_sgpr_dispatch_id)         dispatch packet being
+                                                  executed.
+     then       Flat Scratch Init          2      This is 2 SGPRs:
+                (enable_sgpr_flat_scratch
+                _init)                            GFX6
+                                                    Not supported.
+                                                  GFX7-GFX8
+                                                    The first SGPR is a 32 bit
+                                                    byte offset from
+                                                    ``SH_HIDDEN_PRIVATE_BASE_VIMID``
+                                                    to per SPI base of memory
+                                                    for scratch for the queue
+                                                    executing the kernel
+                                                    dispatch. CP obtains this
+                                                    from the runtime. (The
+                                                    Scratch Segment Buffer base
+                                                    address is
+                                                    ``SH_HIDDEN_PRIVATE_BASE_VIMID``
+                                                    plus this offset.) The value
+                                                    of Scratch Wavefront Offset must
+                                                    be added to this offset by
+                                                    the kernel machine code,
+                                                    right shifted by 8, and
+                                                    moved to the FLAT_SCRATCH_HI
+                                                    SGPR register.
+                                                    FLAT_SCRATCH_HI corresponds
+                                                    to SGPRn-4 on GFX7, and
+                                                    SGPRn-6 on GFX8 (where SGPRn
+                                                    is the highest numbered SGPR
+                                                    allocated to the wavefront).
+                                                    FLAT_SCRATCH_HI is
+                                                    multiplied by 256 (as it is
+                                                    in units of 256 bytes) and
+                                                    added to
+                                                    ``SH_HIDDEN_PRIVATE_BASE_VIMID``
+                                                    to calculate the per wavefront
+                                                    FLAT SCRATCH BASE in flat
+                                                    memory instructions that
+                                                    access the scratch
+                                                    apperture.
+
+                                                    The second SGPR is 32 bit
+                                                    byte size of a single
+                                                    work-item's scratch memory
+                                                    usage. CP obtains this from
+                                                    the runtime, and it is
+                                                    always a multiple of DWORD.
+                                                    CP checks that the value in
+                                                    the kernel dispatch packet
+                                                    Private Segment Byte Size is
+                                                    not larger, and requests the
+                                                    runtime to increase the
+                                                    queue's scratch size if
+                                                    necessary. The kernel code
+                                                    must move it to
+                                                    FLAT_SCRATCH_LO which is
+                                                    SGPRn-3 on GFX7 and SGPRn-5
+                                                    on GFX8. FLAT_SCRATCH_LO is
+                                                    used as the FLAT SCRATCH
+                                                    SIZE in flat memory
+                                                    instructions. Having CP load
+                                                    it once avoids loading it at
+                                                    the beginning of every
+                                                    wavefront.
+                                                  GFX9-GFX10
+                                                    This is the
+                                                    64 bit base address of the
+                                                    per SPI scratch backing
+                                                    memory managed by SPI for
+                                                    the queue executing the
+                                                    kernel dispatch. CP obtains
+                                                    this from the runtime (and
+                                                    divides it if there are
+                                                    multiple Shader Arrays each
+                                                    with its own SPI). The value
+                                                    of Scratch Wavefront Offset must
+                                                    be added by the kernel
+                                                    machine code and the result
+                                                    moved to the FLAT_SCRATCH
+                                                    SGPR which is SGPRn-6 and
+                                                    SGPRn-5. It is used as the
+                                                    FLAT SCRATCH BASE in flat
+                                                    memory instructions.
+     then       Private Segment Size       1      The 32 bit byte size of a
+                                                  (enable_sgpr_private single
+                                                  work-item's
+                                                  scratch_segment_size) memory
+                                                  allocation. This is the
+                                                  value from the kernel
+                                                  dispatch packet Private
+                                                  Segment Byte Size rounded up
+                                                  by CP to a multiple of
+                                                  DWORD.
+
+                                                  Having CP load it once avoids
+                                                  loading it at the beginning of
+                                                  every wavefront.
+
+                                                  This is not used for
+                                                  GFX7-GFX8 since it is the same
+                                                  value as the second SGPR of
+                                                  Flat Scratch Init. However, it
+                                                  may be needed for GFX9-GFX10 which
+                                                  changes the meaning of the
+                                                  Flat Scratch Init value.
+     then       Grid Work-Group Count X    1      32 bit count of the number of
+                (enable_sgpr_grid                 work-groups in the X dimension
+                _workgroup_count_X)               for the grid being
+                                                  executed. Computed from the
+                                                  fields in the kernel dispatch
+                                                  packet as ((grid_size.x +
+                                                  workgroup_size.x - 1) /
+                                                  workgroup_size.x).
+     then       Grid Work-Group Count Y    1      32 bit count of the number of
+                (enable_sgpr_grid                 work-groups in the Y dimension
+                _workgroup_count_Y &&             for the grid being
+                less than 16 previous             executed. Computed from the
+                SGPRs)                            fields in the kernel dispatch
+                                                  packet as ((grid_size.y +
+                                                  workgroup_size.y - 1) /
+                                                  workgroupSize.y).
+
+                                                  Only initialized if <16
+                                                  previous SGPRs initialized.
+     then       Grid Work-Group Count Z    1      32 bit count of the number of
+                (enable_sgpr_grid                 work-groups in the Z dimension
+                _workgroup_count_Z &&             for the grid being
+                less than 16 previous             executed. Computed from the
+                SGPRs)                            fields in the kernel dispatch
+                                                  packet as ((grid_size.z +
+                                                  workgroup_size.z - 1) /
+                                                  workgroupSize.z).
+
+                                                  Only initialized if <16
+                                                  previous SGPRs initialized.
+     then       Work-Group Id X            1      32 bit work-group id in X
+                (enable_sgpr_workgroup_id         dimension of grid for
+                _X)                               wavefront.
+     then       Work-Group Id Y            1      32 bit work-group id in Y
+                (enable_sgpr_workgroup_id         dimension of grid for
+                _Y)                               wavefront.
+     then       Work-Group Id Z            1      32 bit work-group id in Z
+                (enable_sgpr_workgroup_id         dimension of grid for
+                _Z)                               wavefront.
+     then       Work-Group Info            1      {first_wavefront, 14'b0000,
+                (enable_sgpr_workgroup            ordered_append_term[10:0],
+                _info)                            threadgroup_size_in_wavefronts[5:0]}
+     then       Scratch Wavefront Offset   1      32 bit byte offset from base
+                (enable_sgpr_private              of scratch base of queue
+                _segment_wavefront_offset)        executing the kernel
+                                                  dispatch. Must be used as an
+                                                  offset with Private
+                                                  segment address when using
+                                                  Scratch Segment Buffer. It
+                                                  must be used to set up FLAT
+                                                  SCRATCH for flat addressing
+                                                  (see
+                                                  :ref:`amdgpu-amdhsa-flat-scratch`).
+     ========== ========================== ====== ==============================
+
+The order of the VGPR registers is defined, but the compiler can specify which
+ones are actually setup in the kernel descriptor using the ``enable_vgpr*`` bit
+fields (see :ref:`amdgpu-amdhsa-kernel-descriptor`). The register numbers used
+for enabled registers are dense starting at VGPR0: the first enabled register is
+VGPR0, the next enabled register is VGPR1 etc.; disabled registers do not have a
+VGPR number.
+
+VGPR register initial state is defined in
+:ref:`amdgpu-amdhsa-vgpr-register-set-up-order-table`.
+
+  .. table:: VGPR Register Set Up Order
+     :name: amdgpu-amdhsa-vgpr-register-set-up-order-table
+
+     ========== ========================== ====== ==============================
+     VGPR Order Name                       Number Description
+                (kernel descriptor enable  of
+                field)                     VGPRs
+     ========== ========================== ====== ==============================
+     First      Work-Item Id X             1      32 bit work item id in X
+                (Always initialized)              dimension of work-group for
+                                                  wavefront lane.
+     then       Work-Item Id Y             1      32 bit work item id in Y
+                (enable_vgpr_workitem_id          dimension of work-group for
+                > 0)                              wavefront lane.
+     then       Work-Item Id Z             1      32 bit work item id in Z
+                (enable_vgpr_workitem_id          dimension of work-group for
+                > 1)                              wavefront lane.
+     ========== ========================== ====== ==============================
+
+The setting of registers is done by GPU CP/ADC/SPI hardware as follows:
+
+1. SGPRs before the Work-Group Ids are set by CP using the 16 User Data
+   registers.
+2. Work-group Id registers X, Y, Z are set by ADC which supports any
+   combination including none.
+3. Scratch Wavefront Offset is set by SPI in a per wavefront basis which is why
+   its value cannot included with the flat scratch init value which is per queue.
+4. The VGPRs are set by SPI which only supports specifying either (X), (X, Y)
+   or (X, Y, Z).
+
+Flat Scratch register pair are adjacent SGRRs so they can be moved as a 64 bit
+value to the hardware required SGPRn-3 and SGPRn-4 respectively.
+
+The global segment can be accessed either using buffer instructions (GFX6 which
+has V# 64 bit address support), flat instructions (GFX7-GFX10), or global
+instructions (GFX9-GFX10).
+
+If buffer operations are used then the compiler can generate a V# with the
+following properties:
+
+* base address of 0
+* no swizzle
+* ATC: 1 if IOMMU present (such as APU)
+* ptr64: 1
+* MTYPE set to support memory coherence that matches the runtime (such as CC for
+  APU and NC for dGPU).
+
+.. _amdgpu-amdhsa-kernel-prolog:
+
+Kernel Prolog
+~~~~~~~~~~~~~
+
+.. _amdgpu-amdhsa-m0:
+
+M0
+++
+
+GFX6-GFX8
+  The M0 register must be initialized with a value at least the total LDS size
+  if the kernel may access LDS via DS or flat operations. Total LDS size is
+  available in dispatch packet. For M0, it is also possible to use maximum
+  possible value of LDS for given target (0x7FFF for GFX6 and 0xFFFF for
+  GFX7-GFX8).
+GFX9-GFX10
+  The M0 register is not used for range checking LDS accesses and so does not
+  need to be initialized in the prolog.
+
+.. _amdgpu-amdhsa-flat-scratch:
+
+Flat Scratch
+++++++++++++
+
+If the kernel may use flat operations to access scratch memory, the prolog code
+must set up FLAT_SCRATCH register pair (FLAT_SCRATCH_LO/FLAT_SCRATCH_HI which
+are in SGPRn-4/SGPRn-3). Initialization uses Flat Scratch Init and Scratch Wavefront
+Offset SGPR registers (see :ref:`amdgpu-amdhsa-initial-kernel-execution-state`):
+
+GFX6
+  Flat scratch is not supported.
+
+GFX7-GFX8
+  1. The low word of Flat Scratch Init is 32 bit byte offset from
+     ``SH_HIDDEN_PRIVATE_BASE_VIMID`` to the base of scratch backing memory
+     being managed by SPI for the queue executing the kernel dispatch. This is
+     the same value used in the Scratch Segment Buffer V# base address. The
+     prolog must add the value of Scratch Wavefront Offset to get the wavefront's byte
+     scratch backing memory offset from ``SH_HIDDEN_PRIVATE_BASE_VIMID``. Since
+     FLAT_SCRATCH_LO is in units of 256 bytes, the offset must be right shifted
+     by 8 before moving into FLAT_SCRATCH_LO.
+  2. The second word of Flat Scratch Init is 32 bit byte size of a single
+     work-items scratch memory usage. This is directly loaded from the kernel
+     dispatch packet Private Segment Byte Size and rounded up to a multiple of
+     DWORD. Having CP load it once avoids loading it at the beginning of every
+     wavefront. The prolog must move it to FLAT_SCRATCH_LO for use as FLAT SCRATCH
+     SIZE.
+
+GFX9-GFX10
+  The Flat Scratch Init is the 64 bit address of the base of scratch backing
+  memory being managed by SPI for the queue executing the kernel dispatch. The
+  prolog must add the value of Scratch Wavefront Offset and moved to the FLAT_SCRATCH
+  pair for use as the flat scratch base in flat memory instructions.
+
+.. _amdgpu-amdhsa-memory-model:
+
+Memory Model
+~~~~~~~~~~~~
+
+This section describes the mapping of LLVM memory model onto AMDGPU machine code
+(see :ref:`memmodel`). *The implementation is WIP.*
+
+.. TODO
+   Update when implementation complete.
+
+The AMDGPU backend supports the memory synchronization scopes specified in
+:ref:`amdgpu-memory-scopes`.
+
+The code sequences used to implement the memory model are defined in table
+:ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx6-gfx10-table`.
+
+The sequences specify the order of instructions that a single thread must
+execute. The ``s_waitcnt`` and ``buffer_wbinvl1_vol`` are defined with respect
+to other memory instructions executed by the same thread. This allows them to be
+moved earlier or later which can allow them to be combined with other instances
+of the same instruction, or hoisted/sunk out of loops to improve
+performance. Only the instructions related to the memory model are given;
+additional ``s_waitcnt`` instructions are required to ensure registers are
+defined before being used. These may be able to be combined with the memory
+model ``s_waitcnt`` instructions as described above.
+
+The AMDGPU backend supports the following memory models:
+
+  HSA Memory Model [HSA]_
+    The HSA memory model uses a single happens-before relation for all address
+    spaces (see :ref:`amdgpu-address-spaces`).
+  OpenCL Memory Model [OpenCL]_
+    The OpenCL memory model which has separate happens-before relations for the
+    global and local address spaces. Only a fence specifying both global and
+    local address space, and seq_cst instructions join the relationships. Since
+    the LLVM ``memfence`` instruction does not allow an address space to be
+    specified the OpenCL fence has to convervatively assume both local and
+    global address space was specified. However, optimizations can often be
+    done to eliminate the additional ``s_waitcnt`` instructions when there are
+    no intervening memory instructions which access the corresponding address
+    space. The code sequences in the table indicate what can be omitted for the
+    OpenCL memory. The target triple environment is used to determine if the
+    source language is OpenCL (see :ref:`amdgpu-opencl`).
+
+``ds/flat_load/store/atomic`` instructions to local memory are termed LDS
+operations.
+
+``buffer/global/flat_load/store/atomic`` instructions to global memory are
+termed vector memory operations.
+
+For GFX6-GFX9:
+
+* Each agent has multiple shader arrays (SA).
+* Each SA has multiple compute units (CU).
+* Each CU has multiple SIMDs that execute wavefronts.
+* The wavefronts for a single work-group are executed in the same CU but may be
+  executed by different SIMDs.
+* Each CU has a single LDS memory shared by the wavefronts of the work-groups
+  executing on it.
+* All LDS operations of a CU are performed as wavefront wide operations in a
+  global order and involve no caching. Completion is reported to a wavefront in
+  execution order.
+* The LDS memory has multiple request queues shared by the SIMDs of a
+  CU. Therefore, the LDS operations performed by different wavefronts of a work-group
+  can be reordered relative to each other, which can result in reordering the
+  visibility of vector memory operations with respect to LDS operations of other
+  wavefronts in the same work-group. A ``s_waitcnt lgkmcnt(0)`` is required to
+  ensure synchronization between LDS operations and vector memory operations
+  between wavefronts of a work-group, but not between operations performed by the
+  same wavefront.
+* The vector memory operations are performed as wavefront wide operations and
+  completion is reported to a wavefront in execution order. The exception is
+  that for GFX7-GFX9 ``flat_load/store/atomic`` instructions can report out of
+  vector memory order if they access LDS memory, and out of LDS operation order
+  if they access global memory.
+* The vector memory operations access a single vector L1 cache shared by all
+  SIMDs a CU. Therefore, no special action is required for coherence between the
+  lanes of a single wavefront, or for coherence between wavefronts in the same
+  work-group. A ``buffer_wbinvl1_vol`` is required for coherence between wavefronts
+  executing in different work-groups as they may be executing on different CUs.
+* The scalar memory operations access a scalar L1 cache shared by all wavefronts
+  on a group of CUs. The scalar and vector L1 caches are not coherent. However,
+  scalar operations are used in a restricted way so do not impact the memory
+  model. See :ref:`amdgpu-amdhsa-memory-spaces`.
+* The vector and scalar memory operations use an L2 cache shared by all CUs on
+  the same agent.
+* The L2 cache has independent channels to service disjoint ranges of virtual
+  addresses.
+* Each CU has a separate request queue per channel. Therefore, the vector and
+  scalar memory operations performed by wavefronts executing in different work-groups
+  (which may be executing on different CUs) of an agent can be reordered
+  relative to each other. A ``s_waitcnt vmcnt(0)`` is required to ensure
+  synchronization between vector memory operations of different CUs. It ensures a
+  previous vector memory operation has completed before executing a subsequent
+  vector memory or LDS operation and so can be used to meet the requirements of
+  acquire and release.
+* The L2 cache can be kept coherent with other agents on some targets, or ranges
+  of virtual addresses can be set up to bypass it to ensure system coherence.
+
+For GFX10:
+
+* Each agent has multiple shader arrays (SA).
+* Each SA has multiple work-group processors (WGP).
+* Each WGP has multiple compute units (CU).
+* Each CU has multiple SIMDs that execute wavefronts.
+* The wavefronts for a single work-group are executed in the same
+  WGP. In CU wavefront execution mode the wavefronts may be executed by
+  different SIMDs in the same CU. In WGP wavefront execution mode the
+  wavefronts may be executed by different SIMDs in different CUs in the same
+  WGP.
+* Each WGP has a single LDS memory shared by the wavefronts of the work-groups
+  executing on it.
+* All LDS operations of a WGP are performed as wavefront wide operations in a
+  global order and involve no caching. Completion is reported to a wavefront in
+  execution order.
+* The LDS memory has multiple request queues shared by the SIMDs of a
+  WGP. Therefore, the LDS operations performed by different wavefronts of a work-group
+  can be reordered relative to each other, which can result in reordering the
+  visibility of vector memory operations with respect to LDS operations of other
+  wavefronts in the same work-group. A ``s_waitcnt lgkmcnt(0)`` is required to
+  ensure synchronization between LDS operations and vector memory operations
+  between wavefronts of a work-group, but not between operations performed by the
+  same wavefront.
+* The vector memory operations are performed as wavefront wide operations.
+  Completion of load/store/sample operations are reported to a wavefront in
+  execution order of other load/store/sample operations performed by that
+  wavefront.
+* The vector memory operations access a vector L0 cache. There is a single L0
+  cache per CU. Each SIMD of a CU accesses the same L0 cache.
+  Therefore, no special action is required for coherence between the lanes of a
+  single wavefront. However, a ``BUFFER_GL0_INV`` is required for coherence
+  between wavefronts executing in the same work-group as they may be executing on
+  SIMDs of different CUs that access different L0s. A ``BUFFER_GL0_INV`` is also
+  required for coherence between wavefronts executing in different work-groups as
+  they may be executing on different WGPs.
+* The scalar memory operations access a scalar L0 cache shared by all wavefronts
+  on a WGP. The scalar and vector L0 caches are not coherent. However, scalar
+  operations are used in a restricted way so do not impact the memory model. See
+  :ref:`amdgpu-amdhsa-memory-spaces`.
+* The vector and scalar memory L0 caches use an L1 cache shared by all WGPs on
+  the same SA. Therefore, no special action is required for coherence between
+  the wavefronts of a single work-group. However, a ``BUFFER_GL1_INV`` is
+  required for coherence between wavefronts executing in different work-groups as
+  they may be executing on different SAs that access different L1s.
+* The L1 caches have independent quadrants to service disjoint ranges of virtual
+  addresses.
+* Each L0 cache has a separate request queue per L1 quadrant. Therefore, the
+  vector and scalar memory operations performed by different wavefronts, whether
+  executing in the same or different work-groups (which may be executing on
+  different CUs accessing different L0s), can be reordered relative to each
+  other. A ``s_waitcnt vmcnt(0) & vscnt(0)`` is required to ensure synchronization
+  between vector memory operations of different wavefronts. It ensures a previous
+  vector memory operation has completed before executing a subsequent vector
+  memory or LDS operation and so can be used to meet the requirements of acquire,
+  release and sequential consistency.
+* The L1 caches use an L2 cache shared by all SAs on the same agent.
+* The L2 cache has independent channels to service disjoint ranges of virtual
+  addresses.
+* Each L1 quadrant of a single SA accesses a different L2 channel. Each L1
+  quadrant has a separate request queue per L2 channel. Therefore, the vector
+  and scalar memory operations performed by wavefronts executing in different
+  work-groups (which may be executing on different SAs) of an agent can be
+  reordered relative to each other. A ``s_waitcnt vmcnt(0) & vscnt(0)`` is
+  required to ensure synchronization between vector memory operations of
+  different SAs. It ensures a previous vector memory operation has completed
+  before executing a subsequent vector memory and so can be used to meet the
+  requirements of acquire, release and sequential consistency.
+* The L2 cache can be kept coherent with other agents on some targets, or ranges
+  of virtual addresses can be set up to bypass it to ensure system coherence.
+
+Private address space uses ``buffer_load/store`` using the scratch V# (GFX6-GFX8),
+or ``scratch_load/store`` (GFX9-GFX10). Since only a single thread is accessing the
+memory, atomic memory orderings are not meaningful and all accesses are treated
+as non-atomic.
+
+Constant address space uses ``buffer/global_load`` instructions (or equivalent
+scalar memory instructions). Since the constant address space contents do not
+change during the execution of a kernel dispatch it is not legal to perform
+stores, and atomic memory orderings are not meaningful and all access are
+treated as non-atomic.
+
+A memory synchronization scope wider than work-group is not meaningful for the
+group (LDS) address space and is treated as work-group.
+
+The memory model does not support the region address space which is treated as
+non-atomic.
+
+Acquire memory ordering is not meaningful on store atomic instructions and is
+treated as non-atomic.
+
+Release memory ordering is not meaningful on load atomic instructions and is
+treated a non-atomic.
+
+Acquire-release memory ordering is not meaningful on load or store atomic
+instructions and is treated as acquire and release respectively.
+
+AMDGPU backend only uses scalar memory operations to access memory that is
+proven to not change during the execution of the kernel dispatch. This includes
+constant address space and global address space for program scope const
+variables. Therefore the kernel machine code does not have to maintain the
+scalar L1 cache to ensure it is coherent with the vector L1 cache. The scalar
+and vector L1 caches are invalidated between kernel dispatches by CP since
+constant address space data may change between kernel dispatch executions. See
+:ref:`amdgpu-amdhsa-memory-spaces`.
+
+The one execption is if scalar writes are used to spill SGPR registers. In this
+case the AMDGPU backend ensures the memory location used to spill is never
+accessed by vector memory operations at the same time. If scalar writes are used
+then a ``s_dcache_wb`` is inserted before the ``s_endpgm`` and before a function
+return since the locations may be used for vector memory instructions by a
+future wavefront that uses the same scratch area, or a function call that creates a
+frame at the same address, respectively. There is no need for a ``s_dcache_inv``
+as all scalar writes are write-before-read in the same thread.
+
+For GFX6-GFX9, scratch backing memory (which is used for the private address space)
+is accessed with MTYPE NC_NV (non-coherenent non-volatile). Since the private
+address space is only accessed by a single thread, and is always
+write-before-read, there is never a need to invalidate these entries from the L1
+cache. Hence all cache invalidates are done as ``*_vol`` to only invalidate the
+volatile cache lines.
+
+For GFX10, scratch backing memory (which is used for the private address space)
+is accessed with MTYPE NC (non-coherenent). Since the private address space is
+only accessed by a single thread, and is always write-before-read, there is
+never a need to invalidate these entries from the L0 or L1 caches.
+
+For GFX10, wavefronts are executed in native mode with in-order reporting of loads
+and sample instructions. In this mode vmcnt reports completion of load, atomic
+with return and sample instructions in order, and the vscnt reports the
+completion of store and atomic without return in order. See ``MEM_ORDERED`` field
+in :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+
+In GFX10, wavefronts can be executed in WGP or CU wavefront execution mode:
+
+* In WGP wavefront execution mode the wavefronts of a work-group are executed
+  on the SIMDs of both CUs of the WGP. Therefore, explicit management of the per
+  CU L0 caches is required for work-group synchronization. Also accesses to L1 at
+  work-group scope need to be expicitly ordered as the accesses from different
+  CUs are not ordered.
+* In CU wavefront execution mode the wavefronts of a work-group are executed on
+  the SIMDs of a single CU of the WGP. Therefore, all global memory access by
+  the work-group access the same L0 which in turn ensures L1 accesses are
+  ordered and so do not require explicit management of the caches for
+  work-group synchronization.
+
+See ``WGP_MODE`` field in :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`
+and :ref:`amdgpu-target-features`.
+
+On dGPU the kernarg backing memory is accessed as UC (uncached) to avoid needing
+to invalidate the L2 cache. For GFX6-GFX9, this also causes it to be treated as
+non-volatile and so is not invalidated by ``*_vol``. On APU it is accessed as CC
+(cache coherent) and so the L2 cache will be coherent with the CPU and other
+agents.
+
+  .. table:: AMDHSA Memory Model Code Sequences GFX6-GFX10
+     :name: amdgpu-amdhsa-memory-model-code-sequences-gfx6-gfx10-table
+
+     ============ ============ ============== ========== =============================== ==================================
+     LLVM Instr   LLVM Memory  LLVM Memory    AMDGPU     AMDGPU Machine Code             AMDGPU Machine Code
+                  Ordering     Sync Scope     Address    GFX6-9                          GFX10
+                                              Space
+     ============ ============ ============== ========== =============================== ==================================
+     **Non-Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     load         *none*       *none*         - global   - !volatile & !nontemporal      - !volatile & !nontemporal
+                                              - generic
+                                              - private    1. buffer/global/flat_load      1. buffer/global/flat_load
+                                              - constant
+                                                         - volatile & !nontemporal       - volatile & !nontemporal
+
+                                                           1. buffer/global/flat_load      1. buffer/global/flat_load
+                                                              glc=1                           glc=1 dlc=1
+
+                                                         - nontemporal                   - nontemporal
+
+                                                           1. buffer/global/flat_load      1. buffer/global/flat_load
+                                                              glc=1 slc=1                     slc=1
+
+     load         *none*       *none*         - local    1. ds_load                      1. ds_load
+     store        *none*       *none*         - global   - !nontemporal                  - !nontemporal
+                                              - generic
+                                              - private    1. buffer/global/flat_store     1. buffer/global/flat_store
+                                              - constant
+                                                         - nontemporal                   - nontemporal
+
+                                                           1. buffer/global/flat_stote      1. buffer/global/flat_store
+                                                              glc=1 slc=1                      slc=1
+
+     store        *none*       *none*         - local    1. ds_store                     1. ds_store
+     **Unordered Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     load atomic  unordered    *any*          *any*      *Same as non-atomic*.           *Same as non-atomic*.
+     store atomic unordered    *any*          *any*      *Same as non-atomic*.           *Same as non-atomic*.
+     atomicrmw    unordered    *any*          *any*      *Same as monotonic              *Same as monotonic
+                                                         atomic*.                        atomic*.
+     **Monotonic Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     load atomic  monotonic    - singlethread - global   1. buffer/global/flat_load      1. buffer/global/flat_load
+                               - wavefront    - generic
+     load atomic  monotonic    - workgroup    - global   1. buffer/global/flat_load      1. buffer/global/flat_load
+                                              - generic                                     glc=1
+
+                                                                                           - If CU wavefront execution mode, omit glc=1.
+
+     load atomic  monotonic    - singlethread - local    1. ds_load                      1. ds_load
+                               - wavefront
+                               - workgroup
+     load atomic  monotonic    - agent        - global   1. buffer/global/flat_load      1. buffer/global/flat_load
+                               - system       - generic     glc=1                           glc=1 dlc=1
+     store atomic monotonic    - singlethread - global   1. buffer/global/flat_store     1. buffer/global/flat_store
+                               - wavefront    - generic
+                               - workgroup
+                               - agent
+                               - system
+     store atomic monotonic    - singlethread - local    1. ds_store                     1. ds_store
+                               - wavefront
+                               - workgroup
+     atomicrmw    monotonic    - singlethread - global   1. buffer/global/flat_atomic    1. buffer/global/flat_atomic
+                               - wavefront    - generic
+                               - workgroup
+                               - agent
+                               - system
+     atomicrmw    monotonic    - singlethread - local    1. ds_atomic                    1. ds_atomic
+                               - wavefront
+                               - workgroup
+     **Acquire Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     load atomic  acquire      - singlethread - global   1. buffer/global/ds/flat_load   1. buffer/global/ds/flat_load
+                               - wavefront    - local
+                                              - generic
+     load atomic  acquire      - workgroup    - global   1. buffer/global/flat_load      1. buffer/global_load glc=1
+
+                                                                                           - If CU wavefront execution mode, omit glc=1.
+
+                                                                                         2. s_waitcnt vmcnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Must happen before
+                                                                                             the following buffer_gl0_inv
+                                                                                             and before any following
+                                                                                             global/generic
+                                                                                             load/load
+                                                                                             atomic/stote/store
+                                                                                             atomic/atomicrmw.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     load atomic  acquire      - workgroup    - local    1. ds_load                      1. ds_load
+                                                         2. s_waitcnt lgkmcnt(0)         2. s_waitcnt lgkmcnt(0)
+
+                                                           - If OpenCL, omit.              - If OpenCL, omit.
+                                                           - Must happen before            - Must happen before
+                                                             any following                   the following buffer_gl0_inv
+                                                             global/generic                  and before any following
+                                                             load/load                       global/generic load/load
+                                                             atomic/store/store              atomic/store/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures any                   - Ensures any
+                                                             following global                following global
+                                                             data read is no                 data read is no
+                                                             older than the load             older than the load
+                                                             atomic value being              atomic value being
+                                                             acquired.                       acquired.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - If OpenCL, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     load atomic  acquire      - workgroup    - generic  1. flat_load                    1. flat_load glc=1
+
+                                                                                           - If CU wavefront execution mode, omit glc=1.
+
+                                                         2. s_waitcnt lgkmcnt(0)         2. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             lgkmcnt(0).
+                                                           - Must happen before            - Must happen before
+                                                             any following                   the following
+                                                             global/generic                  buffer_gl0_inv and any
+                                                             load/load                       following global/generic
+                                                             atomic/store/store              load/load
+                                                             atomic/atomicrmw.               atomic/store/store
+                                                                                             atomic/atomicrmw.
+                                                           - Ensures any                   - Ensures any
+                                                             following global                following global
+                                                             data read is no                 data read is no
+                                                             older than the load             older than the load
+                                                             atomic value being              atomic value being
+                                                             acquired.                       acquired.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     load atomic  acquire      - agent        - global   1. buffer/global/flat_load      1. buffer/global_load
+                               - system                     glc=1                           glc=1 dlc=1
+                                                         2. s_waitcnt vmcnt(0)           2. s_waitcnt vmcnt(0)
+
+                                                           - Must happen before            - Must happen before
+                                                             following                       following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_inv.
+                                                           - Ensures the load              - Ensures the load
+                                                             has completed                   has completed
+                                                             before invalidating             before invalidating
+                                                             the cache.                      the caches.
+
+                                                         3. buffer_wbinvl1_vol           3. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following                       following
+                                                             loads will not see              loads will not see
+                                                             stale global data.              stale global data.
+
+     load atomic  acquire      - agent        - generic  1. flat_load glc=1              1. flat_load glc=1 dlc=1
+                               - system                  2. s_waitcnt vmcnt(0) &         2. s_waitcnt vmcnt(0) &
+                                                            lgkmcnt(0)                      lgkmcnt(0)
+
+                                                           - If OpenCL omit                - If OpenCL omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                           - Must happen before            - Must happen before
+                                                             following                       following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_invl.
+                                                           - Ensures the flat_load         - Ensures the flat_load
+                                                             has completed                   has completed
+                                                             before invalidating             before invalidating
+                                                             the cache.                      the caches.
+
+                                                         3. buffer_wbinvl1_vol           3. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data.                    global data.
+
+     atomicrmw    acquire      - singlethread - global   1. buffer/global/ds/flat_atomic 1. buffer/global/ds/flat_atomic
+                               - wavefront    - local
+                                              - generic
+     atomicrmw    acquire      - workgroup    - global   1. buffer/global/flat_atomic    1. buffer/global_atomic
+                                                                                         2. s_waitcnt vm/vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                                                           - Must happen before
+                                                                                             the following buffer_gl0_inv
+                                                                                             and before any following
+                                                                                             global/generic
+                                                                                             load/load
+                                                                                             atomic/stote/store
+                                                                                             atomic/atomicrmw.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     atomicrmw    acquire      - workgroup    - local    1. ds_atomic                    1. ds_atomic
+                                                         2. waitcnt lgkmcnt(0)           2. waitcnt lgkmcnt(0)
+
+                                                           - If OpenCL, omit.              - If OpenCL, omit.
+                                                           - Must happen before            - Must happen before
+                                                             any following                   the following
+                                                             global/generic                  buffer_gl0_inv.
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                           - Ensures any                   - Ensures any
+                                                             following global                following global
+                                                             data read is no                 data read is no
+                                                             older than the                  older than the
+                                                             atomicrmw value                 atomicrmw value
+                                                             being acquired.                 being acquired.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If OpenCL omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     atomicrmw    acquire      - workgroup    - generic  1. flat_atomic                  1. flat_atomic
+                                                         2. waitcnt lgkmcnt(0)           2. waitcnt lgkmcnt(0) &
+                                                                                            vm/vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vm/vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             waitcnt lgkmcnt(0)..
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                                                             waitcnt lgkmcnt(0).
+                                                           - Must happen before            - Must happen before
+                                                             any following                   the following
+                                                             global/generic                  buffer_gl0_inv.
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                           - Ensures any                   - Ensures any
+                                                             following global                following global
+                                                             data read is no                 data read is no
+                                                             older than the                  older than the
+                                                             atomicrmw value                 atomicrmw value
+                                                             being acquired.                 being acquired.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     atomicrmw    acquire      - agent        - global   1. buffer/global/flat_atomic    1. buffer/global_atomic
+                               - system                  2. s_waitcnt vmcnt(0)           2. s_waitcnt vm/vscnt(0)
+
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                                                             waitcnt lgkmcnt(0).
+                                                           - Must happen before            - Must happen before
+                                                             following                       following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_inv.
+                                                           - Ensures the                   - Ensures the
+                                                             atomicrmw has                   atomicrmw has
+                                                             completed before                completed before
+                                                             invalidating the                invalidating the
+                                                             cache.                          caches.
+
+                                                         3. buffer_wbinvl1_vol           3. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data.                    global data.
+
+     atomicrmw    acquire      - agent        - generic  1. flat_atomic                  1. flat_atomic
+                               - system                  2. s_waitcnt vmcnt(0) &         2. s_waitcnt vm/vscnt(0) &
+                                                            lgkmcnt(0)                      lgkmcnt(0)
+
+                                                           - If OpenCL, omit               - If OpenCL, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                           - Must happen before            - Must happen before
+                                                             following                       following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_inv.
+                                                           - Ensures the                   - Ensures the
+                                                             atomicrmw has                   atomicrmw has
+                                                             completed before                completed before
+                                                             invalidating the                invalidating the
+                                                             cache.                          caches.
+
+                                                         3. buffer_wbinvl1_vol           3. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data.                    global data.
+
+     fence        acquire      - singlethread *none*     *none*                          *none*
+                               - wavefront
+     fence        acquire      - workgroup    *none*     1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             not generic, omit.              not generic, omit
+                                                                                             lgkmcnt(0).
+                                                                                           - If OpenCL and
+                                                                                             address space is
+                                                                                             local, omit
+                                                                                             vmcnt(0) and vscnt(0).
+                                                           - However, since LLVM           - However, since LLVM
+                                                             currently has no                currently has no
+                                                             address space on                address space on
+                                                             the fence need to               the fence need to
+                                                             conservatively                  conservatively
+                                                             always generate. If             always generate. If
+                                                             fence had an                    fence had an
+                                                             address space then              address space then
+                                                             set to address                  set to address
+                                                             space of OpenCL                 space of OpenCL
+                                                             fence flag, or to               fence flag, or to
+                                                             generic if both                 generic if both
+                                                             local and global                local and global
+                                                             flags are                       flags are
+                                                             specified.                      specified.
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic load
+                                                             atomic/atomicrmw
+                                                             with an equal or
+                                                             wider sync scope
+                                                             and memory ordering
+                                                             stronger than
+                                                             unordered (this is
+                                                             termed the
+                                                             fence-paired-atomic).
+                                                           - Must happen before
+                                                             any following
+                                                             global/generic
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                           - Ensures any
+                                                             following global
+                                                             data read is no
+                                                             older than the
+                                                             value read by the
+                                                             fence-paired-atomic.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value
+                                                                                             with an equal or
+                                                                                             wider sync scope
+                                                                                             and memory ordering
+                                                                                             stronger than
+                                                                                             unordered (this is
+                                                                                             termed the
+                                                                                             fence-paired-atomic).
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             atomicrmw-no-return-value
+                                                                                             with an equal or
+                                                                                             wider sync scope
+                                                                                             and memory ordering
+                                                                                             stronger than
+                                                                                             unordered (this is
+                                                                                             termed the
+                                                                                             fence-paired-atomic).
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic load
+                                                                                             atomic/atomicrmw
+                                                                                             with an equal or
+                                                                                             wider sync scope
+                                                                                             and memory ordering
+                                                                                             stronger than
+                                                                                             unordered (this is
+                                                                                             termed the
+                                                                                             fence-paired-atomic).
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             buffer_gl0_inv.
+                                                                                           - Ensures that the
+                                                                                             fence-paired atomic
+                                                                                             has completed
+                                                                                             before invalidating
+                                                                                             the
+                                                                                             cache. Therefore
+                                                                                             any following
+                                                                                             locations read must
+                                                                                             be no older than
+                                                                                             the value read by
+                                                                                             the
+                                                                                             fence-paired-atomic.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     fence        acquire      - agent        *none*     1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lgkmcnt(0) &
+                               - system                     vmcnt(0)                        vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             not generic, omit               not generic, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                                                           - If OpenCL and
+                                                                                             address space is
+                                                                                             local, omit
+                                                                                             vmcnt(0) and vscnt(0).
+                                                           - However, since LLVM           - However, since LLVM
+                                                             currently has no                currently has no
+                                                             address space on                address space on
+                                                             the fence need to               the fence need to
+                                                             conservatively                  conservatively
+                                                             always generate                 always generate
+                                                             (see comment for                (see comment for
+                                                             previous fence).                previous fence).
+                                                           - Could be split into
+                                                             separate s_waitcnt
+                                                             vmcnt(0) and
+                                                             s_waitcnt
+                                                             lgkmcnt(0) to allow
+                                                             them to be
+                                                             independently moved
+                                                             according to the
+                                                             following rules.
+                                                           - s_waitcnt vmcnt(0)
+                                                             must happen after
+                                                             any preceding
+                                                             global/generic load
+                                                             atomic/atomicrmw
+                                                             with an equal or
+                                                             wider sync scope
+                                                             and memory ordering
+                                                             stronger than
+                                                             unordered (this is
+                                                             termed the
+                                                             fence-paired-atomic).
+                                                           - s_waitcnt lgkmcnt(0)
+                                                             must happen after
+                                                             any preceding
+                                                             local/generic load
+                                                             atomic/atomicrmw
+                                                             with an equal or
+                                                             wider sync scope
+                                                             and memory ordering
+                                                             stronger than
+                                                             unordered (this is
+                                                             termed the
+                                                             fence-paired-atomic).
+                                                           - Must happen before
+                                                             the following
+                                                             buffer_wbinvl1_vol.
+                                                           - Ensures that the
+                                                             fence-paired atomic
+                                                             has completed
+                                                             before invalidating
+                                                             the
+                                                             cache. Therefore
+                                                             any following
+                                                             locations read must
+                                                             be no older than
+                                                             the value read by
+                                                             the
+                                                             fence-paired-atomic.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value
+                                                                                             with an equal or
+                                                                                             wider sync scope
+                                                                                             and memory ordering
+                                                                                             stronger than
+                                                                                             unordered (this is
+                                                                                             termed the
+                                                                                             fence-paired-atomic).
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             atomicrmw-no-return-value
+                                                                                             with an equal or
+                                                                                             wider sync scope
+                                                                                             and memory ordering
+                                                                                             stronger than
+                                                                                             unordered (this is
+                                                                                             termed the
+                                                                                             fence-paired-atomic).
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic load
+                                                                                             atomic/atomicrmw
+                                                                                             with an equal or
+                                                                                             wider sync scope
+                                                                                             and memory ordering
+                                                                                             stronger than
+                                                                                             unordered (this is
+                                                                                             termed the
+                                                                                             fence-paired-atomic).
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             buffer_gl*_inv.
+                                                                                           - Ensures that the
+                                                                                             fence-paired atomic
+                                                                                             has completed
+                                                                                             before invalidating
+                                                                                             the
+                                                                                             caches. Therefore
+                                                                                             any following
+                                                                                             locations read must
+                                                                                             be no older than
+                                                                                             the value read by
+                                                                                             the
+                                                                                             fence-paired-atomic.
+
+                                                         2. buffer_wbinvl1_vol           2. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before any        - Must happen before any
+                                                             following global/generic        following global/generic
+                                                             load/load                       load/load
+                                                             atomic/store/store              atomic/store/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data.                    global data.
+
+     **Release Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     store atomic release      - singlethread - global   1. buffer/global/ds/flat_store  1. buffer/global/ds/flat_store
+                               - wavefront    - local
+                                              - generic
+     store atomic release      - workgroup    - global   1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             lgkmcnt(0).
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/store/load
+                                                             atomic/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic
+                                                                                             load/store/load
+                                                                                             atomic/store
+                                                                                             atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             store.                          store.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             store that is being             store that is being
+                                                             released.                       released.
+
+                                                         2. buffer/global/flat_store     2. buffer/global_store
+     store atomic release      - workgroup    - local                                    1. waitcnt vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - If OpenCL, omit.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0) and s_waitcnt
+                                                                                             vscnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             store.
+                                                                                           - Ensures that all
+                                                                                             global memory
+                                                                                             operations have
+                                                                                             completed before
+                                                                                             performing the
+                                                                                             store that is being
+                                                                                             released.
+
+                                                         1. ds_store                     2. ds_store
+     store atomic release      - workgroup    - generic  1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             lgkmcnt(0).
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/store/load
+                                                             atomic/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic load/store/load
+                                                                                             atomic/store atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             store.                          store.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             store that is being             store that is being
+                                                             released.                       released.
+
+                                                         2. flat_store                   2. flat_store
+     store atomic release      - agent        - global   1. s_waitcnt lgkmcnt(0) &         1. s_waitcnt lgkmcnt(0) &
+                               - system       - generic     vmcnt(0)                          vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL, omit               - If OpenCL, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0) and                    vmcnt(0), s_waitcnt vscnt(0)
+                                                             s_waitcnt                       and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - s_waitcnt vmcnt(0)            - s_waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             global/generic                  global/generic
+                                                             load/store/load                 load/load
+                                                             atomic/store                    atomic/
+                                                             atomic/atomicrmw.               atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                           - s_waitcnt lgkmcnt(0)          - s_waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             store.                          store.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to memory have                  to memory have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             store that is being             store that is being
+                                                             released.                       released.
+
+                                                         2. buffer/global/ds/flat_store  2. buffer/global/ds/flat_store
+     atomicrmw    release      - singlethread - global   1. buffer/global/ds/flat_atomic 1. buffer/global/ds/flat_atomic
+                               - wavefront    - local
+                                              - generic
+     atomicrmw    release      - workgroup    - global   1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL, omit.
+
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/store/load
+                                                             atomic/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic
+                                                                                             load/store/load
+                                                                                             atomic/store
+                                                                                             atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             atomicrmw that is               atomicrmw that is
+                                                             being released.                 being released.
+
+                                                         2. buffer/global/flat_atomic    2. buffer/global_atomic
+     atomicrmw    release      - workgroup    - local                                    1. waitcnt vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - If OpenCL, omit.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0) and s_waitcnt
+                                                                                             vscnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             store.
+                                                                                           - Ensures that all
+                                                                                             global memory
+                                                                                             operations have
+                                                                                             completed before
+                                                                                             performing the
+                                                                                             store that is being
+                                                                                             released.
+
+                                                         1. ds_atomic                    2. ds_atomic
+     atomicrmw    release      - workgroup    - generic  1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             waitcnt lgkmcnt(0).
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/store/load
+                                                             atomic/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic load/store/load
+                                                                                             atomic/store atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             atomicrmw that is               atomicrmw that is
+                                                             being released.                 being released.
+
+                                                         2. flat_atomic                  2. flat_atomic
+     atomicrmw    release      - agent        - global   1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lkkmcnt(0) &
+                               - system       - generic     vmcnt(0)                         vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL, omit               - If OpenCL, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0) and                    vmcnt(0), s_waitcnt
+                                                             s_waitcnt                       vscnt(0) and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - s_waitcnt vmcnt(0)            - s_waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             global/generic                  global/generic
+                                                             load/store/load                 load/load atomic/
+                                                             atomic/store                    atomicrmw-with-return-value.
+                                                             atomic/atomicrmw.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                           - s_waitcnt lgkmcnt(0)          - s_waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to global and local             to global and local
+                                                             have completed                  have completed
+                                                             before performing               before performing
+                                                             the atomicrmw that              the atomicrmw that
+                                                             is being released.              is being released.
+
+                                                         2. buffer/global/ds/flat_atomic 2. buffer/global/ds/flat_atomic
+     fence        release      - singlethread *none*     *none*                          *none*
+                               - wavefront
+     fence        release      - workgroup    *none*     1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             not generic, omit.              not generic, omit
+                                                                                             lgkmcnt(0).
+                                                                                           - If OpenCL and
+                                                                                             address space is
+                                                                                             local, omit
+                                                                                             vmcnt(0) and vscnt(0).
+                                                           - However, since LLVM           - However, since LLVM
+                                                             currently has no                currently has no
+                                                             address space on                address space on
+                                                             the fence need to               the fence need to
+                                                             conservatively                  conservatively
+                                                             always generate. If             always generate. If
+                                                             fence had an                    fence had an
+                                                             address space then              address space then
+                                                             set to address                  set to address
+                                                             space of OpenCL                 space of OpenCL
+                                                             fence flag, or to               fence flag, or to
+                                                             generic if both                 generic if both
+                                                             local and global                local and global
+                                                             flags are                       flags are
+                                                             specified.                      specified.
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic
+                                                                                             load/store/load
+                                                                                             atomic/store atomic/
+                                                                                             atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             any following store             any following store
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with an equal or                with an equal or
+                                                             wider sync scope                wider sync scope
+                                                             and memory ordering             and memory ordering
+                                                             stronger than                   stronger than
+                                                             unordered (this is              unordered (this is
+                                                             termed the                      termed the
+                                                             fence-paired-atomic).           fence-paired-atomic).
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             following                       following
+                                                             fence-paired-atomic.            fence-paired-atomic.
+
+     fence        release      - agent        *none*     1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lgkmcnt(0) &
+                               - system                     vmcnt(0)                        vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             not generic, omit               not generic, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             local, omit                     local, omit
+                                                             vmcnt(0).                       vmcnt(0) and vscnt(0).
+                                                           - However, since LLVM           - However, since LLVM
+                                                             currently has no                currently has no
+                                                             address space on                address space on
+                                                             the fence need to               the fence need to
+                                                             conservatively                  conservatively
+                                                             always generate. If             always generate. If
+                                                             fence had an                    fence had an
+                                                             address space then              address space then
+                                                             set to address                  set to address
+                                                             space of OpenCL                 space of OpenCL
+                                                             fence flag, or to               fence flag, or to
+                                                             generic if both                 generic if both
+                                                             local and global                local and global
+                                                             flags are                       flags are
+                                                             specified.                      specified.
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0) and                    vmcnt(0), s_waitcnt
+                                                             s_waitcnt                       vscnt(0) and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - s_waitcnt vmcnt(0)            - s_waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             global/generic                  global/generic
+                                                             load/store/load                 load/load atomic/
+                                                             atomic/store                    atomicrmw-with-return-value.
+                                                             atomic/atomicrmw.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                           - s_waitcnt lgkmcnt(0)          - s_waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             any following store             any following store
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with an equal or                with an equal or
+                                                             wider sync scope                wider sync scope
+                                                             and memory ordering             and memory ordering
+                                                             stronger than                   stronger than
+                                                             unordered (this is              unordered (this is
+                                                             termed the                      termed the
+                                                             fence-paired-atomic).           fence-paired-atomic).
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             have                            have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             following                       following
+                                                             fence-paired-atomic.            fence-paired-atomic.
+
+     **Acquire-Release Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     atomicrmw    acq_rel      - singlethread - global   1. buffer/global/ds/flat_atomic 1. buffer/global/ds/flat_atomic
+                               - wavefront    - local
+                                              - generic
+     atomicrmw    acq_rel      - workgroup    - global   1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             s_waitcnt lgkmcnt(0).
+                                                           - Must happen after             - Must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic load/store/load
+                                                                                             atomic/store atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             atomicrmw that is               atomicrmw that is
+                                                             being released.                 being released.
+
+                                                         2. buffer/global/flat_atomic    2. buffer/global_atomic
+                                                                                         3. s_waitcnt vm/vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vm/vscnt.
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                                                             waitcnt lgkmcnt(0).
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             buffer_gl0_inv.
+                                                                                           - Ensures any
+                                                                                             following global
+                                                                                             data read is no
+                                                                                             older than the
+                                                                                             atomicrmw value
+                                                                                             being acquired.
+
+                                                                                         4. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     atomicrmw    acq_rel      - workgroup    - local                                    1. waitcnt vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - If OpenCL, omit.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0) and s_waitcnt
+                                                                                             vscnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             store.
+                                                                                           - Ensures that all
+                                                                                             global memory
+                                                                                             operations have
+                                                                                             completed before
+                                                                                             performing the
+                                                                                             store that is being
+                                                                                             released.
+
+                                                         1. ds_atomic                    2. ds_atomic
+                                                         2. s_waitcnt lgkmcnt(0)         3. s_waitcnt lgkmcnt(0)
+
+                                                           - If OpenCL, omit.              - If OpenCL, omit.
+                                                           - Must happen before            - Must happen before
+                                                             any following                   the following
+                                                             global/generic                  buffer_gl0_inv.
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                           - Ensures any                   - Ensures any
+                                                             following global                following global
+                                                             data read is no                 data read is no
+                                                             older than the load             older than the load
+                                                             atomic value being              atomic value being
+                                                             acquired.                       acquired.
+
+                                                                                         4. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - If OpenCL omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     atomicrmw    acq_rel      - workgroup    - generic  1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             waitcnt lgkmcnt(0).
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/store/load
+                                                             atomic/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic load/store/load
+                                                                                             atomic/store atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             atomicrmw that is               atomicrmw that is
+                                                             being released.                 being released.
+
+                                                         2. flat_atomic                  2. flat_atomic
+                                                         3. s_waitcnt lgkmcnt(0)         3. s_waitcnt lgkmcnt(0) &
+                                                                                            vm/vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vm/vscnt.
+                                                           - If OpenCL, omit.              - If OpenCL, omit
+                                                                                             waitcnt lgkmcnt(0).
+                                                           - Must happen before            - Must happen before
+                                                             any following                   the following
+                                                             global/generic                  buffer_gl0_inv.
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                           - Ensures any                   - Ensures any
+                                                             following global                following global
+                                                             data read is no                 data read is no
+                                                             older than the load             older than the load
+                                                             atomic value being              atomic value being
+                                                             acquired.                       acquired.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     atomicrmw    acq_rel      - agent        - global   1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lgkmcnt(0) &
+                               - system                     vmcnt(0)                        vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL, omit               - If OpenCL, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0) and                    vmcnt(0), s_waitcnt
+                                                             s_waitcnt                       vscnt(0) and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - s_waitcnt vmcnt(0)            - s_waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             global/generic                  global/generic
+                                                             load/store/load                 load/load atomic/
+                                                             atomic/store                    atomicrmw-with-return-value.
+                                                             atomic/atomicrmw.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                           - s_waitcnt lgkmcnt(0)          - s_waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to global have                  to global have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             atomicrmw that is               atomicrmw that is
+                                                             being released.                 being released.
+
+                                                         2. buffer/global/flat_atomic    2. buffer/global_atomic
+                                                         3. s_waitcnt vmcnt(0)           3. s_waitcnt vm/vscnt(0)
+
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                                                             waitcnt lgkmcnt(0).
+                                                           - Must happen before            - Must happen before
+                                                             following                       following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_inv.
+                                                           - Ensures the                   - Ensures the
+                                                             atomicrmw has                   atomicrmw has
+                                                             completed before                completed before
+                                                             invalidating the                invalidating the
+                                                             cache.                          caches.
+
+                                                         4. buffer_wbinvl1_vol           4. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data.                    global data.
+
+     atomicrmw    acq_rel      - agent        - generic  1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lgkmcnt(0) &
+                               - system                     vmcnt(0)                        vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL, omit               - If OpenCL, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0) and                    vmcnt(0), s_waitcnt
+                                                             s_waitcnt                       vscnt(0) and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - s_waitcnt vmcnt(0)            - s_waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             global/generic                  global/generic
+                                                             load/store/load                 load/load atomic
+                                                             atomic/store                    atomicrmw-with-return-value.
+                                                             atomic/atomicrmw.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                           - s_waitcnt lgkmcnt(0)          - s_waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             atomicrmw.                      atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to global have                  have
+                                                             completed before                completed before
+                                                             performing the                  performing the
+                                                             atomicrmw that is               atomicrmw that is
+                                                             being released.                 being released.
+
+                                                         2. flat_atomic                  2. flat_atomic
+                                                         3. s_waitcnt vmcnt(0) &         3. s_waitcnt vm/vscnt(0) &
+                                                            lgkmcnt(0)                      lgkmcnt(0)
+
+                                                           - If OpenCL, omit               - If OpenCL, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                                                           - Use vmcnt if atomic with
+                                                                                             return and vscnt if atomic
+                                                                                             with no-return.
+                                                           - Must happen before            - Must happen before
+                                                             following                       following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_inv.
+                                                           - Ensures the                   - Ensures the
+                                                             atomicrmw has                   atomicrmw has
+                                                             completed before                completed before
+                                                             invalidating the                invalidating the
+                                                             cache.                          caches.
+
+                                                         4. buffer_wbinvl1_vol           4. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data.                    global data.
+
+     fence        acq_rel      - singlethread *none*     *none*                          *none*
+                               - wavefront
+     fence        acq_rel      - workgroup    *none*     1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                                                                            vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             not generic, omit.              not generic, omit
+                                                                                             lgkmcnt(0).
+                                                                                           - If OpenCL and
+                                                                                             address space is
+                                                                                             local, omit
+                                                                                             vmcnt(0) and vscnt(0).
+                                                           - However,                      - However,
+                                                             since LLVM                      since LLVM
+                                                             currently has no                currently has no
+                                                             address space on                address space on
+                                                             the fence need to               the fence need to
+                                                             conservatively                  conservatively
+                                                             always generate                 always generate
+                                                             (see comment for                (see comment for
+                                                             previous fence).                previous fence).
+                                                           - Must happen after
+                                                             any preceding
+                                                             local/generic
+                                                             load/load
+                                                             atomic/store/store
+                                                             atomic/atomicrmw.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - s_waitcnt vmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             load/load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                                                           - s_waitcnt lgkmcnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             local/generic
+                                                                                             load/store/load
+                                                                                             atomic/store atomic/
+                                                                                             atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/store/store              atomic/store/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that all              - Ensures that all
+                                                             memory operations               memory operations
+                                                             to local have                   have
+                                                             completed before                completed before
+                                                             performing any                  performing any
+                                                             following global                following global
+                                                             memory operations.              memory operations.
+                                                           - Ensures that the              - Ensures that the
+                                                             preceding                       preceding
+                                                             local/generic load              local/generic load
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with an equal or                with an equal or
+                                                             wider sync scope                wider sync scope
+                                                             and memory ordering             and memory ordering
+                                                             stronger than                   stronger than
+                                                             unordered (this is              unordered (this is
+                                                             termed the                      termed the
+                                                             acquire-fence-paired-atomic     acquire-fence-paired-atomic
+                                                             ) has completed                 ) has completed
+                                                             before following                before following
+                                                             global memory                   global memory
+                                                             operations. This                operations. This
+                                                             satisfies the                   satisfies the
+                                                             requirements of                 requirements of
+                                                             acquire.                        acquire.
+                                                           - Ensures that all              - Ensures that all
+                                                             previous memory                 previous memory
+                                                             operations have                 operations have
+                                                             completed before a              completed before a
+                                                             following                       following
+                                                             local/generic store             local/generic store
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with an equal or                with an equal or
+                                                             wider sync scope                wider sync scope
+                                                             and memory ordering             and memory ordering
+                                                             stronger than                   stronger than
+                                                             unordered (this is              unordered (this is
+                                                             termed the                      termed the
+                                                             release-fence-paired-atomic     release-fence-paired-atomic
+                                                             ). This satisfies the           ). This satisfies the
+                                                             requirements of                 requirements of
+                                                             release.                        release.
+                                                                                           - Must happen before
+                                                                                             the following
+                                                                                             buffer_gl0_inv.
+                                                                                           - Ensures that the
+                                                                                             acquire-fence-paired
+                                                                                             atomic has completed
+                                                                                             before invalidating
+                                                                                             the
+                                                                                             cache. Therefore
+                                                                                             any following
+                                                                                             locations read must
+                                                                                             be no older than
+                                                                                             the value read by
+                                                                                             the
+                                                                                             acquire-fence-paired-atomic.
+
+                                                                                         3. buffer_gl0_inv
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Ensures that
+                                                                                             following
+                                                                                             loads will not see
+                                                                                             stale data.
+
+     fence        acq_rel      - agent        *none*     1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lgkmcnt(0) &
+                               - system                     vmcnt(0)                        vmcnt(0) & vscnt(0)
+
+                                                           - If OpenCL and                 - If OpenCL and
+                                                             address space is                address space is
+                                                             not generic, omit               not generic, omit
+                                                             lgkmcnt(0).                     lgkmcnt(0).
+                                                                                           - If OpenCL and
+                                                                                             address space is
+                                                                                             local, omit
+                                                                                             vmcnt(0) and vscnt(0).
+                                                           - However, since LLVM           - However, since LLVM
+                                                             currently has no                currently has no
+                                                             address space on                address space on
+                                                             the fence need to               the fence need to
+                                                             conservatively                  conservatively
+                                                             always generate                 always generate
+                                                             (see comment for                (see comment for
+                                                             previous fence).                previous fence).
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0) and                    vmcnt(0), s_waitcnt
+                                                             s_waitcnt                       vscnt(0) and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - s_waitcnt vmcnt(0)            - s_waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             global/generic                  global/generic
+                                                             load/store/load                 load/load
+                                                             atomic/store                    atomic/
+                                                             atomic/atomicrmw.               atomicrmw-with-return-value.
+                                                                                           - s_waitcnt vscnt(0)
+                                                                                             must happen after
+                                                                                             any preceding
+                                                                                             global/generic
+                                                                                             store/store atomic/
+                                                                                             atomicrmw-no-return-value.
+                                                           - s_waitcnt lgkmcnt(0)          - s_waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             any preceding                   any preceding
+                                                             local/generic                   local/generic
+                                                             load/store/load                 load/store/load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Must happen before            - Must happen before
+                                                             the following                   the following
+                                                             buffer_wbinvl1_vol.             buffer_gl*_inv.
+                                                           - Ensures that the              - Ensures that the
+                                                             preceding                       preceding
+                                                             global/local/generic            global/local/generic
+                                                             load                            load
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with an equal or                with an equal or
+                                                             wider sync scope                wider sync scope
+                                                             and memory ordering             and memory ordering
+                                                             stronger than                   stronger than
+                                                             unordered (this is              unordered (this is
+                                                             termed the                      termed the
+                                                             acquire-fence-paired-atomic     acquire-fence-paired-atomic
+                                                             ) has completed                 ) has completed
+                                                             before invalidating             before invalidating
+                                                             the cache. This                 the caches. This
+                                                             satisfies the                   satisfies the
+                                                             requirements of                 requirements of
+                                                             acquire.                        acquire.
+                                                           - Ensures that all              - Ensures that all
+                                                             previous memory                 previous memory
+                                                             operations have                 operations have
+                                                             completed before a              completed before a
+                                                             following                       following
+                                                             global/local/generic            global/local/generic
+                                                             store                           store
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with an equal or                with an equal or
+                                                             wider sync scope                wider sync scope
+                                                             and memory ordering             and memory ordering
+                                                             stronger than                   stronger than
+                                                             unordered (this is              unordered (this is
+                                                             termed the                      termed the
+                                                             release-fence-paired-atomic     release-fence-paired-atomic
+                                                             ). This satisfies the           ). This satisfies the
+                                                             requirements of                 requirements of
+                                                             release.                        release.
+
+                                                         2. buffer_wbinvl1_vol           2. buffer_gl0_inv;
+                                                                                            buffer_gl1_inv
+
+                                                           - Must happen before            - Must happen before
+                                                             any following                   any following
+                                                             global/generic                  global/generic
+                                                             load/load                       load/load
+                                                             atomic/store/store              atomic/store/store
+                                                             atomic/atomicrmw.               atomic/atomicrmw.
+                                                           - Ensures that                  - Ensures that
+                                                             following loads                 following loads
+                                                             will not see stale              will not see stale
+                                                             global data. This               global data. This
+                                                             satisfies the                   satisfies the
+                                                             requirements of                 requirements of
+                                                             acquire.                        acquire.
+
+     **Sequential Consistent Atomic**
+     ----------------------------------------------------------------------------------------------------------------------
+     load atomic  seq_cst      - singlethread - global   *Same as corresponding          *Same as corresponding
+                               - wavefront    - local    load atomic acquire,            load atomic acquire,
+                                              - generic  except must generated           except must generated
+                                                         all instructions even           all instructions even
+                                                         for OpenCL.*                    for OpenCL.*
+     load atomic  seq_cst      - workgroup    - global   1. s_waitcnt lgkmcnt(0)         1. s_waitcnt lgkmcnt(0) &
+                                              - generic                                     vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit vmcnt and
+                                                                                             vscnt.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0), s_waitcnt
+                                                                                             vscnt(0) and s_waitcnt
+                                                                                             lgkmcnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                           - Must                          - waitcnt lgkmcnt(0) must
+                                                             happen after                    happen after
+                                                             preceding                       preceding
+                                                             global/generic load             local load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with memory                     with memory
+                                                             ordering of seq_cst             ordering of seq_cst
+                                                             and with equal or               and with equal or
+                                                             wider sync scope.               wider sync scope.
+                                                             (Note that seq_cst              (Note that seq_cst
+                                                             fences have their               fences have their
+                                                             own s_waitcnt                   own s_waitcnt
+                                                             lgkmcnt(0) and so do            lgkmcnt(0) and so do
+                                                             not need to be                  not need to be
+                                                             considered.)                    considered.)
+                                                                                           - waitcnt vmcnt(0)
+                                                                                             Must happen after
+                                                                                             preceding
+                                                                                             global/generic load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value
+                                                                                             with memory
+                                                                                             ordering of seq_cst
+                                                                                             and with equal or
+                                                                                             wider sync scope.
+                                                                                             (Note that seq_cst
+                                                                                             fences have their
+                                                                                             own s_waitcnt
+                                                                                             vmcnt(0) and so do
+                                                                                             not need to be
+                                                                                             considered.)
+                                                                                           - waitcnt vscnt(0)
+                                                                                             Must happen after
+                                                                                             preceding
+                                                                                             global/generic store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value
+                                                                                             with memory
+                                                                                             ordering of seq_cst
+                                                                                             and with equal or
+                                                                                             wider sync scope.
+                                                                                             (Note that seq_cst
+                                                                                             fences have their
+                                                                                             own s_waitcnt
+                                                                                             vscnt(0) and so do
+                                                                                             not need to be
+                                                                                             considered.)
+                                                           - Ensures any                   - Ensures any
+                                                             preceding                       preceding
+                                                             sequential                      sequential
+                                                             consistent local                consistent global/local
+                                                             memory instructions             memory instructions
+                                                             have completed                  have completed
+                                                             before executing                before executing
+                                                             this sequentially               this sequentially
+                                                             consistent                      consistent
+                                                             instruction. This               instruction. This
+                                                             prevents reordering             prevents reordering
+                                                             a seq_cst store                 a seq_cst store
+                                                             followed by a                   followed by a
+                                                             seq_cst load. (Note             seq_cst load. (Note
+                                                             that seq_cst is                 that seq_cst is
+                                                             stronger than                   stronger than
+                                                             acquire/release as              acquire/release as
+                                                             the reordering of               the reordering of
+                                                             load acquire                    load acquire
+                                                             followed by a store             followed by a store
+                                                             release is                      release is
+                                                             prevented by the                prevented by the
+                                                             waitcnt of                      waitcnt of
+                                                             the release, but                the release, but
+                                                             there is nothing                there is nothing
+                                                             preventing a store              preventing a store
+                                                             release followed by             release followed by
+                                                             load acquire from               load acquire from
+                                                             competing out of                competing out of
+                                                             order.)                         order.)
+
+                                                         2. *Following                   2. *Following
+                                                            instructions same as            instructions same as
+                                                            corresponding load              corresponding load
+                                                            atomic acquire,                 atomic acquire,
+                                                            except must generated           except must generated
+                                                            all instructions even           all instructions even
+                                                            for OpenCL.*                    for OpenCL.*
+     load atomic  seq_cst      - workgroup    - local    *Same as corresponding
+                                                         load atomic acquire,
+                                                         except must generated
+                                                         all instructions even
+                                                         for OpenCL.*
+
+                                                                                         1. s_waitcnt vmcnt(0) & vscnt(0)
+
+                                                                                           - If CU wavefront execution mode, omit.
+                                                                                           - Could be split into
+                                                                                             separate s_waitcnt
+                                                                                             vmcnt(0) and s_waitcnt
+                                                                                             vscnt(0) to allow
+                                                                                             them to be
+                                                                                             independently moved
+                                                                                             according to the
+                                                                                             following rules.
+                                                                                           - waitcnt vmcnt(0)
+                                                                                             Must happen after
+                                                                                             preceding
+                                                                                             global/generic load
+                                                                                             atomic/
+                                                                                             atomicrmw-with-return-value
+                                                                                             with memory
+                                                                                             ordering of seq_cst
+                                                                                             and with equal or
+                                                                                             wider sync scope.
+                                                                                             (Note that seq_cst
+                                                                                             fences have their
+                                                                                             own s_waitcnt
+                                                                                             vmcnt(0) and so do
+                                                                                             not need to be
+                                                                                             considered.)
+                                                                                           - waitcnt vscnt(0)
+                                                                                             Must happen after
+                                                                                             preceding
+                                                                                             global/generic store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value
+                                                                                             with memory
+                                                                                             ordering of seq_cst
+                                                                                             and with equal or
+                                                                                             wider sync scope.
+                                                                                             (Note that seq_cst
+                                                                                             fences have their
+                                                                                             own s_waitcnt
+                                                                                             vscnt(0) and so do
+                                                                                             not need to be
+                                                                                             considered.)
+                                                                                           - Ensures any
+                                                                                             preceding
+                                                                                             sequential
+                                                                                             consistent global
+                                                                                             memory instructions
+                                                                                             have completed
+                                                                                             before executing
+                                                                                             this sequentially
+                                                                                             consistent
+                                                                                             instruction. This
+                                                                                             prevents reordering
+                                                                                             a seq_cst store
+                                                                                             followed by a
+                                                                                             seq_cst load. (Note
+                                                                                             that seq_cst is
+                                                                                             stronger than
+                                                                                             acquire/release as
+                                                                                             the reordering of
+                                                                                             load acquire
+                                                                                             followed by a store
+                                                                                             release is
+                                                                                             prevented by the
+                                                                                             waitcnt of
+                                                                                             the release, but
+                                                                                             there is nothing
+                                                                                             preventing a store
+                                                                                             release followed by
+                                                                                             load acquire from
+                                                                                             competing out of
+                                                                                             order.)
+
+                                                                                         2. *Following
+                                                                                            instructions same as
+                                                                                            corresponding load
+                                                                                            atomic acquire,
+                                                                                            except must generated
+                                                                                            all instructions even
+                                                                                            for OpenCL.*
+
+     load atomic  seq_cst      - agent        - global   1. s_waitcnt lgkmcnt(0) &       1. s_waitcnt lgkmcnt(0) &
+                               - system       - generic     vmcnt(0)                        vmcnt(0) & vscnt(0)
+
+                                                           - Could be split into           - Could be split into
+                                                             separate s_waitcnt              separate s_waitcnt
+                                                             vmcnt(0)                        vmcnt(0), s_waitcnt
+                                                             and s_waitcnt                   vscnt(0) and s_waitcnt
+                                                             lgkmcnt(0) to allow             lgkmcnt(0) to allow
+                                                             them to be                      them to be
+                                                             independently moved             independently moved
+                                                             according to the                according to the
+                                                             following rules.                following rules.
+                                                           - waitcnt lgkmcnt(0)            - waitcnt lgkmcnt(0)
+                                                             must happen after               must happen after
+                                                             preceding                       preceding
+                                                             global/generic load             local load
+                                                             atomic/store                    atomic/store
+                                                             atomic/atomicrmw                atomic/atomicrmw
+                                                             with memory                     with memory
+                                                             ordering of seq_cst             ordering of seq_cst
+                                                             and with equal or               and with equal or
+                                                             wider sync scope.               wider sync scope.
+                                                             (Note that seq_cst              (Note that seq_cst
+                                                             fences have their               fences have their
+                                                             own s_waitcnt                   own s_waitcnt
+                                                             lgkmcnt(0) and so do            lgkmcnt(0) and so do
+                                                             not need to be                  not need to be
+                                                             considered.)                    considered.)
+                                                           - waitcnt vmcnt(0)              - waitcnt vmcnt(0)
+                                                             must happen after               must happen after
+                                                             preceding                       preceding
+                                                             global/generic load             global/generic load
+                                                             atomic/store                    atomic/
+                                                             atomic/atomicrmw                atomicrmw-with-return-value
+                                                             with memory                     with memory
+                                                             ordering of seq_cst             ordering of seq_cst
+                                                             and with equal or               and with equal or
+                                                             wider sync scope.               wider sync scope.
+                                                             (Note that seq_cst              (Note that seq_cst
+                                                             fences have their               fences have their
+                                                             own s_waitcnt                   own s_waitcnt
+                                                             vmcnt(0) and so do              vmcnt(0) and so do
+                                                             not need to be                  not need to be
+                                                             considered.)                    considered.)
+                                                                                           - waitcnt vscnt(0)
+                                                                                             Must happen after
+                                                                                             preceding
+                                                                                             global/generic store
+                                                                                             atomic/
+                                                                                             atomicrmw-no-return-value
+                                                                                             with memory
+                                                                                             ordering of seq_cst
+                                                                                             and with equal or
+                                                                                             wider sync scope.
+                                                                                             (Note that seq_cst
+                                                                                             fences have their
+                                                                                             own s_waitcnt
+                                                                                             vscnt(0) and so do
+                                                                                             not need to be
+                                                                                             considered.)
+                                                           - Ensures any                   - Ensures any
+                                                             preceding                       preceding
+                                                             sequential                      sequential
+                                                             consistent global               consistent global
+                                                             memory instructions             memory instructions
+                                                             have completed                  have completed
+                                                             before executing                before executing
+                                                             this sequentially               this sequentially
+                                                             consistent                      consistent
+                                                             instruction. This               instruction. This
+                                                             prevents reordering             prevents reordering
+                                                             a seq_cst store                 a seq_cst store
+                                                             followed by a                   followed by a
+                                                             seq_cst load. (Note             seq_cst load. (Note
+                                                             that seq_cst is                 that seq_cst is
+                                                             stronger than                   stronger than
+                                                             acquire/release as              acquire/release as
+                                                             the reordering of               the reordering of
+                                                             load acquire                    load acquire
+                                                             followed by a store             followed by a store
+                                                             release is                      release is
+                                                             prevented by the                prevented by the
+                                                             waitcnt of                      waitcnt of
+                                                             the release, but                the release, but
+                                                             there is nothing                there is nothing
+                                                             preventing a store              preventing a store
+                                                             release followed by             release followed by
+                                                             load acquire from               load acquire from
+                                                             competing out of                competing out of
+                                                             order.)                         order.)
+
+                                                         2. *Following                   2. *Following
+                                                            instructions same as            instructions same as
+                                                            corresponding load              corresponding load
+                                                            atomic acquire,                 atomic acquire,
+                                                            except must generated           except must generated
+                                                            all instructions even           all instructions even
+                                                            for OpenCL.*                    for OpenCL.*
+     store atomic seq_cst      - singlethread - global   *Same as corresponding          *Same as corresponding
+                               - wavefront    - local    store atomic release,           store atomic release,
+                               - workgroup    - generic  except must generated           except must generated
+                                                         all instructions even           all instructions even
+                                                         for OpenCL.*                    for OpenCL.*
+     store atomic seq_cst      - agent        - global   *Same as corresponding          *Same as corresponding
+                               - system       - generic  store atomic release,           store atomic release,
+                                                         except must generated           except must generated
+                                                         all instructions even           all instructions even
+                                                         for OpenCL.*                    for OpenCL.*
+     atomicrmw    seq_cst      - singlethread - global   *Same as corresponding          *Same as corresponding
+                               - wavefront    - local    atomicrmw acq_rel,              atomicrmw acq_rel,
+                               - workgroup    - generic  except must generated           except must generated
+                                                         all instructions even           all instructions even
+                                                         for OpenCL.*                    for OpenCL.*
+     atomicrmw    seq_cst      - agent        - global   *Same as corresponding          *Same as corresponding
+                               - system       - generic  atomicrmw acq_rel,              atomicrmw acq_rel,
+                                                         except must generated           except must generated
+                                                         all instructions even           all instructions even
+                                                         for OpenCL.*                    for OpenCL.*
+     fence        seq_cst      - singlethread *none*     *Same as corresponding          *Same as corresponding
+                               - wavefront               fence acq_rel,                  fence acq_rel,
+                               - workgroup               except must generated           except must generated
+                               - agent                   all instructions even           all instructions even
+                               - system                  for OpenCL.*                    for OpenCL.*
+     ============ ============ ============== ========== =============================== ==================================
+
+The memory order also adds the single thread optimization constrains defined in
+table
+:ref:`amdgpu-amdhsa-memory-model-single-thread-optimization-constraints-gfx6-gfx10-table`.
+
+  .. table:: AMDHSA Memory Model Single Thread Optimization Constraints GFX6-GFX10
+     :name: amdgpu-amdhsa-memory-model-single-thread-optimization-constraints-gfx6-gfx10-table
+
+     ============ ==============================================================
+     LLVM Memory  Optimization Constraints
+     Ordering
+     ============ ==============================================================
+     unordered    *none*
+     monotonic    *none*
+     acquire      - If a load atomic/atomicrmw then no following load/load
+                    atomic/store/ store atomic/atomicrmw/fence instruction can
+                    be moved before the acquire.
+                  - If a fence then same as load atomic, plus no preceding
+                    associated fence-paired-atomic can be moved after the fence.
+     release      - If a store atomic/atomicrmw then no preceding load/load
+                    atomic/store/ store atomic/atomicrmw/fence instruction can
+                    be moved after the release.
+                  - If a fence then same as store atomic, plus no following
+                    associated fence-paired-atomic can be moved before the
+                    fence.
+     acq_rel      Same constraints as both acquire and release.
+     seq_cst      - If a load atomic then same constraints as acquire, plus no
+                    preceding sequentially consistent load atomic/store
+                    atomic/atomicrmw/fence instruction can be moved after the
+                    seq_cst.
+                  - If a store atomic then the same constraints as release, plus
+                    no following sequentially consistent load atomic/store
+                    atomic/atomicrmw/fence instruction can be moved before the
+                    seq_cst.
+                  - If an atomicrmw/fence then same constraints as acq_rel.
+     ============ ==============================================================
+
+Trap Handler ABI
+~~~~~~~~~~~~~~~~
+
+For code objects generated by AMDGPU backend for HSA [HSA]_ compatible runtimes
+(such as ROCm [AMD-ROCm]_), the runtime installs a trap handler that supports
+the ``s_trap`` instruction with the following usage:
+
+  .. table:: AMDGPU Trap Handler for AMDHSA OS
+     :name: amdgpu-trap-handler-for-amdhsa-os-table
+
+     =================== =============== =============== =======================
+     Usage               Code Sequence   Trap Handler    Description
+                                         Inputs
+     =================== =============== =============== =======================
+     reserved            ``s_trap 0x00``                 Reserved by hardware.
+     ``debugtrap(arg)``  ``s_trap 0x01`` ``SGPR0-1``:    Reserved for HSA
+                                           ``queue_ptr`` ``debugtrap``
+                                         ``VGPR0``:      intrinsic (not
+                                           ``arg``       implemented).
+     ``llvm.trap``       ``s_trap 0x02`` ``SGPR0-1``:    Causes dispatch to be
+                                           ``queue_ptr`` terminated and its
+                                                         associated queue put
+                                                         into the error state.
+     ``llvm.debugtrap``  ``s_trap 0x03``                 - If debugger not
+                                                           installed then
+                                                           behaves as a
+                                                           no-operation. The
+                                                           trap handler is
+                                                           entered and
+                                                           immediately returns
+                                                           to continue
+                                                           execution of the
+                                                           wavefront.
+                                                         - If the debugger is
+                                                           installed, causes
+                                                           the debug trap to be
+                                                           reported by the
+                                                           debugger and the
+                                                           wavefront is put in
+                                                           the halt state until
+                                                           resumed by the
+                                                           debugger.
+     reserved            ``s_trap 0x04``                 Reserved.
+     reserved            ``s_trap 0x05``                 Reserved.
+     reserved            ``s_trap 0x06``                 Reserved.
+     debugger breakpoint ``s_trap 0x07``                 Reserved for debugger
+                                                         breakpoints.
+     reserved            ``s_trap 0x08``                 Reserved.
+     reserved            ``s_trap 0xfe``                 Reserved.
+     reserved            ``s_trap 0xff``                 Reserved.
+     =================== =============== =============== =======================
+
+AMDPAL
+------
+
+This section provides code conventions used when the target triple OS is
+``amdpal`` (see :ref:`amdgpu-target-triples`) for passing runtime parameters
+from the application/runtime to each invocation of a hardware shader. These
+parameters include both generic, application-controlled parameters called
+*user data* as well as system-generated parameters that are a product of the
+draw or dispatch execution.
+
+User Data
+~~~~~~~~~
+
+Each hardware stage has a set of 32-bit *user data registers* which can be
+written from a command buffer and then loaded into SGPRs when waves are launched
+via a subsequent dispatch or draw operation. This is the way most arguments are
+passed from the application/runtime to a hardware shader.
+
+Compute User Data
+~~~~~~~~~~~~~~~~~
+
+Compute shader user data mappings are simpler than graphics shaders, and have a
+fixed mapping.
+
+Note that there are always 10 available *user data entries* in registers -
+entries beyond that limit must be fetched from memory (via the spill table
+pointer) by the shader.
+
+  .. table:: PAL Compute Shader User Data Registers
+     :name: pal-compute-user-data-registers
+
+     ============= ================================
+     User Register Description
+     ============= ================================
+     0             Global Internal Table (32-bit pointer)
+     1             Per-Shader Internal Table (32-bit pointer)
+     2 - 11        Application-Controlled User Data (10 32-bit values)
+     12            Spill Table (32-bit pointer)
+     13 - 14       Thread Group Count (64-bit pointer)
+     15            GDS Range
+     ============= ================================
+
+Graphics User Data
+~~~~~~~~~~~~~~~~~~
+
+Graphics pipelines support a much more flexible user data mapping:
+
+  .. table:: PAL Graphics Shader User Data Registers
+     :name: pal-graphics-user-data-registers
+
+     ============= ================================
+     User Register Description
+     ============= ================================
+     0             Global Internal Table (32-bit pointer)
+     +             Per-Shader Internal Table (32-bit pointer)
+     + 1-15        Application Controlled User Data
+                   (1-15 Contiguous 32-bit Values in Registers)
+     +             Spill Table (32-bit pointer)
+     +             Draw Index (First Stage Only)
+     +             Vertex Offset (First Stage Only)
+     +             Instance Offset (First Stage Only)
+     ============= ================================
+
+  The placement of the global internal table remains fixed in the first *user
+  data SGPR register*. Otherwise all parameters are optional, and can be mapped
+  to any desired *user data SGPR register*, with the following regstrictions:
+
+  * Draw Index, Vertex Offset, and Instance Offset can only be used by the first
+    activehardware stage in a graphics pipeline (i.e. where the API vertex
+    shader runs).
+
+  * Application-controlled user data must be mapped into a contiguous range of
+    user data registers.
+
+  * The application-controlled user data range supports compaction remapping, so
+    only *entries* that are actually consumed by the shader must be assigned to
+    corresponding *registers*. Note that in order to support an efficient runtime
+    implementation, the remapping must pack *registers* in the same order as
+    *entries*, with unused *entries* removed.
+
+.. _pal_global_internal_table:
+
+Global Internal Table
+~~~~~~~~~~~~~~~~~~~~~
+
+The global internal table is a table of *shader resource descriptors* (SRDs) that
+define how certain engine-wide, runtime-managed resources should be accessed
+from a shader. The majority of these resources have HW-defined formats, and it
+is up to the compiler to write/read data as required by the target hardware.
+
+The following table illustrates the required format:
+
+  .. table:: PAL Global Internal Table
+     :name: pal-git-table
+
+     ============= ================================
+     Offset        Description
+     ============= ================================
+     0-3           Graphics Scratch SRD
+     4-7           Compute Scratch SRD
+     8-11          ES/GS Ring Output SRD
+     12-15         ES/GS Ring Input SRD
+     16-19         GS/VS Ring Output #0
+     20-23         GS/VS Ring Output #1
+     24-27         GS/VS Ring Output #2
+     28-31         GS/VS Ring Output #3
+     32-35         GS/VS Ring Input SRD
+     36-39         Tessellation Factor Buffer SRD
+     40-43         Off-Chip LDS Buffer SRD
+     44-47         Off-Chip Param Cache Buffer SRD
+     48-51         Sample Position Buffer SRD
+     52            vaRange::ShadowDescriptorTable High Bits
+     ============= ================================
+
+  The pointer to the global internal table passed to the shader as user data
+  is a 32-bit pointer. The top 32 bits should be assumed to be the same as
+  the top 32 bits of the pipeline, so the shader may use the program
+  counter's top 32 bits.
+
+Unspecified OS
+--------------
+
+This section provides code conventions used when the target triple OS is
+empty (see :ref:`amdgpu-target-triples`).
+
+Trap Handler ABI
+~~~~~~~~~~~~~~~~
+
+For code objects generated by AMDGPU backend for non-amdhsa OS, the runtime does
+not install a trap handler. The ``llvm.trap`` and ``llvm.debugtrap``
+instructions are handled as follows:
+
+  .. table:: AMDGPU Trap Handler for Non-AMDHSA OS
+     :name: amdgpu-trap-handler-for-non-amdhsa-os-table
+
+     =============== =============== ===========================================
+     Usage           Code Sequence   Description
+     =============== =============== ===========================================
+     llvm.trap       s_endpgm        Causes wavefront to be terminated.
+     llvm.debugtrap  *none*          Compiler warning given that there is no
+                                     trap handler installed.
+     =============== =============== ===========================================
+
+Source Languages
+================
+
+.. _amdgpu-opencl:
+
+OpenCL
+------
+
+When the language is OpenCL the following differences occur:
+
+1. The OpenCL memory model is used (see :ref:`amdgpu-amdhsa-memory-model`).
+2. The AMDGPU backend appends additional arguments to the kernel's explicit
+   arguments for the AMDHSA OS (see
+   :ref:`opencl-kernel-implicit-arguments-appended-for-amdhsa-os-table`).
+3. Additional metadata is generated
+   (see :ref:`amdgpu-amdhsa-code-object-metadata`).
+
+  .. table:: OpenCL kernel implicit arguments appended for AMDHSA OS
+     :name: opencl-kernel-implicit-arguments-appended-for-amdhsa-os-table
+
+     ======== ==== ========= ===========================================
+     Position Byte Byte      Description
+              Size Alignment
+     ======== ==== ========= ===========================================
+     1        8    8         OpenCL Global Offset X
+     2        8    8         OpenCL Global Offset Y
+     3        8    8         OpenCL Global Offset Z
+     4        8    8         OpenCL address of printf buffer
+     5        8    8         OpenCL address of virtual queue used by
+                             enqueue_kernel.
+     6        8    8         OpenCL address of AqlWrap struct used by
+                             enqueue_kernel.
+     7        8    8         Pointer argument used for Multi-gird
+                             synchronization.
+     ======== ==== ========= ===========================================
+
+.. _amdgpu-hcc:
+
+HCC
+---
+
+When the language is HCC the following differences occur:
+
+1. The HSA memory model is used (see :ref:`amdgpu-amdhsa-memory-model`).
+
+.. _amdgpu-assembler:
+
+Assembler
+---------
+
+AMDGPU backend has LLVM-MC based assembler which is currently in development.
+It supports AMDGCN GFX6-GFX10.
+
+This section describes general syntax for instructions and operands.
+
+Instructions
+~~~~~~~~~~~~
+
+.. toctree::
+   :hidden:
+
+   AMDGPU/AMDGPUAsmGFX7
+   AMDGPU/AMDGPUAsmGFX8
+   AMDGPU/AMDGPUAsmGFX9
+   AMDGPU/AMDGPUAsmGFX10
+   AMDGPUModifierSyntax
+   AMDGPUOperandSyntax
+   AMDGPUInstructionSyntax
+   AMDGPUInstructionNotation
+
+An instruction has the following :doc:`syntax<AMDGPUInstructionSyntax>`:
+
+    ``<``\ *opcode*\ ``>    <``\ *operand0*\ ``>, <``\ *operand1*\ ``>,...    <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...``
+
+:doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated while
+:doc:`modifiers<AMDGPUModifierSyntax>` are space-separated.
+
+The order of *operands* and *modifiers* is fixed.
+Most *modifiers* are optional and may be omitted.
+
+See detailed instruction syntax description for :doc:`GFX7<AMDGPU/AMDGPUAsmGFX7>`,
+:doc:`GFX8<AMDGPU/AMDGPUAsmGFX8>`, :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>`
+and :doc:`GFX10<AMDGPU/AMDGPUAsmGFX10>`.
+
+Note that features under development are not included in this description.
+
+For more information about instructions, their semantics and supported combinations of
+operands, refer to one of instruction set architecture manuals
+[AMD-GCN-GFX6]_, [AMD-GCN-GFX7]_, [AMD-GCN-GFX8]_, [AMD-GCN-GFX9]_ and
+[AMD-GCN-GFX10]_.
+
+Operands
+~~~~~~~~
+
+Detailed description of operands may be found :doc:`here<AMDGPUOperandSyntax>`.
+
+Modifiers
+~~~~~~~~~
+
+Detailed description of modifiers may be found :doc:`here<AMDGPUModifierSyntax>`.
+
+Instruction Examples
+~~~~~~~~~~~~~~~~~~~~
+
+DS
+++
+
+.. code-block:: nasm
+
+  ds_add_u32 v2, v4 offset:16
+  ds_write_src2_b64 v2 offset0:4 offset1:8
+  ds_cmpst_f32 v2, v4, v6
+  ds_min_rtn_f64 v[8:9], v2, v[4:5]
+
+
+For full list of supported instructions, refer to "LDS/GDS instructions" in ISA Manual.
+
+FLAT
+++++
+
+.. code-block:: nasm
+
+  flat_load_dword v1, v[3:4]
+  flat_store_dwordx3 v[3:4], v[5:7]
+  flat_atomic_swap v1, v[3:4], v5 glc
+  flat_atomic_cmpswap v1, v[3:4], v[5:6] glc slc
+  flat_atomic_fmax_x2 v[1:2], v[3:4], v[5:6] glc
+
+For full list of supported instructions, refer to "FLAT instructions" in ISA Manual.
+
+MUBUF
++++++
+
+.. code-block:: nasm
+
+  buffer_load_dword v1, off, s[4:7], s1
+  buffer_store_dwordx4 v[1:4], v2, ttmp[4:7], s1 offen offset:4 glc tfe
+  buffer_store_format_xy v[1:2], off, s[4:7], s1
+  buffer_wbinvl1
+  buffer_atomic_inc v1, v2, s[8:11], s4 idxen offset:4 slc
+
+For full list of supported instructions, refer to "MUBUF Instructions" in ISA Manual.
+
+SMRD/SMEM
++++++++++
+
+.. code-block:: nasm
+
+  s_load_dword s1, s[2:3], 0xfc
+  s_load_dwordx8 s[8:15], s[2:3], s4
+  s_load_dwordx16 s[88:103], s[2:3], s4
+  s_dcache_inv_vol
+  s_memtime s[4:5]
+
+For full list of supported instructions, refer to "Scalar Memory Operations" in ISA Manual.
+
+SOP1
+++++
+
+.. code-block:: nasm
+
+  s_mov_b32 s1, s2
+  s_mov_b64 s[0:1], 0x80000000
+  s_cmov_b32 s1, 200
+  s_wqm_b64 s[2:3], s[4:5]
+  s_bcnt0_i32_b64 s1, s[2:3]
+  s_swappc_b64 s[2:3], s[4:5]
+  s_cbranch_join s[4:5]
+
+For full list of supported instructions, refer to "SOP1 Instructions" in ISA Manual.
+
+SOP2
+++++
+
+.. code-block:: nasm
+
+  s_add_u32 s1, s2, s3
+  s_and_b64 s[2:3], s[4:5], s[6:7]
+  s_cselect_b32 s1, s2, s3
+  s_andn2_b32 s2, s4, s6
+  s_lshr_b64 s[2:3], s[4:5], s6
+  s_ashr_i32 s2, s4, s6
+  s_bfm_b64 s[2:3], s4, s6
+  s_bfe_i64 s[2:3], s[4:5], s6
+  s_cbranch_g_fork s[4:5], s[6:7]
+
+For full list of supported instructions, refer to "SOP2 Instructions" in ISA Manual.
+
+SOPC
+++++
+
+.. code-block:: nasm
+
+  s_cmp_eq_i32 s1, s2
+  s_bitcmp1_b32 s1, s2
+  s_bitcmp0_b64 s[2:3], s4
+  s_setvskip s3, s5
+
+For full list of supported instructions, refer to "SOPC Instructions" in ISA Manual.
+
+SOPP
+++++
+
+.. code-block:: nasm
+
+  s_barrier
+  s_nop 2
+  s_endpgm
+  s_waitcnt 0 ; Wait for all counters to be 0
+  s_waitcnt vmcnt(0) & expcnt(0) & lgkmcnt(0) ; Equivalent to above
+  s_waitcnt vmcnt(1) ; Wait for vmcnt counter to be 1.
+  s_sethalt 9
+  s_sleep 10
+  s_sendmsg 0x1
+  s_sendmsg sendmsg(MSG_INTERRUPT)
+  s_trap 1
+
+For full list of supported instructions, refer to "SOPP Instructions" in ISA Manual.
+
+Unless otherwise mentioned, little verification is performed on the operands
+of SOPP Instructions, so it is up to the programmer to be familiar with the
+range or acceptable values.
+
+VALU
+++++
+
+For vector ALU instruction opcodes (VOP1, VOP2, VOP3, VOPC, VOP_DPP, VOP_SDWA),
+the assembler will automatically use optimal encoding based on its operands.
+To force specific encoding, one can add a suffix to the opcode of the instruction:
+
+* _e32 for 32-bit VOP1/VOP2/VOPC
+* _e64 for 64-bit VOP3
+* _dpp for VOP_DPP
+* _sdwa for VOP_SDWA
+
+VOP1/VOP2/VOP3/VOPC examples:
+
+.. code-block:: nasm
+
+  v_mov_b32 v1, v2
+  v_mov_b32_e32 v1, v2
+  v_nop
+  v_cvt_f64_i32_e32 v[1:2], v2
+  v_floor_f32_e32 v1, v2
+  v_bfrev_b32_e32 v1, v2
+  v_add_f32_e32 v1, v2, v3
+  v_mul_i32_i24_e64 v1, v2, 3
+  v_mul_i32_i24_e32 v1, -3, v3
+  v_mul_i32_i24_e32 v1, -100, v3
+  v_addc_u32 v1, s[0:1], v2, v3, s[2:3]
+  v_max_f16_e32 v1, v2, v3
+
+VOP_DPP examples:
+
+.. code-block:: nasm
+
+  v_mov_b32 v0, v0 quad_perm:[0,2,1,1]
+  v_sin_f32 v0, v0 row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
+  v_mov_b32 v0, v0 wave_shl:1
+  v_mov_b32 v0, v0 row_mirror
+  v_mov_b32 v0, v0 row_bcast:31
+  v_mov_b32 v0, v0 quad_perm:[1,3,0,1] row_mask:0xa bank_mask:0x1 bound_ctrl:0
+  v_add_f32 v0, v0, |v0| row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
+  v_max_f16 v1, v2, v3 row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
+
+VOP_SDWA examples:
+
+.. code-block:: nasm
+
+  v_mov_b32 v1, v2 dst_sel:BYTE_0 dst_unused:UNUSED_PRESERVE src0_sel:DWORD
+  v_min_u32 v200, v200, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD
+  v_sin_f32 v0, v0 dst_unused:UNUSED_PAD src0_sel:WORD_1
+  v_fract_f32 v0, |v0| dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1
+  v_cmpx_le_u32 vcc, v1, v2 src0_sel:BYTE_2 src1_sel:WORD_0
+
+For full list of supported instructions, refer to "Vector ALU instructions".
+
+.. TODO
+   Remove once we switch to code object v3 by default.
+
+.. _amdgpu-amdhsa-assembler-predefined-symbols-v2:
+
+Code Object V2 Predefined Symbols (-mattr=-code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. warning:: Code Object V2 is not the default code object version emitted by
+  this version of LLVM. For a description of the predefined symbols available
+  with the default configuration (Code Object V3) see
+  :ref:`amdgpu-amdhsa-assembler-predefined-symbols-v3`.
+
+The AMDGPU assembler defines and updates some symbols automatically. These
+symbols do not affect code generation.
+
+.option.machine_version_major
++++++++++++++++++++++++++++++
+
+Set to the GFX major generation number of the target being assembled for. For
+example, when assembling for a "GFX9" target this will be set to the integer
+value "9". The possible GFX major generation numbers are presented in
+:ref:`amdgpu-processors`.
+
+.option.machine_version_minor
++++++++++++++++++++++++++++++
+
+Set to the GFX minor generation number of the target being assembled for. For
+example, when assembling for a "GFX810" target this will be set to the integer
+value "1". The possible GFX minor generation numbers are presented in
+:ref:`amdgpu-processors`.
+
+.option.machine_version_stepping
+++++++++++++++++++++++++++++++++
+
+Set to the GFX stepping generation number of the target being assembled for.
+For example, when assembling for a "GFX704" target this will be set to the
+integer value "4". The possible GFX stepping generation numbers are presented
+in :ref:`amdgpu-processors`.
+
+.kernel.vgpr_count
+++++++++++++++++++
+
+Set to zero each time a
+:ref:`amdgpu-amdhsa-assembler-directive-amdgpu_hsa_kernel` directive is
+encountered. At each instruction, if the current value of this symbol is less
+than or equal to the maximum VPGR number explicitly referenced within that
+instruction then the symbol value is updated to equal that VGPR number plus
+one.
+
+.kernel.sgpr_count
+++++++++++++++++++
+
+Set to zero each time a
+:ref:`amdgpu-amdhsa-assembler-directive-amdgpu_hsa_kernel` directive is
+encountered. At each instruction, if the current value of this symbol is less
+than or equal to the maximum VPGR number explicitly referenced within that
+instruction then the symbol value is updated to equal that SGPR number plus
+one.
+
+.. _amdgpu-amdhsa-assembler-directives-v2:
+
+Code Object V2 Directives (-mattr=-code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. warning:: Code Object V2 is not the default code object version emitted by
+  this version of LLVM. For a description of the directives supported with
+  the default configuration (Code Object V3) see
+  :ref:`amdgpu-amdhsa-assembler-directives-v3`.
+
+AMDGPU ABI defines auxiliary data in output code object. In assembly source,
+one can specify them with assembler directives.
+
+.hsa_code_object_version major, minor
++++++++++++++++++++++++++++++++++++++
+
+*major* and *minor* are integers that specify the version of the HSA code
+object that will be generated by the assembler.
+
+.hsa_code_object_isa [major, minor, stepping, vendor, arch]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+
+*major*, *minor*, and *stepping* are all integers that describe the instruction
+set architecture (ISA) version of the assembly program.
+
+*vendor* and *arch* are quoted strings.  *vendor* should always be equal to
+"AMD" and *arch* should always be equal to "AMDGPU".
+
+By default, the assembler will derive the ISA version, *vendor*, and *arch*
+from the value of the -mcpu option that is passed to the assembler.
+
+.. _amdgpu-amdhsa-assembler-directive-amdgpu_hsa_kernel:
+
+.amdgpu_hsa_kernel (name)
++++++++++++++++++++++++++
+
+This directives specifies that the symbol with given name is a kernel entry point
+(label) and the object should contain corresponding symbol of type STT_AMDGPU_HSA_KERNEL.
+
+.amd_kernel_code_t
+++++++++++++++++++
+
+This directive marks the beginning of a list of key / value pairs that are used
+to specify the amd_kernel_code_t object that will be emitted by the assembler.
+The list must be terminated by the *.end_amd_kernel_code_t* directive.  For
+any amd_kernel_code_t values that are unspecified a default value will be
+used.  The default value for all keys is 0, with the following exceptions:
+
+- *amd_code_version_major* defaults to 1.
+- *amd_kernel_code_version_minor* defaults to 2.
+- *amd_machine_kind* defaults to 1.
+- *amd_machine_version_major*, *machine_version_minor*, and
+  *amd_machine_version_stepping* are derived from the value of the -mcpu option
+  that is passed to the assembler.
+- *kernel_code_entry_byte_offset* defaults to 256.
+- *wavefront_size* defaults 6 for all targets before GFX10. For GFX10 onwards
+  defaults to 6 if target feature ``wavefrontsize64`` is enabled, otherwise 5.
+  Note that wavefront size is specified as a power of two, so a value of **n**
+  means a size of 2^ **n**.
+- *call_convention* defaults to -1.
+- *kernarg_segment_alignment*, *group_segment_alignment*, and
+  *private_segment_alignment* default to 4. Note that alignments are specified
+  as a power of 2, so a value of **n** means an alignment of 2^ **n**.
+- *enable_wgp_mode* defaults to 1 if target feature ``cumode`` is disabled for
+  GFX10 onwards.
+- *enable_mem_ordered* defaults to 1 for GFX10 onwards.
+
+The *.amd_kernel_code_t* directive must be placed immediately after the
+function label and before any instructions.
+
+For a full list of amd_kernel_code_t keys, refer to AMDGPU ABI document,
+comments in lib/Target/AMDGPU/AmdKernelCodeT.h and test/CodeGen/AMDGPU/hsa.s.
+
+.. _amdgpu-amdhsa-assembler-example-v2:
+
+Code Object V2 Example Source Code (-mattr=-code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. warning:: Code Object V2 is not the default code object version emitted by
+  this version of LLVM. For a description of the directives supported with
+  the default configuration (Code Object V3) see
+  :ref:`amdgpu-amdhsa-assembler-example-v3`.
+
+Here is an example of a minimal assembly source file, defining one HSA kernel:
+
+.. code-block:: none
+
+   .hsa_code_object_version 1,0
+   .hsa_code_object_isa
+
+   .hsatext
+   .globl  hello_world
+   .p2align 8
+   .amdgpu_hsa_kernel hello_world
+
+   hello_world:
+
+      .amd_kernel_code_t
+         enable_sgpr_kernarg_segment_ptr = 1
+         is_ptr64 = 1
+         compute_pgm_rsrc1_vgprs = 0
+         compute_pgm_rsrc1_sgprs = 0
+         compute_pgm_rsrc2_user_sgpr = 2
+         compute_pgm_rsrc1_wgp_mode = 0
+         compute_pgm_rsrc1_mem_ordered = 0
+         compute_pgm_rsrc1_fwd_progress = 1
+     .end_amd_kernel_code_t
+
+     s_load_dwordx2 s[0:1], s[0:1] 0x0
+     v_mov_b32 v0, 3.14159
+     s_waitcnt lgkmcnt(0)
+     v_mov_b32 v1, s0
+     v_mov_b32 v2, s1
+     flat_store_dword v[1:2], v0
+     s_endpgm
+   .Lfunc_end0:
+        .size   hello_world, .Lfunc_end0-hello_world
+
+.. _amdgpu-amdhsa-assembler-predefined-symbols-v3:
+
+Code Object V3 Predefined Symbols (-mattr=+code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The AMDGPU assembler defines and updates some symbols automatically. These
+symbols do not affect code generation.
+
+.amdgcn.gfx_generation_number
++++++++++++++++++++++++++++++
+
+Set to the GFX major generation number of the target being assembled for. For
+example, when assembling for a "GFX9" target this will be set to the integer
+value "9". The possible GFX major generation numbers are presented in
+:ref:`amdgpu-processors`.
+
+.amdgcn.gfx_generation_minor
+++++++++++++++++++++++++++++
+
+Set to the GFX minor generation number of the target being assembled for. For
+example, when assembling for a "GFX810" target this will be set to the integer
+value "1". The possible GFX minor generation numbers are presented in
+:ref:`amdgpu-processors`.
+
+.amdgcn.gfx_generation_stepping
++++++++++++++++++++++++++++++++
+
+Set to the GFX stepping generation number of the target being assembled for.
+For example, when assembling for a "GFX704" target this will be set to the
+integer value "4". The possible GFX stepping generation numbers are presented
+in :ref:`amdgpu-processors`.
+
+.. _amdgpu-amdhsa-assembler-symbol-next_free_vgpr:
+
+.amdgcn.next_free_vgpr
+++++++++++++++++++++++
+
+Set to zero before assembly begins. At each instruction, if the current value
+of this symbol is less than or equal to the maximum VGPR number explicitly
+referenced within that instruction then the symbol value is updated to equal
+that VGPR number plus one.
+
+May be used to set the `.amdhsa_next_free_vpgr` directive in
+:ref:`amdhsa-kernel-directives-table`.
+
+May be set at any time, e.g. manually set to zero at the start of each kernel.
+
+.. _amdgpu-amdhsa-assembler-symbol-next_free_sgpr:
+
+.amdgcn.next_free_sgpr
+++++++++++++++++++++++
+
+Set to zero before assembly begins. At each instruction, if the current value
+of this symbol is less than or equal the maximum SGPR number explicitly
+referenced within that instruction then the symbol value is updated to equal
+that SGPR number plus one.
+
+May be used to set the `.amdhsa_next_free_spgr` directive in
+:ref:`amdhsa-kernel-directives-table`.
+
+May be set at any time, e.g. manually set to zero at the start of each kernel.
+
+.. _amdgpu-amdhsa-assembler-directives-v3:
+
+Code Object V3 Directives (-mattr=+code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Directives which begin with ``.amdgcn`` are valid for all ``amdgcn``
+architecture processors, and are not OS-specific. Directives which begin with
+``.amdhsa`` are specific to ``amdgcn`` architecture processors when the
+``amdhsa`` OS is specified. See :ref:`amdgpu-target-triples` and
+:ref:`amdgpu-processors`.
+
+.amdgcn_target <target>
++++++++++++++++++++++++
+
+Optional directive which declares the target supported by the containing
+assembler source file. Valid values are described in
+:ref:`amdgpu-amdhsa-code-object-target-identification`. Used by the assembler
+to validate command-line options such as ``-triple``, ``-mcpu``, and those
+which specify target features.
+
+.amdhsa_kernel <name>
++++++++++++++++++++++
+
+Creates a correctly aligned AMDHSA kernel descriptor and a symbol,
+``<name>.kd``, in the current location of the current section. Only valid when
+the OS is ``amdhsa``. ``<name>`` must be a symbol that labels the first
+instruction to execute, and does not need to be previously defined.
+
+Marks the beginning of a list of directives used to generate the bytes of a
+kernel descriptor, as described in :ref:`amdgpu-amdhsa-kernel-descriptor`.
+Directives which may appear in this list are described in
+:ref:`amdhsa-kernel-directives-table`. Directives may appear in any order, must
+be valid for the target being assembled for, and cannot be repeated. Directives
+support the range of values specified by the field they reference in
+:ref:`amdgpu-amdhsa-kernel-descriptor`. If a directive is not specified, it is
+assumed to have its default value, unless it is marked as "Required", in which
+case it is an error to omit the directive. This list of directives is
+terminated by an ``.end_amdhsa_kernel`` directive.
+
+  .. table:: AMDHSA Kernel Assembler Directives
+     :name: amdhsa-kernel-directives-table
+
+     ======================================================== =================== ============ ===================
+     Directive                                                Default             Supported On Description
+     ======================================================== =================== ============ ===================
+     ``.amdhsa_group_segment_fixed_size``                     0                   GFX6-GFX10   Controls GROUP_SEGMENT_FIXED_SIZE in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_private_segment_fixed_size``                   0                   GFX6-GFX10   Controls PRIVATE_SEGMENT_FIXED_SIZE in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_private_segment_buffer``             0                   GFX6-GFX10   Controls ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_dispatch_ptr``                       0                   GFX6-GFX10   Controls ENABLE_SGPR_DISPATCH_PTR in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_queue_ptr``                          0                   GFX6-GFX10   Controls ENABLE_SGPR_QUEUE_PTR in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_kernarg_segment_ptr``                0                   GFX6-GFX10   Controls ENABLE_SGPR_KERNARG_SEGMENT_PTR in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_dispatch_id``                        0                   GFX6-GFX10   Controls ENABLE_SGPR_DISPATCH_ID in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_flat_scratch_init``                  0                   GFX6-GFX10   Controls ENABLE_SGPR_FLAT_SCRATCH_INIT in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_user_sgpr_private_segment_size``               0                   GFX6-GFX10   Controls ENABLE_SGPR_PRIVATE_SEGMENT_SIZE in
+                                                                                               :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+     ``.amdhsa_wavefront_size32``                             Target              GFX10        Controls ENABLE_WAVEFRONT_SIZE32 in
+                                                              Feature                          :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+                                                              Specific
+                                                              (-wavefrontsize64)
+     ``.amdhsa_system_sgpr_private_segment_wavefront_offset`` 0                   GFX6-GFX10   Controls ENABLE_SGPR_PRIVATE_SEGMENT_WAVEFRONT_OFFSET in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_system_sgpr_workgroup_id_x``                   1                   GFX6-GFX10   Controls ENABLE_SGPR_WORKGROUP_ID_X in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_system_sgpr_workgroup_id_y``                   0                   GFX6-GFX10   Controls ENABLE_SGPR_WORKGROUP_ID_Y in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_system_sgpr_workgroup_id_z``                   0                   GFX6-GFX10   Controls ENABLE_SGPR_WORKGROUP_ID_Z in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_system_sgpr_workgroup_info``                   0                   GFX6-GFX10   Controls ENABLE_SGPR_WORKGROUP_INFO in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_system_vgpr_workitem_id``                      0                   GFX6-GFX10   Controls ENABLE_VGPR_WORKITEM_ID in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+                                                                                               Possible values are defined in
+                                                                                               :ref:`amdgpu-amdhsa-system-vgpr-work-item-id-enumeration-values-table`.
+     ``.amdhsa_next_free_vgpr``                               Required            GFX6-GFX10   Maximum VGPR number explicitly referenced, plus one.
+                                                                                               Used to calculate GRANULATED_WORKITEM_VGPR_COUNT in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_next_free_sgpr``                               Required            GFX6-GFX10   Maximum SGPR number explicitly referenced, plus one.
+                                                                                               Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_reserve_vcc``                                  1                   GFX6-GFX10   Whether the kernel may use the special VCC SGPR.
+                                                                                               Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_reserve_flat_scratch``                         1                   GFX7-GFX10   Whether the kernel may use flat instructions to access
+                                                                                               scratch memory. Used to calculate
+                                                                                               GRANULATED_WAVEFRONT_SGPR_COUNT in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_reserve_xnack_mask``                           Target              GFX8-GFX10   Whether the kernel may trigger XNACK replay.
+                                                              Feature                          Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
+                                                              Specific                         :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+                                                              (+xnack)
+     ``.amdhsa_float_round_mode_32``                          0                   GFX6-GFX10   Controls FLOAT_ROUND_MODE_32 in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+                                                                                               Possible values are defined in
+                                                                                               :ref:`amdgpu-amdhsa-floating-point-rounding-mode-enumeration-values-table`.
+     ``.amdhsa_float_round_mode_16_64``                       0                   GFX6-GFX10   Controls FLOAT_ROUND_MODE_16_64 in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+                                                                                               Possible values are defined in
+                                                                                               :ref:`amdgpu-amdhsa-floating-point-rounding-mode-enumeration-values-table`.
+     ``.amdhsa_float_denorm_mode_32``                         0                   GFX6-GFX10   Controls FLOAT_DENORM_MODE_32 in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+                                                                                               Possible values are defined in
+                                                                                               :ref:`amdgpu-amdhsa-floating-point-denorm-mode-enumeration-values-table`.
+     ``.amdhsa_float_denorm_mode_16_64``                      3                   GFX6-GFX10   Controls FLOAT_DENORM_MODE_16_64 in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+                                                                                               Possible values are defined in
+                                                                                               :ref:`amdgpu-amdhsa-floating-point-denorm-mode-enumeration-values-table`.
+     ``.amdhsa_dx10_clamp``                                   1                   GFX6-GFX10   Controls ENABLE_DX10_CLAMP in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_ieee_mode``                                    1                   GFX6-GFX10   Controls ENABLE_IEEE_MODE in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_fp16_overflow``                                0                   GFX9-GFX10   Controls FP16_OVFL in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_workgroup_processor_mode``                     Target              GFX10        Controls ENABLE_WGP_MODE in
+                                                              Feature                          :ref:`amdgpu-amdhsa-kernel-descriptor-gfx6-gfx10-table`.
+                                                              Specific
+                                                              (-cumode)
+     ``.amdhsa_memory_ordered``                               1                   GFX10        Controls MEM_ORDERED in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_forward_progress``                             0                   GFX10        Controls FWD_PROGRESS in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx10-table`.
+     ``.amdhsa_exception_fp_ieee_invalid_op``                 0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_IEEE_754_FP_INVALID_OPERATION in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_exception_fp_denorm_src``                      0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_FP_DENORMAL_SOURCE in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_exception_fp_ieee_div_zero``                   0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_IEEE_754_FP_DIVISION_BY_ZERO in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_exception_fp_ieee_overflow``                   0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_IEEE_754_FP_OVERFLOW in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_exception_fp_ieee_underflow``                  0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_IEEE_754_FP_UNDERFLOW in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_exception_fp_ieee_inexact``                    0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_IEEE_754_FP_INEXACT in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ``.amdhsa_exception_int_div_zero``                       0                   GFX6-GFX10   Controls ENABLE_EXCEPTION_INT_DIVIDE_BY_ZERO in
+                                                                                               :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx10-table`.
+     ======================================================== =================== ============ ===================
+
+.amdgpu_metadata
+++++++++++++++++
+
+Optional directive which declares the contents of the ``NT_AMDGPU_METADATA``
+note record (see :ref:`amdgpu-elf-note-records-table-v3`).
+
+The contents must be in the [YAML]_ markup format, with the same structure and
+semantics described in :ref:`amdgpu-amdhsa-code-object-metadata-v3`.
+
+This directive is terminated by an ``.end_amdgpu_metadata`` directive.
+
+.. _amdgpu-amdhsa-assembler-example-v3:
+
+Code Object V3 Example Source Code (-mattr=+code-object-v3)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Here is an example of a minimal assembly source file, defining one HSA kernel:
+
+.. code-block:: none
+
+  .amdgcn_target "amdgcn-amd-amdhsa--gfx900+xnack" // optional
+
+  .text
+  .globl hello_world
+  .p2align 8
+  .type hello_world, at function
+  hello_world:
+    s_load_dwordx2 s[0:1], s[0:1] 0x0
+    v_mov_b32 v0, 3.14159
+    s_waitcnt lgkmcnt(0)
+    v_mov_b32 v1, s0
+    v_mov_b32 v2, s1
+    flat_store_dword v[1:2], v0
+    s_endpgm
+  .Lfunc_end0:
+    .size   hello_world, .Lfunc_end0-hello_world
+
+  .rodata
+  .p2align 6
+  .amdhsa_kernel hello_world
+    .amdhsa_user_sgpr_kernarg_segment_ptr 1
+    .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+    .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+  .end_amdhsa_kernel
+
+  .amdgpu_metadata
+  ---
+  amdhsa.version:
+    - 1
+    - 0
+  amdhsa.kernels:
+    - .name: hello_world
+      .symbol: hello_world.kd
+      .kernarg_segment_size: 48
+      .group_segment_fixed_size: 0
+      .private_segment_fixed_size: 0
+      .kernarg_segment_align: 4
+      .wavefront_size: 64
+      .sgpr_count: 2
+      .vgpr_count: 3
+      .max_flat_workgroup_size: 256
+  ...
+  .end_amdgpu_metadata
+
+If an assembly source file contains multiple kernels and/or functions, the
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_vgpr` and
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_sgpr` symbols may be reset using
+the ``.set <symbol>, <expression>`` directive. For example, in the case of two
+kernels, where ``function1`` is only called from ``kernel1`` it is sufficient
+to group the function with the kernel that calls it and reset the symbols
+between the two connected components:
+
+.. code-block:: none
+
+  .amdgcn_target "amdgcn-amd-amdhsa--gfx900+xnack" // optional
+
+  // gpr tracking symbols are implicitly set to zero
+
+  .text
+  .globl kern0
+  .p2align 8
+  .type kern0, at function
+  kern0:
+    // ...
+    s_endpgm
+  .Lkern0_end:
+    .size   kern0, .Lkern0_end-kern0
+
+  .rodata
+  .p2align 6
+  .amdhsa_kernel kern0
+    // ...
+    .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+    .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+  .end_amdhsa_kernel
+
+  // reset symbols to begin tracking usage in func1 and kern1
+  .set .amdgcn.next_free_vgpr, 0
+  .set .amdgcn.next_free_sgpr, 0
+
+  .text
+  .hidden func1
+  .global func1
+  .p2align 2
+  .type func1, at function
+  func1:
+    // ...
+    s_setpc_b64 s[30:31]
+  .Lfunc1_end:
+  .size func1, .Lfunc1_end-func1
+
+  .globl kern1
+  .p2align 8
+  .type kern1, at function
+  kern1:
+    // ...
+    s_getpc_b64 s[4:5]
+    s_add_u32 s4, s4, func1 at rel32@lo+4
+    s_addc_u32 s5, s5, func1 at rel32@lo+4
+    s_swappc_b64 s[30:31], s[4:5]
+    // ...
+    s_endpgm
+  .Lkern1_end:
+    .size   kern1, .Lkern1_end-kern1
+
+  .rodata
+  .p2align 6
+  .amdhsa_kernel kern1
+    // ...
+    .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+    .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+  .end_amdhsa_kernel
+
+These symbols cannot identify connected components in order to automatically
+track the usage for each kernel. However, in some cases careful organization of
+the kernels and functions in the source file means there is minimal additional
+effort required to accurately calculate GPR usage.
+
+Additional Documentation
+========================
+
+.. [AMD-RADEON-HD-2000-3000] `AMD R6xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R600_Instruction_Set_Architecture.pdf>`__
+.. [AMD-RADEON-HD-4000] `AMD R7xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R700-Family_Instruction_Set_Architecture.pdf>`__
+.. [AMD-RADEON-HD-5000] `AMD Evergreen shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_Evergreen-Family_Instruction_Set_Architecture.pdf>`__
+.. [AMD-RADEON-HD-6000] `AMD Cayman/Trinity shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_HD_6900_Series_Instruction_Set_Architecture.pdf>`__
+.. [AMD-GCN-GFX6] `AMD Southern Islands Series ISA <http://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf>`__
+.. [AMD-GCN-GFX7] `AMD Sea Islands Series ISA <http://developer.amd.com/wordpress/media/2013/07/AMD_Sea_Islands_Instruction_Set_Architecture.pdf>`_
+.. [AMD-GCN-GFX8] `AMD GCN3 Instruction Set Architecture <http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf>`__
+.. [AMD-GCN-GFX9] `AMD "Vega" Instruction Set Architecture <http://developer.amd.com/wordpress/media/2013/12/Vega_Shader_ISA_28July2017.pdf>`__
+.. [AMD-GCN-GFX10] AMD "Navi" Instruction Set Architecture *TBA*
+.. TODO
+   ttye Add link when made public.
+.. [AMD-ROCm] `ROCm: Open Platform for Development, Discovery and Education Around GPU Computing <http://gpuopen.com/compute-product/rocm/>`__
+.. [AMD-ROCm-github] `ROCm github <http://github.com/RadeonOpenCompute>`__
+.. [HSA] `Heterogeneous System Architecture (HSA) Foundation <http://www.hsafoundation.com/>`__
+.. [ELF] `Executable and Linkable Format (ELF) <http://www.sco.com/developers/gabi/>`__
+.. [DWARF] `DWARF Debugging Information Format <http://dwarfstd.org/>`__
+.. [YAML] `YAML Ain't Markup Language (YAML™) Version 1.2 <http://www.yaml.org/spec/1.2/spec.html>`__
+.. [MsgPack] `Message Pack <http://www.msgpack.org/>`__
+.. [OpenCL] `The OpenCL Specification Version 2.0 <http://www.khronos.org/registry/cl/specs/opencl-2.0.pdf>`__
+.. [HRF] `Heterogeneous-race-free Memory Models <http://benedictgaster.org/wp-content/uploads/2014/01/asplos269-FINAL.pdf>`__
+.. [CLANG-ATTR] `Attributes in Clang <http://clang.llvm.org/docs/AttributeReference.html>`__

Added: www-releases/trunk/9.0.0/docs/_sources/AddingConstrainedIntrinsics.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AddingConstrainedIntrinsics.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AddingConstrainedIntrinsics.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AddingConstrainedIntrinsics.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,102 @@
+==================================================
+How To Add A Constrained Floating-Point Intrinsic
+==================================================
+
+.. contents::
+   :local:
+
+.. warning::
+  This is a work in progress.
+
+Add the intrinsic
+=================
+
+Multiple files need to be updated when adding a new constrained intrinsic.
+
+Add the new intrinsic to the table of intrinsics.::
+
+  include/llvm/IR/Intrinsics.td
+
+Update class ConstrainedFPIntrinsic to know about the intrinsics.::
+
+  include/llvm/IR/IntrinsicInst.h
+
+Functions like ConstrainedFPIntrinsic::isUnaryOp() or
+ConstrainedFPIntrinsic::isTernaryOp() may need to know about the new
+intrinsic.::
+
+  lib/IR/IntrinsicInst.cpp
+
+Update the IR verifier::
+
+  lib/IR/Verifier.cpp
+
+Add SelectionDAG node types
+===========================
+
+Add the new STRICT version of the node type to the ISD::NodeType enum.::
+
+  include/llvm/CodeGen/ISDOpcodes.h
+
+In class SDNode update isStrictFPOpcode()::
+
+  include/llvm/CodeGen/SelectionDAGNodes.h
+
+A mapping from the STRICT SDnode type to the non-STRICT is done in
+TargetLoweringBase::getStrictFPOperationAction(). This allows STRICT
+nodes to be legalized similarly to the non-STRICT node type.::
+
+  include/llvm/CodeGen/TargetLowering.h
+
+Building the SelectionDAG
+-------------------------
+
+The switch statement in SelectionDAGBuilder::visitIntrinsicCall() needs
+to be updated to call SelectionDAGBuilder::visitConstrainedFPIntrinsic().
+That function, in turn, needs to be updated to know how to create the
+SDNode for the intrinsic. The new STRICT node will eventually be converted
+to the matching non-STRICT node. For this reason it should have the same
+operands and values as the non-STRICT version but should also use the chain.
+This makes subsequent sharing of code for STRICT and non-STRICT code paths
+easier.::
+
+  lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+
+Most of the STRICT nodes get legalized the same as their matching non-STRICT
+counterparts. A new STRICT node with this property must get added to the
+switch in SelectionDAGLegalize::LegalizeOp().::
+
+  lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+
+Other parts of the legalizer may need to be updated as well. Look for
+places where the non-STRICT counterpart is legalized and update as needed.
+Be careful of the chain since STRICT nodes use it but their counterparts
+often don't.
+
+The code to do the conversion or mutation of the STRICT node to a non-STRICT
+version of the node happens in SelectionDAG::mutateStrictFPToFP(). Be
+careful updating this function since some nodes have the same return type
+as their input operand, but some are different. Both of these cases must
+be properly handled.::
+
+  lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+
+However, the mutation may not happen if the new node has not been registered
+in TargetLoweringBase::initActions(). If the corresponding non-STRICT node
+is Legal but a target does not know about STRICT nodes then the STRICT
+node will default to Legal and mutation will be bypassed with a "Cannot
+select" error. Register the new STRICT node as Expand to avoid this bug.::
+
+  lib/CodeGen/TargetLoweringBase.cpp
+
+To make debug logs readable it is helpful to update the SelectionDAG's
+debug logger:::
+
+  lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+
+Add documentation and tests
+===========================
+
+::
+
+  docs/LangRef.rst

Added: www-releases/trunk/9.0.0/docs/_sources/AdvancedBuilds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AdvancedBuilds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AdvancedBuilds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AdvancedBuilds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,193 @@
+=============================
+Advanced Build Configurations
+=============================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
+does not build the project, it generates the files needed by your build tool
+(GNU make, Visual Studio, etc.) for building LLVM.
+
+If **you are a new contributor**, please start with the :doc:`GettingStarted` or
+:doc:`CMake` pages. This page is intended for users doing more complex builds.
+
+Many of the examples below are written assuming specific CMake Generators.
+Unless otherwise explicitly called out these commands should work with any CMake
+generator.
+
+Bootstrap Builds
+================
+
+The Clang CMake build system supports bootstrap (aka multi-stage) builds. At a
+high level a multi-stage build is a chain of builds that pass data from one
+stage into the next. The most common and simple version of this is a traditional
+bootstrap build.
+
+In a simple two-stage bootstrap build, we build clang using the system compiler,
+then use that just-built clang to build clang again. In CMake this simplest form
+of a bootstrap build can be configured with a single option,
+CLANG_ENABLE_BOOTSTRAP.
+
+.. code-block:: console
+
+  $ cmake -G Ninja -DCLANG_ENABLE_BOOTSTRAP=On <path to source>
+  $ ninja stage2
+
+This command itself isn't terribly useful because it assumes default
+configurations for each stage. The next series of examples utilize CMake cache
+scripts to provide more complex options.
+
+By default, only a few CMake options will be passed between stages.
+The list, called _BOOTSTRAP_DEFAULT_PASSTHROUGH, is defined in clang/CMakeLists.txt.
+To force the passing of the variables between stages, use the -DCLANG_BOOTSTRAP_PASSTHROUGH
+CMake option, each variable separated by a ";". As example:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -DCLANG_ENABLE_BOOTSTRAP=On -DCLANG_BOOTSTRAP_PASSTHROUGH="CMAKE_INSTALL_PREFIX;CMAKE_VERBOSE_MAKEFILE" <path to source>
+  $ ninja stage2
+
+CMake options starting by ``BOOTSTRAP_`` will be passed only to the stage2 build.
+This gives the opportunity to use Clang specific build flags.
+For example, the following CMake call will enabled '-fno-addrsig' only during
+the stage2 build for C and C++.
+
+.. code-block:: console
+
+  $ cmake [..]  -DBOOTSTRAP_CMAKE_CXX_FLAGS='-fno-addrsig' -DBOOTSTRAP_CMAKE_C_FLAGS='-fno-addrsig' [..]
+
+The clang build system refers to builds as stages. A stage1 build is a standard
+build using the compiler installed on the host, and a stage2 build is built
+using the stage1 compiler. This nomenclature holds up to more stages too. In
+general a stage*n* build is built using the output from stage*n-1*.
+
+Apple Clang Builds (A More Complex Bootstrap)
+=============================================
+
+Apple's Clang builds are a slightly more complicated example of the simple
+bootstrapping scenario. Apple Clang is built using a 2-stage build.
+
+The stage1 compiler is a host-only compiler with some options set. The stage1
+compiler is a balance of optimization vs build time because it is a throwaway.
+The stage2 compiler is the fully optimized compiler intended to ship to users.
+
+Setting up these compilers requires a lot of options. To simplify the
+configuration the Apple Clang build settings are contained in CMake Cache files.
+You can build an Apple Clang compiler using the following commands:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -C <path to clang>/cmake/caches/Apple-stage1.cmake <path to source>
+  $ ninja stage2-distribution
+
+This CMake invocation configures the stage1 host compiler, and sets
+CLANG_BOOTSTRAP_CMAKE_ARGS to pass the Apple-stage2.cmake cache script to the
+stage2 configuration step.
+
+When you build the stage2-distribution target it builds the minimal stage1
+compiler and required tools, then configures and builds the stage2 compiler
+based on the settings in Apple-stage2.cmake.
+
+This pattern of using cache scripts to set complex settings, and specifically to
+make later stage builds include cache scripts is common in our more advanced
+build configurations.
+
+Multi-stage PGO
+===============
+
+Profile-Guided Optimizations (PGO) is a really great way to optimize the code
+clang generates. Our multi-stage PGO builds are a workflow for generating PGO
+profiles that can be used to optimize clang.
+
+At a high level, the way PGO works is that you build an instrumented compiler,
+then you run the instrumented compiler against sample source files. While the
+instrumented compiler runs it will output a bunch of files containing
+performance counters (.profraw files). After generating all the profraw files
+you use llvm-profdata to merge the files into a single profdata file that you
+can feed into the LLVM_PROFDATA_FILE option.
+
+Our PGO.cmake cache script automates that whole process. You can use it by
+running:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -C <path_to_clang>/cmake/caches/PGO.cmake <source dir>
+  $ ninja stage2-instrumented-generate-profdata
+
+If you let that run for a few hours or so, it will place a profdata file in your
+build directory. This takes a really long time because it builds clang twice,
+and you *must* have compiler-rt in your build tree.
+
+This process uses any source files under the perf-training directory as training
+data as long as the source files are marked up with LIT-style RUN lines.
+
+After it finishes you can use “find . -name clang.profdata” to find it, but it
+should be at a path something like:
+
+.. code-block:: console
+
+  <build dir>/tools/clang/stage2-instrumented-bins/utils/perf-training/clang.profdata
+
+You can feed that file into the LLVM_PROFDATA_FILE option when you build your
+optimized compiler.
+
+The PGO came cache has a slightly different stage naming scheme than other
+multi-stage builds. It generates three stages; stage1, stage2-instrumented, and
+stage2. Both of the stage2 builds are built using the stage1 compiler.
+
+The PGO came cache generates the following additional targets:
+
+**stage2-instrumented**
+  Builds a stage1 x86 compiler, runtime, and required tools (llvm-config,
+  llvm-profdata) then uses that compiler to build an instrumented stage2 compiler.
+
+**stage2-instrumented-generate-profdata**
+  Depends on "stage2-instrumented" and will use the instrumented compiler to
+  generate profdata based on the training files in <clang>/utils/perf-training
+
+**stage2**
+  Depends of "stage2-instrumented-generate-profdata" and will use the stage1
+  compiler with the stage2 profdata to build a PGO-optimized compiler.
+
+**stage2-check-llvm**
+  Depends on stage2 and runs check-llvm using the stage2 compiler.
+
+**stage2-check-clang**
+  Depends on stage2 and runs check-clang using the stage2 compiler.
+
+**stage2-check-all**
+  Depends on stage2 and runs check-all using the stage2 compiler.
+
+**stage2-test-suite**
+  Depends on stage2 and runs the test-suite using the stage3 compiler (requires
+  in-tree test-suite).
+
+3-Stage Non-Determinism
+=======================
+
+In the ancient lore of compilers non-determinism is like the multi-headed hydra.
+Whenever its head pops up, terror and chaos ensue.
+
+Historically one of the tests to verify that a compiler was deterministic would
+be a three stage build. The idea of a three stage build is you take your sources
+and build a compiler (stage1), then use that compiler to rebuild the sources
+(stage2), then you use that compiler to rebuild the sources a third time
+(stage3) with an identical configuration to the stage2 build. At the end of
+this, you have a stage2 and stage3 compiler that should be bit-for-bit
+identical.
+
+You can perform one of these 3-stage builds with LLVM & clang using the
+following commands:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -C <path_to_clang>/cmake/caches/3-stage.cmake <source dir>
+  $ ninja stage3
+
+After the build you can compare the stage2 & stage3 compilers. We have a bot
+setup `here <http://lab.llvm.org:8011/builders/clang-3stage-ubuntu>`_ that runs
+this build and compare configuration.

Added: www-releases/trunk/9.0.0/docs/_sources/AliasAnalysis.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/AliasAnalysis.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/AliasAnalysis.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/AliasAnalysis.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,691 @@
+==================================
+LLVM Alias Analysis Infrastructure
+==================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt to
+determine whether or not two pointers ever can point to the same object in
+memory.  There are many different algorithms for alias analysis and many
+different ways of classifying them: flow-sensitive vs. flow-insensitive,
+context-sensitive vs. context-insensitive, field-sensitive
+vs. field-insensitive, unification-based vs. subset-based, etc.  Traditionally,
+alias analyses respond to a query with a `Must, May, or No`_ alias response,
+indicating that two pointers always point to the same object, might point to the
+same object, or are known to never point to the same object.
+
+The LLVM `AliasAnalysis
+<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ class is the
+primary interface used by clients and implementations of alias analyses in the
+LLVM system.  This class is the common interface between clients of alias
+analysis information and the implementations providing it, and is designed to
+support a wide range of implementations and clients (but currently all clients
+are assumed to be flow-insensitive).  In addition to simple alias analysis
+information, this class exposes Mod/Ref information from those implementations
+which can provide it, allowing for powerful analyses and transformations to work
+well together.
+
+This document contains information necessary to successfully implement this
+interface, use it, and to test both sides.  It also explains some of the finer
+points about what exactly results mean.  
+
+``AliasAnalysis`` Class Overview
+================================
+
+The `AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__
+class defines the interface that the various alias analysis implementations
+should support.  This class exports two important enums: ``AliasResult`` and
+``ModRefResult`` which represent the result of an alias query or a mod/ref
+query, respectively.
+
+The ``AliasAnalysis`` interface exposes information about memory, represented in
+several different ways.  In particular, memory objects are represented as a
+starting address and size, and function calls are represented as the actual
+``call`` or ``invoke`` instructions that performs the call.  The
+``AliasAnalysis`` interface also exposes some helper methods which allow you to
+get mod/ref information for arbitrary instructions.
+
+All ``AliasAnalysis`` interfaces require that in queries involving multiple
+values, values which are not :ref:`constants <constants>` are all
+defined within the same function.
+
+Representation of Pointers
+--------------------------
+
+Most importantly, the ``AliasAnalysis`` class provides several methods which are
+used to query whether or not two memory objects alias, whether function calls
+can modify or read a memory object, etc.  For all of these queries, memory
+objects are represented as a pair of their starting address (a symbolic LLVM
+``Value*``) and a static size.
+
+Representing memory objects as a starting address and a size is critically
+important for correct Alias Analyses.  For example, consider this (silly, but
+possible) C code:
+
+.. code-block:: c++
+
+  int i;
+  char C[2];
+  char A[10]; 
+  /* ... */
+  for (i = 0; i != 10; ++i) {
+    C[0] = A[i];          /* One byte store */
+    C[1] = A[9-i];        /* One byte store */
+  }
+
+In this case, the ``basicaa`` pass will disambiguate the stores to ``C[0]`` and
+``C[1]`` because they are accesses to two distinct locations one byte apart, and
+the accesses are each one byte.  In this case, the Loop Invariant Code Motion
+(LICM) pass can use store motion to remove the stores from the loop.  In
+constrast, the following code:
+
+.. code-block:: c++
+
+  int i;
+  char C[2];
+  char A[10]; 
+  /* ... */
+  for (i = 0; i != 10; ++i) {
+    ((short*)C)[0] = A[i];  /* Two byte store! */
+    C[1] = A[9-i];          /* One byte store */
+  }
+
+In this case, the two stores to C do alias each other, because the access to the
+``&C[0]`` element is a two byte access.  If size information wasn't available in
+the query, even the first case would have to conservatively assume that the
+accesses alias.
+
+.. _alias:
+
+The ``alias`` method
+--------------------
+  
+The ``alias`` method is the primary interface used to determine whether or not
+two memory objects alias each other.  It takes two memory objects as input and
+returns MustAlias, PartialAlias, MayAlias, or NoAlias as appropriate.
+
+Like all ``AliasAnalysis`` interfaces, the ``alias`` method requires that either
+the two pointer values be defined within the same function, or at least one of
+the values is a :ref:`constant <constants>`.
+
+.. _Must, May, or No:
+
+Must, May, and No Alias Responses
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``NoAlias`` response may be used when there is never an immediate dependence
+between any memory reference *based* on one pointer and any memory reference
+*based* the other. The most obvious example is when the two pointers point to
+non-overlapping memory ranges. Another is when the two pointers are only ever
+used for reading memory. Another is when the memory is freed and reallocated
+between accesses through one pointer and accesses through the other --- in this
+case, there is a dependence, but it's mediated by the free and reallocation.
+
+As an exception to this is with the :ref:`noalias <noalias>` keyword;
+the "irrelevant" dependencies are ignored.
+
+The ``MayAlias`` response is used whenever the two pointers might refer to the
+same object.
+
+The ``PartialAlias`` response is used when the two memory objects are known to
+be overlapping in some way, regardless whether they start at the same address
+or not.
+
+The ``MustAlias`` response may only be returned if the two memory objects are
+guaranteed to always start at exactly the same location. A ``MustAlias``
+response does not imply that the pointers compare equal.
+
+The ``getModRefInfo`` methods
+-----------------------------
+
+The ``getModRefInfo`` methods return information about whether the execution of
+an instruction can read or modify a memory location.  Mod/Ref information is
+always conservative: if an instruction **might** read or write a location,
+``ModRef`` is returned.
+
+The ``AliasAnalysis`` class also provides a ``getModRefInfo`` method for testing
+dependencies between function calls.  This method takes two call sites (``CS1``
+& ``CS2``), returns ``NoModRef`` if neither call writes to memory read or
+written by the other, ``Ref`` if ``CS1`` reads memory written by ``CS2``,
+``Mod`` if ``CS1`` writes to memory read or written by ``CS2``, or ``ModRef`` if
+``CS1`` might read or write memory written to by ``CS2``.  Note that this
+relation is not commutative.
+
+Other useful ``AliasAnalysis`` methods
+--------------------------------------
+
+Several other tidbits of information are often collected by various alias
+analysis implementations and can be put to good use by various clients.
+
+The ``pointsToConstantMemory`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``pointsToConstantMemory`` method returns true if and only if the analysis
+can prove that the pointer only points to unchanging memory locations
+(functions, constant global variables, and the null pointer).  This information
+can be used to refine mod/ref information: it is impossible for an unchanging
+memory location to be modified.
+
+.. _never access memory or only read memory:
+
+The ``doesNotAccessMemory`` and  ``onlyReadsMemory`` methods
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These methods are used to provide very simple mod/ref information for function
+calls.  The ``doesNotAccessMemory`` method returns true for a function if the
+analysis can prove that the function never reads or writes to memory, or if the
+function only reads from constant memory.  Functions with this property are
+side-effect free and only depend on their input arguments, allowing them to be
+eliminated if they form common subexpressions or be hoisted out of loops.  Many
+common functions behave this way (e.g., ``sin`` and ``cos``) but many others do
+not (e.g., ``acos``, which modifies the ``errno`` variable).
+
+The ``onlyReadsMemory`` method returns true for a function if analysis can prove
+that (at most) the function only reads from non-volatile memory.  Functions with
+this property are side-effect free, only depending on their input arguments and
+the state of memory when they are called.  This property allows calls to these
+functions to be eliminated and moved around, as long as there is no store
+instruction that changes the contents of memory.  Note that all functions that
+satisfy the ``doesNotAccessMemory`` method also satisfy ``onlyReadsMemory``.
+
+Writing a new ``AliasAnalysis`` Implementation
+==============================================
+
+Writing a new alias analysis implementation for LLVM is quite straight-forward.
+There are already several implementations that you can use for examples, and the
+following information should help fill in any details.  For a examples, take a
+look at the `various alias analysis implementations`_ included with LLVM.
+
+Different Pass styles
+---------------------
+
+The first step to determining what type of :doc:`LLVM pass <WritingAnLLVMPass>`
+you need to use for your Alias Analysis.  As is the case with most other
+analyses and transformations, the answer should be fairly obvious from what type
+of problem you are trying to solve:
+
+#. If you require interprocedural analysis, it should be a ``Pass``.
+#. If you are a function-local analysis, subclass ``FunctionPass``.
+#. If you don't need to look at the program at all, subclass ``ImmutablePass``.
+
+In addition to the pass that you subclass, you should also inherit from the
+``AliasAnalysis`` interface, of course, and use the ``RegisterAnalysisGroup``
+template to register as an implementation of ``AliasAnalysis``.
+
+Required initialization calls
+-----------------------------
+
+Your subclass of ``AliasAnalysis`` is required to invoke two methods on the
+``AliasAnalysis`` base class: ``getAnalysisUsage`` and
+``InitializeAliasAnalysis``.  In particular, your implementation of
+``getAnalysisUsage`` should explicitly call into the
+``AliasAnalysis::getAnalysisUsage`` method in addition to doing any declaring
+any pass dependencies your pass has.  Thus you should have something like this:
+
+.. code-block:: c++
+
+  void getAnalysisUsage(AnalysisUsage &AU) const {
+    AliasAnalysis::getAnalysisUsage(AU);
+    // declare your dependencies here.
+  }
+
+Additionally, your must invoke the ``InitializeAliasAnalysis`` method from your
+analysis run method (``run`` for a ``Pass``, ``runOnFunction`` for a
+``FunctionPass``, or ``InitializePass`` for an ``ImmutablePass``).  For example
+(as part of a ``Pass``):
+
+.. code-block:: c++
+
+  bool run(Module &M) {
+    InitializeAliasAnalysis(this);
+    // Perform analysis here...
+    return false;
+  }
+
+Required methods to override
+----------------------------
+
+You must override the ``getAdjustedAnalysisPointer`` method on all subclasses
+of ``AliasAnalysis``. An example implementation of this method would look like:
+
+.. code-block:: c++
+
+  void *getAdjustedAnalysisPointer(const void* ID) override {
+    if (ID == &AliasAnalysis::ID)
+      return (AliasAnalysis*)this;
+    return this;
+  }
+
+Interfaces which may be specified
+---------------------------------
+
+All of the `AliasAnalysis
+<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ virtual methods
+default to providing :ref:`chaining <aliasanalysis-chaining>` to another alias
+analysis implementation, which ends up returning conservatively correct
+information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries
+respectively).  Depending on the capabilities of the analysis you are
+implementing, you just override the interfaces you can improve.
+
+.. _aliasanalysis-chaining:
+
+``AliasAnalysis`` chaining behavior
+-----------------------------------
+
+With only one special exception (the :ref:`-no-aa <aliasanalysis-no-aa>` pass)
+every alias analysis pass chains to another alias analysis implementation (for
+example, the user can specify "``-basicaa -ds-aa -licm``" to get the maximum
+benefit from both alias analyses).  The alias analysis class automatically
+takes care of most of this for methods that you don't override.  For methods
+that you do override, in code paths that return a conservative MayAlias or
+Mod/Ref result, simply return whatever the superclass computes.  For example:
+
+.. code-block:: c++
+
+  AliasResult alias(const Value *V1, unsigned V1Size,
+                    const Value *V2, unsigned V2Size) {
+    if (...)
+      return NoAlias;
+    ...
+
+    // Couldn't determine a must or no-alias result.
+    return AliasAnalysis::alias(V1, V1Size, V2, V2Size);
+  }
+
+In addition to analysis queries, you must make sure to unconditionally pass LLVM
+`update notification`_ methods to the superclass as well if you override them,
+which allows all alias analyses in a change to be updated.
+
+.. _update notification:
+
+Updating analysis results for transformations
+---------------------------------------------
+
+Alias analysis information is initially computed for a static snapshot of the
+program, but clients will use this information to make transformations to the
+code.  All but the most trivial forms of alias analysis will need to have their
+analysis results updated to reflect the changes made by these transformations.
+
+The ``AliasAnalysis`` interface exposes four methods which are used to
+communicate program changes from the clients to the analysis implementations.
+Various alias analysis implementations should use these methods to ensure that
+their internal data structures are kept up-to-date as the program changes (for
+example, when an instruction is deleted), and clients of alias analysis must be
+sure to call these interfaces appropriately.
+
+The ``deleteValue`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``deleteValue`` method is called by transformations when they remove an
+instruction or any other value from the program (including values that do not
+use pointers).  Typically alias analyses keep data structures that have entries
+for each value in the program.  When this method is called, they should remove
+any entries for the specified value, if they exist.
+
+The ``copyValue`` method
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``copyValue`` method is used when a new value is introduced into the
+program.  There is no way to introduce a value into the program that did not
+exist before (this doesn't make sense for a safe compiler transformation), so
+this is the only way to introduce a new value.  This method indicates that the
+new value has exactly the same properties as the value being copied.
+
+The ``replaceWithNewValue`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This method is a simple helper method that is provided to make clients easier to
+use.  It is implemented by copying the old analysis information to the new
+value, then deleting the old value.  This method cannot be overridden by alias
+analysis implementations.
+
+The ``addEscapingUse`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``addEscapingUse`` method is used when the uses of a pointer value have
+changed in ways that may invalidate precomputed analysis information.
+Implementations may either use this callback to provide conservative responses
+for points whose uses have change since analysis time, or may recompute some or
+all of their internal state to continue providing accurate responses.
+
+In general, any new use of a pointer value is considered an escaping use, and
+must be reported through this callback, *except* for the uses below:
+
+* A ``bitcast`` or ``getelementptr`` of the pointer
+* A ``store`` through the pointer (but not a ``store`` *of* the pointer)
+* A ``load`` through the pointer
+
+Efficiency Issues
+-----------------
+
+From the LLVM perspective, the only thing you need to do to provide an efficient
+alias analysis is to make sure that alias analysis **queries** are serviced
+quickly.  The actual calculation of the alias analysis results (the "run"
+method) is only performed once, but many (perhaps duplicate) queries may be
+performed.  Because of this, try to move as much computation to the run method
+as possible (within reason).
+
+Limitations
+-----------
+
+The AliasAnalysis infrastructure has several limitations which make writing a
+new ``AliasAnalysis`` implementation difficult.
+
+There is no way to override the default alias analysis. It would be very useful
+to be able to do something like "``opt -my-aa -O2``" and have it use ``-my-aa``
+for all passes which need AliasAnalysis, but there is currently no support for
+that, short of changing the source code and recompiling. Similarly, there is
+also no way of setting a chain of analyses as the default.
+
+There is no way for transform passes to declare that they preserve
+``AliasAnalysis`` implementations. The ``AliasAnalysis`` interface includes
+``deleteValue`` and ``copyValue`` methods which are intended to allow a pass to
+keep an AliasAnalysis consistent, however there's no way for a pass to declare
+in its ``getAnalysisUsage`` that it does so. Some passes attempt to use
+``AU.addPreserved<AliasAnalysis>``, however this doesn't actually have any
+effect.
+
+Similarly, the ``opt -p`` option introduces ``ModulePass`` passes between each
+pass, which prevents the use of ``FunctionPass`` alias analysis passes.
+
+The ``AliasAnalysis`` API does have functions for notifying implementations when
+values are deleted or copied, however these aren't sufficient. There are many
+other ways that LLVM IR can be modified which could be relevant to
+``AliasAnalysis`` implementations which can not be expressed.
+
+The ``AliasAnalysisDebugger`` utility seems to suggest that ``AliasAnalysis``
+implementations can expect that they will be informed of any relevant ``Value``
+before it appears in an alias query. However, popular clients such as ``GVN``
+don't support this, and are known to trigger errors when run with the
+``AliasAnalysisDebugger``.
+
+The ``AliasSetTracker`` class (which is used by ``LICM``) makes a
+non-deterministic number of alias queries. This can cause debugging techniques
+involving pausing execution after a predetermined number of queries to be
+unreliable.
+
+Many alias queries can be reformulated in terms of other alias queries. When
+multiple ``AliasAnalysis`` queries are chained together, it would make sense to
+start those queries from the beginning of the chain, with care taken to avoid
+infinite looping, however currently an implementation which wants to do this can
+only start such queries from itself.
+
+Using alias analysis results
+============================
+
+There are several different ways to use alias analysis results.  In order of
+preference, these are:
+
+Using the ``MemoryDependenceAnalysis`` Pass
+-------------------------------------------
+
+The ``memdep`` pass uses alias analysis to provide high-level dependence
+information about memory-using instructions.  This will tell you which store
+feeds into a load, for example.  It uses caching and other techniques to be
+efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations.
+
+.. _AliasSetTracker:
+
+Using the ``AliasSetTracker`` class
+-----------------------------------
+
+Many transformations need information about alias **sets** that are active in
+some scope, rather than information about pairwise aliasing.  The
+`AliasSetTracker <http://llvm.org/doxygen/classllvm_1_1AliasSetTracker.html>`__
+class is used to efficiently build these Alias Sets from the pairwise alias
+analysis information provided by the ``AliasAnalysis`` interface.
+
+First you initialize the AliasSetTracker by using the "``add``" methods to add
+information about various potentially aliasing instructions in the scope you are
+interested in.  Once all of the alias sets are completed, your pass should
+simply iterate through the constructed alias sets, using the ``AliasSetTracker``
+``begin()``/``end()`` methods.
+
+The ``AliasSet``\s formed by the ``AliasSetTracker`` are guaranteed to be
+disjoint, calculate mod/ref information and volatility for the set, and keep
+track of whether or not all of the pointers in the set are Must aliases.  The
+AliasSetTracker also makes sure that sets are properly folded due to call
+instructions, and can provide a list of pointers in each set.
+
+As an example user of this, the `Loop Invariant Code Motion
+<doxygen/structLICM.html>`_ pass uses ``AliasSetTracker``\s to calculate alias
+sets for each loop nest.  If an ``AliasSet`` in a loop is not modified, then all
+load instructions from that set may be hoisted out of the loop.  If any alias
+sets are stored to **and** are must alias sets, then the stores may be sunk
+to outside of the loop, promoting the memory location to a register for the
+duration of the loop nest.  Both of these transformations only apply if the
+pointer argument is loop-invariant.
+
+The AliasSetTracker implementation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The AliasSetTracker class is implemented to be as efficient as possible.  It
+uses the union-find algorithm to efficiently merge AliasSets when a pointer is
+inserted into the AliasSetTracker that aliases multiple sets.  The primary data
+structure is a hash table mapping pointers to the AliasSet they are in.
+
+The AliasSetTracker class must maintain a list of all of the LLVM ``Value*``\s
+that are in each AliasSet.  Since the hash table already has entries for each
+LLVM ``Value*`` of interest, the AliasesSets thread the linked list through
+these hash-table nodes to avoid having to allocate memory unnecessarily, and to
+make merging alias sets extremely efficient (the linked list merge is constant
+time).
+
+You shouldn't need to understand these details if you are just a client of the
+AliasSetTracker, but if you look at the code, hopefully this brief description
+will help make sense of why things are designed the way they are.
+
+Using the ``AliasAnalysis`` interface directly
+----------------------------------------------
+
+If neither of these utility class are what your pass needs, you should use the
+interfaces exposed by the ``AliasAnalysis`` class directly.  Try to use the
+higher-level methods when possible (e.g., use mod/ref information instead of the
+`alias`_ method directly if possible) to get the best precision and efficiency.
+
+Existing alias analysis implementations and clients
+===================================================
+
+If you're going to be working with the LLVM alias analysis infrastructure, you
+should know what clients and implementations of alias analysis are available.
+In particular, if you are implementing an alias analysis, you should be aware of
+the `the clients`_ that are useful for monitoring and evaluating different
+implementations.
+
+.. _various alias analysis implementations:
+
+Available ``AliasAnalysis`` implementations
+-------------------------------------------
+
+This section lists the various implementations of the ``AliasAnalysis``
+interface.  With the exception of the :ref:`-no-aa <aliasanalysis-no-aa>`
+implementation, all of these :ref:`chain <aliasanalysis-chaining>` to other
+alias analysis implementations.
+
+.. _aliasanalysis-no-aa:
+
+The ``-no-aa`` pass
+^^^^^^^^^^^^^^^^^^^
+
+The ``-no-aa`` pass is just like what it sounds: an alias analysis that never
+returns any useful information.  This pass can be useful if you think that alias
+analysis is doing something wrong and are trying to narrow down a problem.
+
+The ``-basicaa`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-basicaa`` pass is an aggressive local analysis that *knows* many
+important facts:
+
+* Distinct globals, stack allocations, and heap allocations can never alias.
+* Globals, stack allocations, and heap allocations never alias the null pointer.
+* Different fields of a structure do not alias.
+* Indexes into arrays with statically differing subscripts cannot alias.
+* Many common standard C library functions `never access memory or only read
+  memory`_.
+* Pointers that obviously point to constant globals "``pointToConstantMemory``".
+* Function calls can not modify or references stack allocations if they never
+  escape from the function that allocates them (a common case for automatic
+  arrays).
+
+The ``-globalsmodref-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This pass implements a simple context-sensitive mod/ref and alias analysis for
+internal global variables that don't "have their address taken".  If a global
+does not have its address taken, the pass knows that no pointers alias the
+global.  This pass also keeps track of functions that it knows never access
+memory or never read memory.  This allows certain optimizations (e.g. GVN) to
+eliminate call instructions entirely.
+
+The real power of this pass is that it provides context-sensitive mod/ref
+information for call instructions.  This allows the optimizer to know that calls
+to a function do not clobber or read the value of the global, allowing loads and
+stores to be eliminated.
+
+.. note::
+
+  This pass is somewhat limited in its scope (only support non-address taken
+  globals), but is very quick analysis.
+
+The ``-steens-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-steens-aa`` pass implements a variation on the well-known "Steensgaard's
+algorithm" for interprocedural alias analysis.  Steensgaard's algorithm is a
+unification-based, flow-insensitive, context-insensitive, and field-insensitive
+alias analysis that is also very scalable (effectively linear time).
+
+The LLVM ``-steens-aa`` pass implements a "speculatively field-**sensitive**"
+version of Steensgaard's algorithm using the Data Structure Analysis framework.
+This gives it substantially more precision than the standard algorithm while
+maintaining excellent analysis scalability.
+
+.. note::
+
+  ``-steens-aa`` is available in the optional "poolalloc" module. It is not part
+  of the LLVM core.
+
+The ``-ds-aa`` pass
+^^^^^^^^^^^^^^^^^^^
+
+The ``-ds-aa`` pass implements the full Data Structure Analysis algorithm.  Data
+Structure Analysis is a modular unification-based, flow-insensitive,
+context-**sensitive**, and speculatively field-**sensitive** alias
+analysis that is also quite scalable, usually at ``O(n * log(n))``.
+
+This algorithm is capable of responding to a full variety of alias analysis
+queries, and can provide context-sensitive mod/ref information as well.  The
+only major facility not implemented so far is support for must-alias
+information.
+
+.. note::
+
+  ``-ds-aa`` is available in the optional "poolalloc" module. It is not part of
+  the LLVM core.
+
+The ``-scev-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-scev-aa`` pass implements AliasAnalysis queries by translating them into
+ScalarEvolution queries. This gives it a more complete understanding of
+``getelementptr`` instructions and loop induction variables than other alias
+analyses have.
+
+Alias analysis driven transformations
+-------------------------------------
+
+LLVM includes several alias-analysis driven transformations which can be used
+with any of the implementations above.
+
+The ``-adce`` pass
+^^^^^^^^^^^^^^^^^^
+
+The ``-adce`` pass, which implements Aggressive Dead Code Elimination uses the
+``AliasAnalysis`` interface to delete calls to functions that do not have
+side-effects and are not used.
+
+The ``-licm`` pass
+^^^^^^^^^^^^^^^^^^
+
+The ``-licm`` pass implements various Loop Invariant Code Motion related
+transformations.  It uses the ``AliasAnalysis`` interface for several different
+transformations:
+
+* It uses mod/ref information to hoist or sink load instructions out of loops if
+  there are no instructions in the loop that modifies the memory loaded.
+
+* It uses mod/ref information to hoist function calls out of loops that do not
+  write to memory and are loop-invariant.
+
+* It uses alias information to promote memory objects that are loaded and stored
+  to in loops to live in a register instead.  It can do this if there are no may
+  aliases to the loaded/stored memory location.
+
+The ``-argpromotion`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-argpromotion`` pass promotes by-reference arguments to be passed in
+by-value instead.  In particular, if pointer arguments are only loaded from it
+passes in the value loaded instead of the address to the function.  This pass
+uses alias information to make sure that the value loaded from the argument
+pointer is not modified between the entry of the function and any load of the
+pointer.
+
+The ``-gvn``, ``-memcpyopt``, and ``-dse`` passes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These passes use AliasAnalysis information to reason about loads and stores.
+
+.. _the clients:
+
+Clients for debugging and evaluation of implementations
+-------------------------------------------------------
+
+These passes are useful for evaluating the various alias analysis
+implementations.  You can use them with commands like:
+
+.. code-block:: bash
+
+  % opt -ds-aa -aa-eval foo.bc -disable-output -stats
+
+The ``-print-alias-sets`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-print-alias-sets`` pass is exposed as part of the ``opt`` tool to print
+out the Alias Sets formed by the `AliasSetTracker`_ class.  This is useful if
+you're using the ``AliasSetTracker`` class.  To use it, use something like:
+
+.. code-block:: bash
+
+  % opt -ds-aa -print-alias-sets -disable-output
+
+The ``-aa-eval`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-aa-eval`` pass simply iterates through all pairs of pointers in a
+function and asks an alias analysis whether or not the pointers alias.  This
+gives an indication of the precision of the alias analysis.  Statistics are
+printed indicating the percent of no/may/must aliases found (a more precise
+algorithm will have a lower number of may aliases).
+
+Memory Dependence Analysis
+==========================
+
+.. note::
+
+  We are currently in the process of migrating things from
+  ``MemoryDependenceAnalysis`` to :doc:`MemorySSA`. Please try to use
+  that instead.
+
+If you're just looking to be a client of alias analysis information, consider
+using the Memory Dependence Analysis interface instead.  MemDep is a lazy,
+caching layer on top of alias analysis that is able to answer the question of
+what preceding memory operations a given instruction depends on, either at an
+intra- or inter-block level.  Because of its laziness and caching policy, using
+MemDep can be a significant performance win over accessing alias analysis
+directly.

Added: www-releases/trunk/9.0.0/docs/_sources/Atomics.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Atomics.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Atomics.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Atomics.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,621 @@
+==============================================
+LLVM Atomic Instructions and Concurrency Guide
+==============================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+LLVM supports instructions which are well-defined in the presence of threads and
+asynchronous signals.
+
+The atomic instructions are designed specifically to provide readable IR and
+optimized code generation for the following:
+
+* The C++11 ``<atomic>`` header.  (`C++11 draft available here
+  <http://www.open-std.org/jtc1/sc22/wg21/>`_.) (`C11 draft available here
+  <http://www.open-std.org/jtc1/sc22/wg14/>`_.)
+
+* Proper semantics for Java-style memory, for both ``volatile`` and regular
+  shared variables. (`Java Specification
+  <http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html>`_)
+
+* gcc-compatible ``__sync_*`` builtins. (`Description
+  <https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html>`_)
+
+* Other scenarios with atomic semantics, including ``static`` variables with
+  non-trivial constructors in C++.
+
+Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ volatile,
+which ensures that every volatile load and store happens and is performed in the
+stated order.  A couple examples: if a SequentiallyConsistent store is
+immediately followed by another SequentiallyConsistent store to the same
+address, the first store can be erased. This transformation is not allowed for a
+pair of volatile stores. On the other hand, a non-volatile non-atomic load can
+be moved across a volatile load freely, but not an Acquire load.
+
+This document is intended to provide a guide to anyone either writing a frontend
+for LLVM or working on optimization passes for LLVM with a guide for how to deal
+with instructions with special semantics in the presence of concurrency.  This
+is not intended to be a precise guide to the semantics; the details can get
+extremely complicated and unreadable, and are not usually necessary.
+
+.. _Optimization outside atomic:
+
+Optimization outside atomic
+===========================
+
+The basic ``'load'`` and ``'store'`` allow a variety of optimizations, but can
+lead to undefined results in a concurrent environment; see `NotAtomic`_. This
+section specifically goes into the one optimizer restriction which applies in
+concurrent environments, which gets a bit more of an extended description
+because any optimization dealing with stores needs to be aware of it.
+
+From the optimizer's point of view, the rule is that if there are not any
+instructions with atomic ordering involved, concurrency does not matter, with
+one exception: if a variable might be visible to another thread or signal
+handler, a store cannot be inserted along a path where it might not execute
+otherwise.  Take the following example:
+
+.. code-block:: c
+
+ /* C code, for readability; run through clang -O2 -S -emit-llvm to get
+     equivalent IR */
+  int x;
+  void f(int* a) {
+    for (int i = 0; i < 100; i++) {
+      if (a[i])
+        x += 1;
+    }
+  }
+
+The following is equivalent in non-concurrent situations:
+
+.. code-block:: c
+
+  int x;
+  void f(int* a) {
+    int xtemp = x;
+    for (int i = 0; i < 100; i++) {
+      if (a[i])
+        xtemp += 1;
+    }
+    x = xtemp;
+  }
+
+However, LLVM is not allowed to transform the former to the latter: it could
+indirectly introduce undefined behavior if another thread can access ``x`` at
+the same time. (This example is particularly of interest because before the
+concurrency model was implemented, LLVM would perform this transformation.)
+
+Note that speculative loads are allowed; a load which is part of a race returns
+``undef``, but does not have undefined behavior.
+
+Atomic instructions
+===================
+
+For cases where simple loads and stores are not sufficient, LLVM provides
+various atomic instructions. The exact guarantees provided depend on the
+ordering; see `Atomic orderings`_.
+
+``load atomic`` and ``store atomic`` provide the same basic functionality as
+non-atomic loads and stores, but provide additional guarantees in situations
+where threads and signals are involved.
+
+``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
+atomic store (where the store is conditional for ``cmpxchg``), but no other
+memory operation can happen on any thread between the load and store.
+
+A ``fence`` provides Acquire and/or Release ordering which is not part of
+another operation; it is normally used along with Monotonic memory operations.
+A Monotonic load followed by an Acquire fence is roughly equivalent to an
+Acquire load, and a Monotonic store following a Release fence is roughly
+equivalent to a Release store. SequentiallyConsistent fences behave as both
+an Acquire and a Release fence, and offer some additional complicated
+guarantees, see the C++11 standard for details.
+
+Frontends generating atomic instructions generally need to be aware of the
+target to some degree; atomic instructions are guaranteed to be lock-free, and
+therefore an instruction which is wider than the target natively supports can be
+impossible to generate.
+
+.. _Atomic orderings:
+
+Atomic orderings
+================
+
+In order to achieve a balance between performance and necessary guarantees,
+there are six levels of atomicity. They are listed in order of strength; each
+level includes all the guarantees of the previous level except for
+Acquire/Release. (See also `LangRef Ordering <LangRef.html#ordering>`_.)
+
+.. _NotAtomic:
+
+NotAtomic
+---------
+
+NotAtomic is the obvious, a load or store which is not atomic. (This isn't
+really a level of atomicity, but is listed here for comparison.) This is
+essentially a regular load or store. If there is a race on a given memory
+location, loads from that location return undef.
+
+Relevant standard
+  This is intended to match shared variables in C/C++, and to be used in any
+  other context where memory access is necessary, and a race is impossible. (The
+  precise definition is in `LangRef Memory Model <LangRef.html#memmodel>`_.)
+
+Notes for frontends
+  The rule is essentially that all memory accessed with basic loads and stores
+  by multiple threads should be protected by a lock or other synchronization;
+  otherwise, you are likely to run into undefined behavior. If your frontend is
+  for a "safe" language like Java, use Unordered to load and store any shared
+  variable.  Note that NotAtomic volatile loads and stores are not properly
+  atomic; do not try to use them as a substitute. (Per the C/C++ standards,
+  volatile does provide some limited guarantees around asynchronous signals, but
+  atomics are generally a better solution.)
+
+Notes for optimizers
+  Introducing loads to shared variables along a codepath where they would not
+  otherwise exist is allowed; introducing stores to shared variables is not. See
+  `Optimization outside atomic`_.
+
+Notes for code generation
+  The one interesting restriction here is that it is not allowed to write to
+  bytes outside of the bytes relevant to a store.  This is mostly relevant to
+  unaligned stores: it is not allowed in general to convert an unaligned store
+  into two aligned stores of the same width as the unaligned store. Backends are
+  also expected to generate an i8 store as an i8 store, and not an instruction
+  which writes to surrounding bytes.  (If you are writing a backend for an
+  architecture which cannot satisfy these restrictions and cares about
+  concurrency, please send an email to llvm-dev.)
+
+Unordered
+---------
+
+Unordered is the lowest level of atomicity. It essentially guarantees that races
+produce somewhat sane results instead of having undefined behavior.  It also
+guarantees the operation to be lock-free, so it does not depend on the data
+being part of a special atomic structure or depend on a separate per-process
+global lock.  Note that code generation will fail for unsupported atomic
+operations; if you need such an operation, use explicit locking.
+
+Relevant standard
+  This is intended to match the Java memory model for shared variables.
+
+Notes for frontends
+  This cannot be used for synchronization, but is useful for Java and other
+  "safe" languages which need to guarantee that the generated code never
+  exhibits undefined behavior. Note that this guarantee is cheap on common
+  platforms for loads of a native width, but can be expensive or unavailable for
+  wider loads, like a 64-bit store on ARM. (A frontend for Java or other "safe"
+  languages would normally split a 64-bit store on ARM into two 32-bit unordered
+  stores.)
+
+Notes for optimizers
+  In terms of the optimizer, this prohibits any transformation that transforms a
+  single load into multiple loads, transforms a store into multiple stores,
+  narrows a store, or stores a value which would not be stored otherwise.  Some
+  examples of unsafe optimizations are narrowing an assignment into a bitfield,
+  rematerializing a load, and turning loads and stores into a memcpy
+  call. Reordering unordered operations is safe, though, and optimizers should
+  take advantage of that because unordered operations are common in languages
+  that need them.
+
+Notes for code generation
+  These operations are required to be atomic in the sense that if you use
+  unordered loads and unordered stores, a load cannot see a value which was
+  never stored.  A normal load or store instruction is usually sufficient, but
+  note that an unordered load or store cannot be split into multiple
+  instructions (or an instruction which does multiple memory operations, like
+  ``LDRD`` on ARM without LPAE, or not naturally-aligned ``LDRD`` on LPAE ARM).
+
+Monotonic
+---------
+
+Monotonic is the weakest level of atomicity that can be used in synchronization
+primitives, although it does not provide any general synchronization. It
+essentially guarantees that if you take all the operations affecting a specific
+address, a consistent ordering exists.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_relaxed``; see those
+  standards for the exact definition.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.  The
+  guarantees in terms of synchronization are very weak, so make sure these are
+  only used in a pattern which you know is correct.  Generally, these would
+  either be used for atomic operations which do not protect other memory (like
+  an atomic counter), or along with a ``fence``.
+
+Notes for optimizers
+  In terms of the optimizer, this can be treated as a read+write on the relevant
+  memory location (and alias analysis will take advantage of that). In addition,
+  it is legal to reorder non-atomic and Unordered loads around Monotonic
+  loads. CSE/DSE and a few other optimizations are allowed, but Monotonic
+  operations are unlikely to be used in ways which would make those
+  optimizations useful.
+
+Notes for code generation
+  Code generation is essentially the same as that for unordered for loads and
+  stores.  No fences are required.  ``cmpxchg`` and ``atomicrmw`` are required
+  to appear as a single operation.
+
+Acquire
+-------
+
+Acquire provides a barrier of the sort necessary to acquire a lock to access
+other memory with normal loads and stores.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_acquire``. It should also be
+  used for C++11/C11 ``memory_order_consume``.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.
+  Acquire only provides a semantic guarantee when paired with a Release
+  operation.
+
+Notes for optimizers
+  Optimizers not aware of atomics can treat this like a nothrow call.  It is
+  also possible to move stores from before an Acquire load or read-modify-write
+  operation to after it, and move non-Acquire loads from before an Acquire
+  operation to after it.
+
+Notes for code generation
+  Architectures with weak memory ordering (essentially everything relevant today
+  except x86 and SPARC) require some sort of fence to maintain the Acquire
+  semantics.  The precise fences required varies widely by architecture, but for
+  a simple implementation, most architectures provide a barrier which is strong
+  enough for everything (``dmb`` on ARM, ``sync`` on PowerPC, etc.).  Putting
+  such a fence after the equivalent Monotonic operation is sufficient to
+  maintain Acquire semantics for a memory operation.
+
+Release
+-------
+
+Release is similar to Acquire, but with a barrier of the sort necessary to
+release a lock.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_release``.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.
+  Release only provides a semantic guarantee when paired with a Acquire
+  operation.
+
+Notes for optimizers
+  Optimizers not aware of atomics can treat this like a nothrow call.  It is
+  also possible to move loads from after a Release store or read-modify-write
+  operation to before it, and move non-Release stores from after an Release
+  operation to before it.
+
+Notes for code generation
+  See the section on Acquire; a fence before the relevant operation is usually
+  sufficient for Release. Note that a store-store fence is not sufficient to
+  implement Release semantics; store-store fences are generally not exposed to
+  IR because they are extremely difficult to use correctly.
+
+AcquireRelease
+--------------
+
+AcquireRelease (``acq_rel`` in IR) provides both an Acquire and a Release
+barrier (for fences and operations which both read and write memory).
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_acq_rel``.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.
+  Acquire only provides a semantic guarantee when paired with a Release
+  operation, and vice versa.
+
+Notes for optimizers
+  In general, optimizers should treat this like a nothrow call; the possible
+  optimizations are usually not interesting.
+
+Notes for code generation
+  This operation has Acquire and Release semantics; see the sections on Acquire
+  and Release.
+
+SequentiallyConsistent
+----------------------
+
+SequentiallyConsistent (``seq_cst`` in IR) provides Acquire semantics for loads
+and Release semantics for stores. Additionally, it guarantees that a total
+ordering exists between all SequentiallyConsistent operations.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_seq_cst``, Java volatile, and
+  the gcc-compatible ``__sync_*`` builtins which do not specify otherwise.
+
+Notes for frontends
+  If a frontend is exposing atomic operations, these are much easier to reason
+  about for the programmer than other kinds of operations, and using them is
+  generally a practical performance tradeoff.
+
+Notes for optimizers
+  Optimizers not aware of atomics can treat this like a nothrow call.  For
+  SequentiallyConsistent loads and stores, the same reorderings are allowed as
+  for Acquire loads and Release stores, except that SequentiallyConsistent
+  operations may not be reordered.
+
+Notes for code generation
+  SequentiallyConsistent loads minimally require the same barriers as Acquire
+  operations and SequentiallyConsistent stores require Release
+  barriers. Additionally, the code generator must enforce ordering between
+  SequentiallyConsistent stores followed by SequentiallyConsistent loads. This
+  is usually done by emitting either a full fence before the loads or a full
+  fence after the stores; which is preferred varies by architecture.
+
+Atomics and IR optimization
+===========================
+
+Predicates for optimizer writers to query:
+
+* ``isSimple()``: A load or store which is not volatile or atomic.  This is
+  what, for example, memcpyopt would check for operations it might transform.
+
+* ``isUnordered()``: A load or store which is not volatile and at most
+  Unordered. This would be checked, for example, by LICM before hoisting an
+  operation.
+
+* ``mayReadFromMemory()``/``mayWriteToMemory()``: Existing predicate, but note
+  that they return true for any operation which is volatile or at least
+  Monotonic.
+
+* ``isStrongerThan`` / ``isAtLeastOrStrongerThan``: These are predicates on
+  orderings. They can be useful for passes that are aware of atomics, for
+  example to do DSE across a single atomic access, but not across a
+  release-acquire pair (see MemoryDependencyAnalysis for an example of this)
+
+* Alias analysis: Note that AA will return ModRef for anything Acquire or
+  Release, and for the address accessed by any Monotonic operation.
+
+To support optimizing around atomic operations, make sure you are using the
+right predicates; everything should work if that is done.  If your pass should
+optimize some atomic operations (Unordered operations in particular), make sure
+it doesn't replace an atomic load or store with a non-atomic operation.
+
+Some examples of how optimizations interact with various kinds of atomic
+operations:
+
+* ``memcpyopt``: An atomic operation cannot be optimized into part of a
+  memcpy/memset, including unordered loads/stores.  It can pull operations
+  across some atomic operations.
+
+* LICM: Unordered loads/stores can be moved out of a loop.  It just treats
+  monotonic operations like a read+write to a memory location, and anything
+  stricter than that like a nothrow call.
+
+* DSE: Unordered stores can be DSE'ed like normal stores.  Monotonic stores can
+  be DSE'ed in some cases, but it's tricky to reason about, and not especially
+  important. It is possible in some case for DSE to operate across a stronger
+  atomic operation, but it is fairly tricky. DSE delegates this reasoning to
+  MemoryDependencyAnalysis (which is also used by other passes like GVN).
+
+* Folding a load: Any atomic load from a constant global can be constant-folded,
+  because it cannot be observed.  Similar reasoning allows sroa with
+  atomic loads and stores.
+
+Atomics and Codegen
+===================
+
+Atomic operations are represented in the SelectionDAG with ``ATOMIC_*`` opcodes.
+On architectures which use barrier instructions for all atomic ordering (like
+ARM), appropriate fences can be emitted by the AtomicExpand Codegen pass if
+``setInsertFencesForAtomic()`` was used.
+
+The MachineMemOperand for all atomic operations is currently marked as volatile;
+this is not correct in the IR sense of volatile, but CodeGen handles anything
+marked volatile very conservatively.  This should get fixed at some point.
+
+One very important property of the atomic operations is that if your backend
+supports any inline lock-free atomic operations of a given size, you should
+support *ALL* operations of that size in a lock-free manner.
+
+When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do)
+this is trivial: all the other operations can be implemented on top of those
+primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386) there
+are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As it is
+invalid to implement ``atomic load`` using the native instruction, but
+``cmpxchg`` using a library call to a function that uses a mutex, ``atomic
+load`` must *also* expand to a library call on such architectures, so that it
+can remain atomic with regards to a simultaneous ``cmpxchg``, by using the same
+mutex.
+
+AtomicExpandPass can help with that: it will expand all atomic operations to the
+proper ``__atomic_*`` libcalls for any size above the maximum set by
+``setMaxAtomicSizeInBitsSupported`` (which defaults to 0).
+
+On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores
+generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent
+fences generate an ``MFENCE``, other fences do not cause any code to be
+generated.  ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction.  ``atomicrmw xchg``
+uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all
+other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``.  Depending
+on the users of the result, some ``atomicrmw`` operations can be translated into
+operations like ``LOCK AND``, but that does not work in general.
+
+On ARM (before v8), MIPS, and many other RISC architectures, Acquire, Release,
+and SequentiallyConsistent semantics require barrier instructions for every such
+operation. Loads and stores generate normal instructions.  ``cmpxchg`` and
+``atomicrmw`` can be represented using a loop with LL/SC-style instructions
+which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
+on ARM, etc.).
+
+It is often easiest for backends to use AtomicExpandPass to lower some of the
+atomic constructs. Here are some lowerings it can do:
+
+* cmpxchg -> loop with load-linked/store-conditional
+  by overriding ``shouldExpandAtomicCmpXchgInIR()``, ``emitLoadLinked()``,
+  ``emitStoreConditional()``
+* large loads/stores -> ll-sc/cmpxchg
+  by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()``
+* strong atomic accesses -> monotonic accesses + fences by overriding
+  ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and
+  ``emitTrailingFence()``
+* atomic rmw -> loop with cmpxchg or load-linked/store-conditional
+  by overriding ``expandAtomicRMWInIR()``
+* expansion to __atomic_* libcalls for unsupported sizes.
+* part-word atomicrmw/cmpxchg -> target-specific intrinsic by overriding
+  ``shouldExpandAtomicRMWInIR``, ``emitMaskedAtomicRMWIntrinsic``,
+  ``shouldExpandAtomicCmpXchgInIR``, and ``emitMaskedAtomicCmpXchgIntrinsic``.
+
+For an example of these look at the ARM (first five lowerings) or RISC-V (last
+lowering) backend.
+
+AtomicExpandPass supports two strategies for lowering atomicrmw/cmpxchg to
+load-linked/store-conditional (LL/SC) loops. The first expands the LL/SC loop
+in IR, calling target lowering hooks to emit intrinsics for the LL and SC
+operations. However, many architectures have strict requirements for LL/SC
+loops to ensure forward progress, such as restrictions on the number and type
+of instructions in the loop. It isn't possible to enforce these restrictions
+when the loop is expanded in LLVM IR, and so affected targets may prefer to
+expand to LL/SC loops at a very late stage (i.e. after register allocation).
+AtomicExpandPass can help support lowering of part-word atomicrmw or cmpxchg
+using this strategy by producing IR for any shifting and masking that can be
+performed outside of the LL/SC loop.
+
+Libcalls: __atomic_*
+====================
+
+There are two kinds of atomic library calls that are generated by LLVM. Please
+note that both sets of library functions somewhat confusingly share the names of
+builtin functions defined by clang. Despite this, the library functions are
+not directly related to the builtins: it is *not* the case that ``__atomic_*``
+builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower
+to ``__sync_*`` library calls.
+
+The first set of library functions are named ``__atomic_*``. This set has been
+"standardized" by GCC, and is described below. (See also `GCC's documentation
+<https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_)
+
+LLVM's AtomicExpandPass will translate atomic operations on data sizes above
+``MaxAtomicSizeInBitsSupported`` into calls to these functions.
+
+There are four generic functions, which can be called with data of any size or
+alignment::
+
+   void __atomic_load(size_t size, void *ptr, void *ret, int ordering)
+   void __atomic_store(size_t size, void *ptr, void *val, int ordering)
+   void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering)
+   bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order)
+
+There are also size-specialized versions of the above functions, which can only
+be used with *naturally-aligned* pointers of the appropriate size. In the
+signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate
+integer type of that size; if no such integer type exists, the specialization
+cannot be used::
+
+   iN __atomic_load_N(iN *ptr, iN val, int ordering)
+   void __atomic_store_N(iN *ptr, iN val, int ordering)
+   iN __atomic_exchange_N(iN *ptr, iN val, int ordering)
+   bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order)
+
+Finally there are some read-modify-write functions, which are only available in
+the size-specific variants (any other sizes use a ``__atomic_compare_exchange``
+loop)::
+
+   iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering)
+
+This set of library functions have some interesting implementation requirements
+to take note of:
+
+- They support all sizes and alignments -- including those which cannot be
+  implemented natively on any existing hardware. Therefore, they will certainly
+  use mutexes in for some sizes/alignments.
+
+- As a consequence, they cannot be shipped in a statically linked
+  compiler-support library, as they have state which must be shared amongst all
+  DSOs loaded in the program. They must be provided in a shared library used by
+  all objects.
+
+- The set of atomic sizes supported lock-free must be a superset of the sizes
+  any compiler can emit. That is: if a new compiler introduces support for
+  inline-lock-free atomics of size N, the ``__atomic_*`` functions must also have a
+  lock-free implementation for size N. This is a requirement so that code
+  produced by an old compiler (which will have called the ``__atomic_*`` function)
+  interoperates with code produced by the new compiler (which will use native
+  the atomic instruction).
+
+Note that it's possible to write an entirely target-independent implementation
+of these library functions by using the compiler atomic builtins themselves to
+implement the operations on naturally-aligned pointers of supported sizes, and a
+generic mutex implementation otherwise.
+
+Libcalls: __sync_*
+==================
+
+Some targets or OS/target combinations can support lock-free atomics, but for
+various reasons, it is not practical to emit the instructions inline.
+
+There's two typical examples of this.
+
+Some CPUs support multiple instruction sets which can be swiched back and forth
+on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which
+has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly,
+has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic
+instructions are not encodable. However, those instructions are available via a
+function call to a function with the longer encoding.
+
+Additionally, a few OS/target pairs provide kernel-supported lock-free
+atomics. ARM/Linux is an example of this: the kernel `provides
+<https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a
+function which on older CPUs contains a "magically-restartable" atomic sequence
+(which looks atomic so long as there's only one CPU), and contains actual atomic
+instructions on newer multicore models. This sort of functionality can typically
+be provided on any architecture, if all CPUs which are missing atomic
+compare-and-swap support are uniprocessor (no SMP). This is almost always the
+case. The only common architecture without that property is SPARC -- SPARCV8 SMP
+systems were common, yet it doesn't support any sort of compare-and-swap
+operation.
+
+In either of these cases, the Target in LLVM can claim support for atomics of an
+appropriate size, and then implement some subset of the operations via libcalls
+to a ``__sync_*`` function. Such functions *must* not use locks in their
+implementation, because unlike the ``__atomic_*`` routines used by
+AtomicExpandPass, these may be mixed-and-matched with native instructions by the
+target lowering.
+
+Further, these routines do not need to be shared, as they are stateless. So,
+there is no issue with having multiple copies included in one binary. Thus,
+typically these routines are implemented by the statically-linked compiler
+runtime support library.
+
+LLVM will emit a call to an appropriate ``__sync_*`` routine if the target
+ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``,
+or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted-into the
+availability of those library functions via a call to ``initSyncLibcalls()``.
+
+The full set of functions that may be called by LLVM is (for ``N`` being 1, 2,
+4, 8, or 16)::
+
+  iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired)
+  iN __sync_lock_test_and_set_N(iN *ptr, iN val)
+  iN __sync_fetch_and_add_N(iN *ptr, iN val)
+  iN __sync_fetch_and_sub_N(iN *ptr, iN val)
+  iN __sync_fetch_and_and_N(iN *ptr, iN val)
+  iN __sync_fetch_and_or_N(iN *ptr, iN val)
+  iN __sync_fetch_and_xor_N(iN *ptr, iN val)
+  iN __sync_fetch_and_nand_N(iN *ptr, iN val)
+  iN __sync_fetch_and_max_N(iN *ptr, iN val)
+  iN __sync_fetch_and_umax_N(iN *ptr, iN val)
+  iN __sync_fetch_and_min_N(iN *ptr, iN val)
+  iN __sync_fetch_and_umin_N(iN *ptr, iN val)
+
+This list doesn't include any function for atomic load or store; all known
+architectures support atomic loads and stores directly (possibly by emitting a
+fence on either side of a normal load or store.)
+
+There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to
+``__sync_synchronize()``. This may happen or not happen independent of all the
+above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``.

Added: www-releases/trunk/9.0.0/docs/_sources/Benchmarking.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Benchmarking.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Benchmarking.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Benchmarking.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,87 @@
+==================================
+Benchmarking tips
+==================================
+
+
+Introduction
+============
+
+For benchmarking a patch we want to reduce all possible sources of
+noise as much as possible. How to do that is very OS dependent.
+
+Note that low noise is required, but not sufficient. It does not
+exclude measurement bias. See
+https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf for
+example.
+
+General
+================================
+
+* Use a high resolution timer, e.g. perf under linux.
+
+* Run the benchmark multiple times to be able to recognize noise.
+
+* Disable as many processes or services as possible on the target system.
+
+* Disable frequency scaling, turbo boost and address space
+  randomization (see OS specific section).
+
+* Static link if the OS supports it. That avoids any variation that
+  might be introduced by loading dynamic libraries. This can be done
+  by passing ``-DLLVM_BUILD_STATIC=ON`` to cmake.
+
+* Try to avoid storage. On some systems you can use tmpfs. Putting the
+  program, inputs and outputs on tmpfs avoids touching a real storage
+  system, which can have a pretty big variability.
+
+  To mount it (on linux and freebsd at least)::
+
+    mount -t tmpfs -o size=<XX>g none dir_to_mount
+
+Linux
+=====
+
+* Disable address space randomization::
+
+    echo 0 > /proc/sys/kernel/randomize_va_space
+
+* Set scaling_governor to performance::
+
+   for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
+   do
+     echo performance > /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
+   done
+
+* Use https://github.com/lpechacek/cpuset to reserve cpus for just the
+  program you are benchmarking. If using perf, leave at least 2 cores
+  so that perf runs in one and your program in another::
+
+    cset shield -c N1,N2 -k on
+
+  This will move all threads out of N1 and N2. The ``-k on`` means
+  that even kernel threads are moved out.
+
+* Disable the SMT pair of the cpus you will use for the benchmark. The
+  pair of cpu N can be found in
+  ``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`` and
+  disabled with::
+
+    echo 0 > /sys/devices/system/cpu/cpuX/online
+
+
+* Run the program with::
+
+    cset shield --exec -- perf stat -r 10 <cmd>
+
+  This will run the command after ``--`` in the isolated cpus. The
+  particular perf command runs the ``<cmd>`` 10 times and reports
+  statistics.
+
+With these in place you can expect perf variations of less than 0.1%.
+
+Linux Intel
+-----------
+
+* Disable turbo mode::
+
+    echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Added: www-releases/trunk/9.0.0/docs/_sources/BigEndianNEON.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/BigEndianNEON.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/BigEndianNEON.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/BigEndianNEON.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,205 @@
+==============================================
+Using ARM NEON instructions in big endian mode
+==============================================
+
+.. contents::
+    :local:
+
+Introduction
+============
+
+Generating code for big endian ARM processors is for the most part straightforward. NEON loads and stores however have some interesting properties that make code generation decisions less obvious in big endian mode.
+
+The aim of this document is to explain the problem with NEON loads and stores, and the solution that has been implemented in LLVM.
+
+In this document the term "vector" refers to what the ARM ABI calls a "short vector", which is a sequence of items that can fit in a NEON register. This sequence can be 64 or 128 bits in length, and can constitute 8, 16, 32 or 64 bit items. This document refers to A64 instructions throughout, but is almost applicable to the A32/ARMv7 instruction sets also. The ABI format for passing vectors in A32 is sligtly different to A64. Apart from that, the same concepts apply.
+
+Example: C-level intrinsics -> assembly
+---------------------------------------
+
+It may be helpful first to illustrate how C-level ARM NEON intrinsics are lowered to instructions.
+
+This trivial C function takes a vector of four ints and sets the zero'th lane to the value "42"::
+
+    #include <arm_neon.h>
+    int32x4_t f(int32x4_t p) {
+        return vsetq_lane_s32(42, p, 0);
+    }
+
+arm_neon.h intrinsics generate "generic" IR where possible (that is, normal IR instructions not ``llvm.arm.neon.*`` intrinsic calls). The above generates::
+
+    define <4 x i32> @f(<4 x i32> %p) {
+      %vset_lane = insertelement <4 x i32> %p, i32 42, i32 0
+      ret <4 x i32> %vset_lane
+    }
+
+Which then becomes the following trivial assembly::
+
+    f:                                      // @f
+            movz	w8, #0x2a
+            ins 	v0.s[0], w8
+            ret
+
+Problem
+=======
+
+The main problem is how vectors are represented in memory and in registers.
+
+First, a recap. The "endianness" of an item affects its representation in memory only. In a register, a number is just a sequence of bits - 64 bits in the case of AArch64 general purpose registers. Memory, however, is a sequence of addressable units of 8 bits in size. Any number greater than 8 bits must therefore be split up into 8-bit chunks, and endianness describes the order in which these chunks are laid out in memory.
+
+A "little endian" layout has the least significant byte first (lowest in memory address). A "big endian" layout has the *most* significant byte first. This means that when loading an item from big endian memory, the lowest 8-bits in memory must go in the most significant 8-bits, and so forth.
+
+``LDR`` and ``LD1``
+===================
+
+.. figure:: ARM-BE-ldr.png
+    :align: right
+    
+    Big endian vector load using ``LDR``.
+
+
+A vector is a consecutive sequence of items that are operated on simultaneously. To load a 64-bit vector, 64 bits need to be read from memory. In little endian mode, we can do this by just performing a 64-bit load - ``LDR q0, [foo]``. However if we try this in big endian mode, because of the byte swapping the lane indices end up being swapped! The zero'th item as laid out in memory becomes the n'th lane in the vector.
+
+.. figure:: ARM-BE-ld1.png
+    :align: right
+
+    Big endian vector load using ``LD1``. Note that the lanes retain the correct ordering.
+
+
+Because of this, the instruction ``LD1`` performs a vector load but performs byte swapping not on the entire 64 bits, but on the individual items within the vector. This means that the register content is the same as it would have been on a little endian system.
+
+It may seem that ``LD1`` should suffice to peform vector loads on a big endian machine. However there are pros and cons to the two approaches that make it less than simple which register format to pick.
+
+There are two options:
+
+    1. The content of a vector register is the same *as if* it had been loaded with an ``LDR`` instruction.
+    2. The content of a vector register is the same *as if* it had been loaded with an ``LD1`` instruction.
+
+Because ``LD1 == LDR + REV`` and similarly ``LDR == LD1 + REV`` (on a big endian system), we can simulate either type of load with the other type of load plus a ``REV`` instruction. So we're not deciding which instructions to use, but which format to use (which will then influence which instruction is best to use).
+
+.. The 'clearer' container is required to make the following section header come after the floated
+   images above.
+.. container:: clearer
+
+    Note that throughout this section we only mention loads. Stores have exactly the same problems as their associated loads, so have been skipped for brevity.
+ 
+
+Considerations
+==============
+
+LLVM IR Lane ordering
+---------------------
+
+LLVM IR has first class vector types. In LLVM IR, the zero'th element of a vector resides at the lowest memory address. The optimizer relies on this property in certain areas, for example when concatenating vectors together. The intention is for arrays and vectors to have identical memory layouts - ``[4 x i8]`` and ``<4 x i8>`` should be represented the same in memory. Without this property there would be many special cases that the optimizer would have to cleverly handle.
+
+Use of ``LDR`` would break this lane ordering property. This doesn't preclude the use of ``LDR``, but we would have to do one of two things:
+
+   1. Insert a ``REV`` instruction to reverse the lane order after every ``LDR``.
+   2. Disable all optimizations that rely on lane layout, and for every access to an individual lane (``insertelement``/``extractelement``/``shufflevector``) reverse the lane index.
+
+AAPCS
+-----
+
+The ARM procedure call standard (AAPCS) defines the ABI for passing vectors between functions in registers. It states:
+
+    When a short vector is transferred between registers and memory it is treated as an opaque object. That is a short vector is stored in memory as if it were stored with a single ``STR`` of the entire register; a short vector is loaded from memory using the corresponding ``LDR`` instruction. On a little-endian system this means that element 0 will always contain the lowest addressed element of a short vector; on a big-endian system element 0 will contain the highest-addressed element of a short vector.
+
+    -- Procedure Call Standard for the ARM 64-bit Architecture (AArch64), 4.1.2 Short Vectors
+
+The use of ``LDR`` and ``STR`` as the ABI defines has at least one advantage over ``LD1`` and ``ST1``. ``LDR`` and ``STR`` are oblivious to the size of the individual lanes of a vector. ``LD1`` and ``ST1`` are not - the lane size is encoded within them. This is important across an ABI boundary, because it would become necessary to know the lane width the callee expects. Consider the following code:
+
+.. code-block:: c
+
+    <callee.c>
+    void callee(uint32x2_t v) {
+      ...
+    }
+
+    <caller.c>
+    extern void callee(uint32x2_t);
+    void caller() {
+      callee(...);
+    }
+
+If ``callee`` changed its signature to ``uint16x4_t``, which is equivalent in register content, if we passed as ``LD1`` we'd break this code until ``caller`` was updated and recompiled.
+
+There is an argument that if the signatures of the two functions are different then the behaviour should be undefined. But there may be functions that are agnostic to the lane layout of the vector, and treating the vector as an opaque value (just loading it and storing it) would be impossible without a common format across ABI boundaries.
+
+So to preserve ABI compatibility, we need to use the ``LDR`` lane layout across function calls.
+
+Alignment
+---------
+
+In strict alignment mode, ``LDR qX`` requires its address to be 128-bit aligned, whereas ``LD1`` only requires it to be as aligned as the lane size. If we canonicalised on using ``LDR``, we'd still need to use ``LD1`` in some places to avoid alignment faults (the result of the ``LD1`` would then need to be reversed with ``REV``).
+
+Most operating systems however do not run with alignment faults enabled, so this is often not an issue.
+
+Summary
+-------
+
+The following table summarises the instructions that are required to be emitted for each property mentioned above for each of the two solutions.
+
++-------------------------------+-------------------------------+---------------------+
+|                               | ``LDR`` layout                | ``LD1`` layout      |
++===============================+===============================+=====================+
+| Lane ordering                 |   ``LDR + REV``               |    ``LD1``          |
++-------------------------------+-------------------------------+---------------------+
+| AAPCS                         |   ``LDR``                     |    ``LD1 + REV``    |
++-------------------------------+-------------------------------+---------------------+
+| Alignment for strict mode     |   ``LDR`` / ``LD1 + REV``     |    ``LD1``          |
++-------------------------------+-------------------------------+---------------------+
+
+Neither approach is perfect, and choosing one boils down to choosing the lesser of two evils. The issue with lane ordering, it was decided, would have to change target-agnostic compiler passes and would result in a strange IR in which lane indices were reversed. It was decided that this was worse than the changes that would have to be made to support ``LD1``, so ``LD1`` was chosen as the canonical vector load instruction (and by inference, ``ST1`` for vector stores).
+
+Implementation
+==============
+
+There are 3 parts to the implementation:
+
+    1. Predicate ``LDR`` and ``STR`` instructions so that they are never allowed to be selected to generate vector loads and stores. The exception is one-lane vectors [1]_ - these by definition cannot have lane ordering problems so are fine to use ``LDR``/``STR``. 
+
+    2. Create code generation patterns for bitconverts that create ``REV`` instructions.
+
+    3. Make sure appropriate bitconverts are created so that vector values get passed over call boundaries as 1-element vectors (which is the same as if they were loaded with ``LDR``).
+
+Bitconverts
+-----------
+
+.. image:: ARM-BE-bitcastfail.png
+    :align: right
+
+The main problem with the ``LD1`` solution is dealing with bitconverts (or bitcasts, or reinterpret casts). These are pseudo instructions that only change the compiler's interpretation of data, not the underlying data itself. A requirement is that if data is loaded and then saved again (called a "round trip"), the memory contents should be the same after the store as before the load. If a vector is loaded and is then bitconverted to a different vector type before storing, the round trip will currently be broken.
+
+Take for example this code sequence::
+
+    %0 = load <4 x i32> %x
+    %1 = bitcast <4 x i32> %0 to <2 x i64>
+         store <2 x i64> %1, <2 x i64>* %y
+
+This would produce a code sequence such as that in the figure on the right. The mismatched ``LD1`` and ``ST1`` cause the stored data to differ from the loaded data.
+
+.. container:: clearer
+
+    When we see a bitcast from type ``X`` to type ``Y``, what we need to do is to change the in-register representation of the data to be *as if* it had just been loaded by a ``LD1`` of type ``Y``.
+
+.. image:: ARM-BE-bitcastsuccess.png
+    :align: right
+
+Conceptually this is simple - we can insert a ``REV`` undoing the ``LD1`` of type ``X`` (converting the in-register representation to the same as if it had been loaded by ``LDR``) and then insert another ``REV`` to change the representation to be as if it had been loaded by an ``LD1`` of type ``Y``.
+
+For the previous example, this would be::
+
+    LD1   v0.4s, [x]
+
+    REV64 v0.4s, v0.4s                  // There is no REV128 instruction, so it must be synthesizedcd 
+    EXT   v0.16b, v0.16b, v0.16b, #8    // with a REV64 then an EXT to swap the two 64-bit elements.
+
+    REV64 v0.2d, v0.2d
+    EXT   v0.16b, v0.16b, v0.16b, #8
+
+    ST1   v0.2d, [y]
+
+It turns out that these ``REV`` pairs can, in almost all cases, be squashed together into a single ``REV``. For the example above, a ``REV128 4s`` + ``REV128 2d`` is actually a ``REV64 4s``, as shown in the figure on the right.
+
+.. [1] One lane vectors may seem useless as a concept but they serve to distinguish between values held in general purpose registers and values held in NEON/VFP registers. For example, an ``i64`` would live in an ``x`` register, but ``<1 x i64>`` would live in a ``d`` register.
+




More information about the llvm-commits mailing list