[llvm] d9daee5 - [AMDGPU][DOC][NFC] Update assembler syntax description

Dmitry Preobrazhensky via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 20 03:04:04 PST 2022


Author: Dmitry Preobrazhensky
Date: 2022-12-20T14:03:46+03:00
New Revision: d9daee5a669832586c59e92223cae5600238b4cf

URL: https://github.com/llvm/llvm-project/commit/d9daee5a669832586c59e92223cae5600238b4cf
DIFF: https://github.com/llvm/llvm-project/commit/d9daee5a669832586c59e92223cae5600238b4cf.diff

LOG: [AMDGPU][DOC][NFC] Update assembler syntax description

Summary of changes:
- Enable register tuples with 9, 10, 11 and 12 registers (https://reviews.llvm.org/D138205).
- Small improvements and clarifications.
- Correct typos.

Added: 
    

Modified: 
    llvm/docs/AMDGPUInstructionNotation.rst
    llvm/docs/AMDGPUInstructionSyntax.rst
    llvm/docs/AMDGPUModifierSyntax.rst
    llvm/docs/AMDGPUOperandSyntax.rst

Removed: 
    


################################################################################
diff --git a/llvm/docs/AMDGPUInstructionNotation.rst b/llvm/docs/AMDGPUInstructionNotation.rst
index 21b4d3bea3407..42e0c16235f4a 100644
--- a/llvm/docs/AMDGPUInstructionNotation.rst
+++ b/llvm/docs/AMDGPUInstructionNotation.rst
@@ -10,10 +10,10 @@ AMDGPU Instructions Notation
 Introduction
 ============
 
-This is an overview of notation used to describe syntax of AMDGPU assembler instructions.
+This is an overview of notation used to describe the syntax of AMDGPU assembler instructions.
 
-This notation mimics the :ref:`syntax of assembler instructions<amdgpu_syn_instructions>`
-except that instead of real operands and modifiers it provides references to their description.
+This notation looks a lot like the :ref:`syntax of assembler instructions<amdgpu_syn_instructions>`,
+except that instead of real operands and modifiers, it uses references to their descriptions.
 
 Instructions
 ============
@@ -23,7 +23,9 @@ Notation
 
 This is the notation used to describe AMDGPU instructions:
 
-    ``<``\ :ref:`opcode description<amdgpu_syn_opcode_notation>`\ ``>  <``\ :ref:`operands description<amdgpu_syn_instruction_operands_notation>`\ ``>  <``\ :ref:`modifiers description<amdgpu_syn_instruction_modifiers_notation>`\ ``>``
+  | ``<``\ :ref:`opcode description<amdgpu_syn_opcode_notation>`\ ``>
+      <``\ :ref:`operands description<amdgpu_syn_instruction_operands_notation>`\ ``>
+      <``\ :ref:`modifiers description<amdgpu_syn_instruction_modifiers_notation>`\ ``>``
 
 .. _amdgpu_syn_opcode_notation:
 
@@ -42,7 +44,8 @@ Operands
 
 An instruction may have zero or more *operands*. They are comma-separated in the description:
 
-    ``<``\ :ref:`description of operand 0<amdgpu_syn_instruction_operand_notation>`\ ``>, <``\ :ref:`description of operand 1<amdgpu_syn_instruction_operand_notation>`\ ``>, ...``
+  | ``<``\ :ref:`description of operand 0<amdgpu_syn_instruction_operand_notation>`\ ``>,
+      <``\ :ref:`description of operand 1<amdgpu_syn_instruction_operand_notation>`\ ``>, ...``
 
 The order of *operands* is fixed. *Operands* cannot be omitted
 except for special cases described below.
@@ -60,7 +63,8 @@ Where:
 
 * *kind* is an optional prefix describing operand :ref:`kind<amdgpu_syn_instruction_operand_kinds>`.
 * *name* is a link to a description of the operand.
-* *tags* are optional. They are used to indicate :ref:`special operand properties<amdgpu_syn_instruction_operand_tags>`.
+* *tags* are optional. They are used to indicate
+  :ref:`special operand properties<amdgpu_syn_instruction_operand_tags>`.
 
 .. _amdgpu_syn_instruction_operand_kinds:
 
@@ -70,8 +74,8 @@ Operand Kinds
 Operand kind indicates which values are accepted by the operand.
 
 * Operands which only accept *vector* registers are labelled with 'v' prefix.
-* Operands which only accept *scalar* values are labelled with 's' prefix.
-* Operands which accept both *vector* registers and *scalar* values have no prefix.
+* Operands which only accept *scalar* registers and values are labelled with 's' prefix.
+* Operands which accept any registers and values have no prefix.
 
 Examples:
 
@@ -79,7 +83,7 @@ Examples:
 
     vdata          // operand only accepts vector registers
     sdst           // operand only accepts scalar registers
-    src1           // operand accepts both scalar and vector registers
+    src1           // operand accepts vector registers, scalar registers, and scalar values
 
 .. _amdgpu_syn_instruction_operand_tags:
 
@@ -92,16 +96,16 @@ Operand tags indicate special operand properties.
     Operand tag    Meaning
     ============== =================================================================================
     :opt           An optional operand.
-    :m             An operand which may be used with
-                   :ref:`VOP3 operand modifiers<amdgpu_synid_vop3_operand_modifiers>` or
-                   :ref:`SDWA operand modifiers<amdgpu_synid_sdwa_operand_modifiers>`.
-    :dst           An input operand which may also serve as a destination
+    :m             An operand which may be used with operand modifiers
+                   :ref:`abs<amdgpu_synid_abs>`, :ref:`neg<amdgpu_synid_neg>` or
+                   :ref:`sext<amdgpu_synid_sext>`.
+    :dst           An input operand which is also used as a destination
                    if :ref:`glc<amdgpu_synid_glc>` modifier is specified.
-    :fx            This is an *f32* or *f16* operand depending on
+    :fx            This is a *f32* or *f16* operand, depending on
                    :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>` modifier.
-    :<type>        Operand *type* differs from *type*
+    :<type>        The operand *type* differs from the *type*
                    :ref:`implied by the opcode name<amdgpu_syn_instruction_type>`.
-                   This tag specifies actual operand *type*.
+                   This tag specifies the actual operand *type*.
     ============== =================================================================================
 
 Examples:
@@ -119,7 +123,8 @@ Modifiers
 
 An instruction may have zero or more optional *modifiers*. They are space-separated in the description:
 
-    ``<``\ :ref:`description of modifier 0<amdgpu_syn_instruction_modifier_notation>`\ ``> <``\ :ref:`description of modifier 1<amdgpu_syn_instruction_modifier_notation>`\ ``> ...``
+  | ``<``\ :ref:`description of modifier 0<amdgpu_syn_instruction_modifier_notation>`\ ``>
+      <``\ :ref:`description of modifier 1<amdgpu_syn_instruction_modifier_notation>`\ ``> ...``
 
 The order of *modifiers* is fixed.
 
@@ -132,4 +137,4 @@ A *modifier* is described using the following notation:
 
     *<name>*
 
-Where *name* is a link to a description of the *modifier*.
+Where the *name* is a link to a description of the *modifier*.

diff --git a/llvm/docs/AMDGPUInstructionSyntax.rst b/llvm/docs/AMDGPUInstructionSyntax.rst
index b4c984745b649..1f5483ed6fc15 100644
--- a/llvm/docs/AMDGPUInstructionSyntax.rst
+++ b/llvm/docs/AMDGPUInstructionSyntax.rst
@@ -15,9 +15,10 @@ Syntax
 
 An instruction has the following syntax:
 
-    ``<``\ *opcode mnemonic*\ ``>    <``\ *operand0*\ ``>, <``\ *operand1*\ ``>,...    <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...``
+  | ``<``\ *opcode mnemonic*\ ``>    <``\ *operand0*\ ``>,
+      <``\ *operand1*\ ``>,...    <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...``
 
-:doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated while
+:doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated, while
 :doc:`modifiers<AMDGPUModifierSyntax>` are space-separated.
 
 The order of *operands* and *modifiers* is fixed.
@@ -28,7 +29,8 @@ Most *modifiers* are optional and may be omitted.
 Opcode Mnemonic
 ~~~~~~~~~~~~~~~
 
-Opcode mnemonic describes opcode semantics and may include one or more suffices in this order:
+Opcode mnemonic describes opcode semantics
+and may include one or more suffices in this order:
 
 * :ref:`Packing suffix<amdgpu_syn_instruction_pk>`.
 * :ref:`Destination operand type suffix<amdgpu_syn_instruction_type>`.
@@ -81,7 +83,7 @@ The following table enumerates the most frequently used type suffices.
     ============================================ ======================= ============================
 
 Instructions which have no type suffices are assumed to operate with typeless data.
-The size of data is specified by size suffices:
+The size of typeless data is specified by size suffices:
 
     ================= =================== =====================================
     Size Suffix       Implied data type   Required register size in dwords
@@ -103,8 +105,8 @@ The size of data is specified by size suffices:
     ================= =================== =====================================
 
 .. WARNING::
-    There are exceptions from rules described above.
-    Operands which have type different from type specified by the opcode are
+    There are exceptions to the rules described above.
+    Operands which have a type different from the type specified by the opcode are
     :ref:`tagged<amdgpu_syn_instruction_operand_tags>` in the description.
 
 Examples of instructions with different types of source and destination operands:
@@ -144,7 +146,9 @@ Encoding Suffices
 Most *VOP1*, *VOP2* and *VOPC* instructions have several variants:
 they may also be encoded in *VOP3*, *DPP* and *SDWA* formats.
 
-The assembler will automatically use optimal encoding based on instruction operands.
+The assembler selects an optimal encoding automatically
+based on instruction operands and modifiers,
+unless a specific encoding is explicitly requested.
 To force specific encoding, one can add a suffix to the opcode of the instruction:
 
     =================================================== =================
@@ -156,8 +160,8 @@ To force specific encoding, one can add a suffix to the opcode of the instructio
     *SDWA* encoding                                     _sdwa
     =================================================== =================
 
-These suffices are used in this reference to indicate the assumed encoding.
-When no suffix is specified, native instruction encoding is implied.
+This reference uses encoding suffices to specify which encoding is implied.
+When no suffix is specified, native instruction encoding is assumed.
 
 Operands
 ========
@@ -165,9 +169,9 @@ Operands
 Syntax
 ~~~~~~
 
-Syntax of generic operands is described :doc:`in this document<AMDGPUOperandSyntax>`.
+The syntax of generic operands is described :doc:`in this document<AMDGPUOperandSyntax>`.
 
-For detailed information about operands follow *operand links* in GPU-specific documents.
+For detailed information about operands, follow *operand links* in GPU-specific documents.
 
 Modifiers
 =========
@@ -175,6 +179,7 @@ Modifiers
 Syntax
 ~~~~~~
 
-Syntax of modifiers is described :doc:`in this document<AMDGPUModifierSyntax>`.
+The syntax of modifiers is described :doc:`in this document<AMDGPUModifierSyntax>`.
 
-Information about modifiers supported for individual instructions may be found in GPU-specific documents.
+Information about modifiers supported for individual instructions
+may be found in GPU-specific documents.

diff --git a/llvm/docs/AMDGPUModifierSyntax.rst b/llvm/docs/AMDGPUModifierSyntax.rst
index 8a30bf1593e2c..dd9cbaa532652 100644
--- a/llvm/docs/AMDGPUModifierSyntax.rst
+++ b/llvm/docs/AMDGPUModifierSyntax.rst
@@ -14,7 +14,7 @@ The following notation is used throughout this document:
     Notation            Description
     =================== =============================================================
     {0..N}              Any integer value in the range from 0 to N (inclusive).
-    <x>                 Syntax and meaning of *x* is explained elsewhere.
+    <x>                 Syntax and meaning of *x* are explained elsewhere.
     =================== =============================================================
 
 .. _amdgpu_syn_modifiers:
@@ -30,7 +30,7 @@ DS Modifiers
 offset0
 ~~~~~~~
 
-Specifies first 8-bit offset, in bytes. The default value is 0.
+Specifies the first 8-bit offset, in bytes. The default value is 0.
 
 Used with DS instructions that expect two addresses.
 
@@ -55,7 +55,7 @@ Examples:
 offset1
 ~~~~~~~
 
-Specifies second 8-bit offset, in bytes. The default value is 0.
+Specifies the second 8-bit offset, in bytes. The default value is 0.
 
 Used with DS instructions that expect two addresses.
 
@@ -105,11 +105,9 @@ Examples:
 swizzle pattern
 ~~~~~~~~~~~~~~~
 
-This is a special modifier which may be used with *ds_swizzle_b32* instruction only.
+This is a special modifier that may be used with *ds_swizzle_b32* instruction only.
 It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
 
-See AMD documentation for more information.
-
     ======================================================= ===========================================================
     Syntax                                                  Description
     ======================================================= ===========================================================
@@ -122,7 +120,7 @@ See AMD documentation for more information.
                                                             The pattern converts a 5-bit lane *id* to another
                                                             lane *id* with which the lane interacts.
 
-                                                            *mask* is a 5 character sequence which
+                                                            The *mask* is a 5-character sequence which
                                                             specifies how to transform the bits of the
                                                             lane *id*.
 
@@ -145,7 +143,7 @@ See AMD documentation for more information.
                                                             size and must be equal to 2, 4, 8, 16 or 32.
 
                                                             The second numeric parameter is an index of the
-                                                            lane being broadcasted.
+                                                            lane being broadcast.
 
                                                             The index must not exceed group size.
     offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.
@@ -157,7 +155,8 @@ See AMD documentation for more information.
                                                             Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
     ======================================================= ===========================================================
 
-Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
 
 Examples:
@@ -195,7 +194,7 @@ done
 ~~~~
 
 Specifies if this is the last export from the shader to the target. By default,
-*exp* instruction does not finish an export sequence.
+an *export* instruction does not finish an export sequence.
 
     ======================================== ================================================
     Syntax                                   Description
@@ -208,12 +207,12 @@ Specifies if this is the last export from the shader to the target. By default,
 compr
 ~~~~~
 
-Indicates if the data are compressed (data are not compressed by default).
+Indicates if the data is compressed (data is not compressed by default).
 
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
-    compr                                    Data are compressed.
+    compr                                    Data is compressed.
     ======================================== ================================================
 
 .. _amdgpu_synid_vm:
@@ -221,12 +220,14 @@ Indicates if the data are compressed (data are not compressed by default).
 vm
 ~~
 
-Specifies valid mask flag state (off by default).
+Specifies if the :ref:`exec<amdgpu_synid_exec>` mask is valid for this *export* instruction
+(the mask is not valid by default).
 
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
-    vm                                       Set valid mask flag.
+    vm                                       Set the flag indicating a valid
+                                             :ref:`exec<amdgpu_synid_exec>` mask.
     ======================================== ================================================
 
 FLAT Modifiers
@@ -239,8 +240,6 @@ offset12
 
 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 
-Cannot be used with *global/scratch* opcodes. GFX9 only.
-
     ================= ====================================================================
     Syntax            Description
     ================= ====================================================================
@@ -263,8 +262,6 @@ offset13s
 
 Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
 
-Can be used with *global/scratch* opcodes only. GFX9 only.
-
     ===================== ====================================================================
     Syntax                Description
     ===================== ====================================================================
@@ -288,10 +285,6 @@ offset12s
 
 Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.
 
-Can be used with *global/scratch* opcodes only.
-
-GFX10 only.
-
     ===================== ====================================================================
     Syntax                Description
     ===================== ====================================================================
@@ -315,10 +308,6 @@ offset11
 
 Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.
 
-Cannot be used with *global/scratch* opcodes.
-
-GFX10 only.
-
     ================= ====================================================================
     Syntax            Description
     ================= ====================================================================
@@ -337,7 +326,7 @@ Examples:
 dlc
 ~~~
 
-See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+See a description :ref:`here<amdgpu_synid_dlc>`.
 
 glc
 ~~~
@@ -347,7 +336,7 @@ See a description :ref:`here<amdgpu_synid_glc>`.
 lds
 ~~~
 
-See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only.
+See a description :ref:`here<amdgpu_synid_lds>`.
 
 slc
 ~~~
@@ -387,8 +376,8 @@ MIMG Modifiers
 dmask
 ~~~~~
 
-Specifies which channels (image components) are used by the operation. By default, no channels
-are used.
+Specifies which channels (image components) are used by the operation.
+By default, no channels are used.
 
     =============== ====================================================================
     Syntax          Description
@@ -399,11 +388,11 @@ are used.
 
                     Each bit corresponds to one of 4 image components (RGBA).
 
-                    If the specified bit value is 0, the component is not used,
-                    value 1 means that the component is used.
+                    If the specified bit value is 0, the image component is not used,
+                    while value 1 means that the component is used.
     =============== ====================================================================
 
-This modifier has some limitations depending on instruction kind:
+This modifier has some limitations depending on the instruction kind:
 
     =================================================== ========================
     Instruction Kind                                    Valid dmask Values
@@ -434,7 +423,7 @@ Specifies whether the address is normalized or not (the address is normalized by
     ======================== ========================================
     Syntax                   Description
     ======================== ========================================
-    unorm                    Force the address to be unnormalized.
+    unorm                    Force the address to be not normalized.
     ======================== ========================================
 
 glc
@@ -454,15 +443,14 @@ r128
 
 Specifies texture resource size. The default size is 256 bits.
 
-GFX7, GFX8 and GFX10 only.
-
     =================== ================================================
     Syntax              Description
     =================== ================================================
     r128                Specifies 128 bits texture resource size.
     =================== ================================================
 
-.. WARNING:: Using this modifier should decrease *rsrc* operand size from 8 to 4 dwords, but assembler does not currently support this feature.
+.. WARNING:: Using this modifier shall decrease *rsrc* operand size from 8 to 4 dwords, \
+             but assembler does not currently support this feature.
 
 tfe
 ~~~
@@ -487,12 +475,12 @@ Specifies LOD warning status (LOD warning is disabled by default).
 da
 ~~
 
-Specifies if an array index must be sent to TA. By default, array index is not sent.
+Specifies if an array index must be sent to TA. By default, the array index is not sent.
 
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
-    da                                       Send an array-index to TA.
+    da                                       Send an array index to TA.
     ======================================== ================================================
 
 .. _amdgpu_synid_d16:
@@ -500,7 +488,7 @@ Specifies if an array index must be sent to TA. By default, array index is not s
 d16
 ~~~
 
-Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
+Specifies data size: 16 or 32 bits (32 bits by default).
 
     ======================================== ================================================
     Syntax                                   Description
@@ -511,12 +499,12 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
                                              format before storing it in VGPRs.
 
                                              For stores, convert 16-bit data in VGPRs to
-                                             32 bits before going to memory.
+                                             32 bits before writing the values to memory.
 
                                              Note that GFX8.0 does not support data packing.
                                              Each 16-bit data element occupies 1 VGPR.
 
-                                             GFX8.1, GFX9 and GFX10 support data packing.
+                                             GFX8.1 and GFX9+ support data packing.
                                              Each pair of 16-bit data elements
                                              occupies 1 VGPR.
     ======================================== ================================================
@@ -526,8 +514,7 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
 a16
 ~~~
 
-Specifies size of image address components: 16 or 32 bits (32 bits by default).
-GFX9 and GFX10 only.
+Specifies the size of image address components: 16 or 32 bits (32 bits by default).
 
     ======================================== ================================================
     Syntax                                   Description
@@ -542,8 +529,6 @@ dim
 
 Specifies surface dimension. This is a mandatory modifier. There is no default value.
 
-GFX10 only.
-
     =============================== =========================================================
     Syntax                          Description
     =============================== =========================================================
@@ -576,7 +561,7 @@ for compatibility with SP3 assembler:
 dlc
 ~~~
 
-See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+See a description :ref:`here<amdgpu_synid_dlc>`.
 
 Miscellaneous Modifiers
 -----------------------
@@ -587,11 +572,9 @@ dlc
 ~~~
 
 Controls device level cache policy for memory operations. Used for synchronization.
-When specified, forces operation to bypass device level cache making the operation device
+When specified, forces operation to bypass device level cache, making the operation device
 level coherent. By default, instructions use device level cache.
 
-GFX10 only.
-
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
@@ -603,10 +586,11 @@ GFX10 only.
 glc
 ~~~
 
-This modifier has different meaning for loads, stores, and atomic operations.
-The default value is off (0).
+For atomic opcodes, this modifier indicates that the instruction returns the value from memory
+before the operation. For other opcodes, it is used together with :ref:`slc<amdgpu_synid_slc>`
+to specify cache policy.
 
-See AMD documentation for details.
+The default value is off (0).
 
     ======================================== ================================================
     Syntax                                   Description
@@ -624,7 +608,7 @@ Specifies where to store the result: VGPRs or LDS (VGPRs by default).
     ======================================== ===========================
     Syntax                                   Description
     ======================================== ===========================
-    lds                                      Store result in LDS.
+    lds                                      Store the result in LDS.
     ======================================== ===========================
 
 .. _amdgpu_synid_nv:
@@ -632,14 +616,13 @@ Specifies where to store the result: VGPRs or LDS (VGPRs by default).
 nv
 ~~
 
-Specifies if instruction is operating on non-volatile memory. By default, memory is volatile.
-
-GFX9 only.
+Specifies if the instruction is operating on non-volatile memory.
+By default, memory is volatile.
 
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
-    nv                                       Indicates that instruction operates on
+    nv                                       Indicates that the instruction operates on
                                              non-volatile memory.
     ======================================== ================================================
 
@@ -648,9 +631,7 @@ GFX9 only.
 slc
 ~~~
 
-Specifies cache policy. The default value is off (0).
-
-See AMD documentation for details.
+Controls behavior of L2 cache. The default value is off (0).
 
     ======================================== ================================================
     Syntax                                   Description
@@ -665,8 +646,6 @@ tfe
 
 Controls access to partially resident textures. The default value is off (0).
 
-See AMD documentation for details.
-
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
@@ -678,9 +657,9 @@ See AMD documentation for details.
 sc0
 ~~~
 
-For atomics, sc0 indicates that the atomic operation returns a value.
-For other opcodes is is used together with :ref:`sc1<amdgpu_synid_sc1>` to specify cache
-policy. See AMD documentation for details.
+For atomic opcodes, this modifier indicates that the instruction returns the value from memory
+before the operation. For other opcodes, it is used together with :ref:`sc1<amdgpu_synid_sc1>`
+to specify cache policy.
 
     ======================================== ================================================
     Syntax                                   Description
@@ -723,9 +702,9 @@ MUBUF/MTBUF Modifiers
 idxen
 ~~~~~
 
-Specifies whether address components include an index. By default, no components are used.
+Specifies whether address components include an index. By default, the index is not used.
 
-Can be used together with :ref:`offen<amdgpu_synid_offen>`.
+May be used together with :ref:`offen<amdgpu_synid_offen>`.
 
 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 
@@ -740,9 +719,9 @@ Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 offen
 ~~~~~
 
-Specifies whether address components include an offset. By default, no components are used.
+Specifies whether address components include an offset. By default, the offset is not used.
 
-Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
+May be used together with :ref:`idxen<amdgpu_synid_idxen>`.
 
 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 
@@ -759,7 +738,7 @@ addr64
 
 Specifies whether a 64-bit address is used. By default, no address is used.
 
-GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
+Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
 :ref:`idxen<amdgpu_synid_idxen>` modifiers.
 
     ======================================== ================================================
@@ -808,7 +787,7 @@ See a description :ref:`here<amdgpu_synid_lds>`.
 dlc
 ~~~
 
-See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+See a description :ref:`here<amdgpu_synid_dlc>`.
 
 tfe
 ~~~
@@ -827,15 +806,15 @@ The default data format is BUF_DATA_FORMAT_8.
     ========================================= ===============================================================
     Syntax                                    Description
     ========================================= ===============================================================
-    format:{0..127}                           Use format specified as either an
+    format:{0..127}                           Use a format specified as either an
                                               :ref:`integer number<amdgpu_synid_integer_number>` or an
                                               :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
     format:[<data format>]                    Use the specified data format and
                                               default numeric format.
     format:[<numeric format>]                 Use the specified numeric format and
                                               default data format.
-    format:[<data format>, <numeric format>]  Use the specified data and numeric formats.
-    format:[<numeric format>, <data format>]  Use the specified data and numeric formats.
+    format:[<data format>,<numeric format>]   Use the specified data and numeric formats.
+    format:[<numeric format>,<data format>]   Use the specified data and numeric formats.
     ========================================= ===============================================================
 
 .. _amdgpu_synid_format_data:
@@ -846,7 +825,7 @@ Supported data formats are defined in the following table:
     Syntax                                    Note
     ========================================= ===============================
     BUF_DATA_FORMAT_INVALID
-    BUF_DATA_FORMAT_8                         Default value.
+    BUF_DATA_FORMAT_8                         The default value.
     BUF_DATA_FORMAT_16
     BUF_DATA_FORMAT_8_8
     BUF_DATA_FORMAT_32
@@ -870,7 +849,7 @@ Supported numeric formats are defined below:
     ========================================= ===============================
     Syntax                                    Note
     ========================================= ===============================
-    BUF_NUM_FORMAT_UNORM                      Default value.
+    BUF_NUM_FORMAT_UNORM                      The default value.
     BUF_NUM_FORMAT_SNORM
     BUF_NUM_FORMAT_USCALED
     BUF_NUM_FORMAT_SSCALED
@@ -898,29 +877,28 @@ ufmt
 
 Specifies a unified format used by the operation.
 The default format is BUF_FMT_8_UNORM.
-GFX10 only.
 
     ========================================= ===============================================================
     Syntax                                    Description
     ========================================= ===============================================================
-    format:{0..127}                           Use unified format specified as either an
+    format:{0..127}                           Use a unified format specified as either an
                                               :ref:`integer number<amdgpu_synid_integer_number>` or an
                                               :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
-                                              Note that unified format numbers are not compatible with
+                                              Note that unified format numbers are incompatible with
                                               format numbers used for pre-GFX10 ISA.
     format:[<unified format>]                 Use the specified unified format.
     ========================================= ===============================================================
 
 Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>`
 and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA,
-:ref:`syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
+:ref:`the syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
 provided that the combination of formats can be mapped to a unified format.
 
 Supported unified formats and equivalent combinations of data and numeric formats
 are defined below:
 
     ============================== ============================== =============================
-    Syntax                         Equivalent Data Format         Equivalent Numeric Format
+    Unified Format Syntax          Equivalent Data Format         Equivalent Numeric Format
     ============================== ============================== =============================
     BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM
 
@@ -1033,12 +1011,12 @@ See a description :ref:`here<amdgpu_synid_glc>`.
 nv
 ~~
 
-See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only.
+See a description :ref:`here<amdgpu_synid_nv>`.
 
 dlc
 ~~~
 
-See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
+See a description :ref:`here<amdgpu_synid_dlc>`.
 
 .. _amdgpu_synid_smem_offset20u:
 
@@ -1095,19 +1073,16 @@ high
 ~~~~
 
 Specifies which half of the LDS word to use. Low half of LDS word is used by default.
-GFX9 and GFX10 only.
 
     ======================================== ================================
     Syntax                                   Description
     ======================================== ================================
-    high                                     Use high half of LDS word.
+    high                                     Use the high half of LDS word.
     ======================================== ================================
 
 DPP8 Modifiers
 --------------
 
-GFX10 only.
-
 .. _amdgpu_synid_dpp8_sel:
 
 dpp8_sel
@@ -1116,11 +1091,9 @@ dpp8_sel
 Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
 There is no default value.
 
-GFX10 only.
-
 The *dpp8_sel* modifier must specify exactly 8 values.
-First value selects which lane to read from to supply data into lane 0.
-Second value controls lane 1 and so on.
+The first value selects which lane to read from to supply data into lane 0.
+The second value controls lane 1 and so on.
 
 Each value may be specified as either
 an :ref:`integer number<amdgpu_synid_integer_number>` or
@@ -1148,42 +1121,37 @@ Controls interaction with inactive lanes for *dpp8* instructions. The default va
 
 Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
 
-GFX10 only.
-
     ==================================== =====================================================
     Syntax                               Description
     ==================================== =====================================================
     fi:0                                 Fetch zero when accessing data from inactive lanes.
-    fi:1                                 Fetch pre-exist values from inactive lanes.
+    fi:1                                 Fetch pre-existing values from inactive lanes.
     ==================================== =====================================================
 
-Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
 
 DPP Modifiers
 -------------
 
-GFX8, GFX9 and GFX10 only.
-
 .. _amdgpu_synid_dpp_ctrl:
 
 dpp_ctrl
 ~~~~~~~~
 
-Specifies how data are shared between threads. This is a mandatory modifier.
+Specifies how data is shared between threads. This is a mandatory modifier.
 There is no default value.
 
-GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
-
 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
 
-    ======================================== ================================================
+    ======================================== ========================================================
     Syntax                                   Description
-    ======================================== ================================================
+    ======================================== ========================================================
     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
     row_mirror                               Mirror threads within row.
     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
-    row_bcast:15                             Broadcast 15th thread of each row to next row.
+    row_bcast:15                             Broadcast the 15th thread of each row to the next row.
     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
     wave_shl:1                               Wavefront left shift by 1 thread.
     wave_rol:1                               Wavefront left rotate by 1 thread.
@@ -1192,7 +1160,7 @@ Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
     row_shl:{1..15}                          Row shift left by 1-15 threads.
     row_shr:{1..15}                          Row shift right by 1-15 threads.
     row_ror:{1..15}                          Row rotate right by 1-15 threads.
-    ======================================== ================================================
+    ======================================== ========================================================
 
 Note: numeric values may be specified as either
 :ref:`integer numbers<amdgpu_synid_integer_number>` or
@@ -1210,27 +1178,25 @@ Examples:
 dpp16_ctrl
 ~~~~~~~~~~
 
-Specifies how data are shared between threads. This is a mandatory modifier.
+Specifies how data is shared between threads. This is a mandatory modifier.
 There is no default value.
 
-GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
-
 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
 (There are only two rows in *wave32* mode.)
 
-    ======================================== ====================================================
+    ======================================== =======================================================
     Syntax                                   Description
-    ======================================== ====================================================
+    ======================================== =======================================================
     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
     row_mirror                               Mirror threads within row.
     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
     row_share:{0..15}                        Share the value from the specified lane with other
                                              lanes in the row.
-    row_xmask:{0..15}                        Fetch from XOR(current lane id, specified lane id).
+    row_xmask:{0..15}                        Fetch from XOR(<current lane id>,<specified lane id>).
     row_shl:{1..15}                          Row shift left by 1-15 threads.
     row_shr:{1..15}                          Row shift right by 1-15 threads.
     row_ror:{1..15}                          Row rotate right by 1-15 threads.
-    ======================================== ====================================================
+    ======================================== =======================================================
 
 Note: numeric values may be specified as either
 :ref:`integer numbers<amdgpu_synid_integer_number>` or
@@ -1248,20 +1214,18 @@ Examples:
 dpp32_ctrl
 ~~~~~~~~~~
 
-Specifies how data are shared between threads. This is a mandatory modifier.
+Specifies how data is shared between threads. This is a mandatory modifier.
 There is no default value.
 
-May be used only with GFX90A 32-bit instructions.
-
 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
 
-    ======================================== ==================================================
+    ======================================== =========================================================
     Syntax                                   Description
-    ======================================== ==================================================
+    ======================================== =========================================================
     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
     row_mirror                               Mirror threads within row.
     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
-    row_bcast:15                             Broadcast 15th thread of each row to next row.
+    row_bcast:15                             Broadcast the 15th thread of each row to the next row.
     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
     wave_shl:1                               Wavefront left shift by 1 thread.
     wave_rol:1                               Wavefront left rotate by 1 thread.
@@ -1271,7 +1235,7 @@ Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
     row_shr:{1..15}                          Row shift right by 1-15 threads.
     row_ror:{1..15}                          Row rotate right by 1-15 threads.
     row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
-    ======================================== ==================================================
+    ======================================== =========================================================
 
 Note: numeric values may be specified as either
 :ref:`integer numbers<amdgpu_synid_integer_number>` or
@@ -1290,11 +1254,9 @@ Examples:
 dpp64_ctrl
 ~~~~~~~~~~
 
-Specifies how data are shared between threads. This is a mandatory modifier.
+Specifies how data is shared between threads. This is a mandatory modifier.
 There is no default value.
 
-May be used only with GFX90A 64-bit instructions.
-
 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
 
     ======================================== ==================================================
@@ -1331,10 +1293,10 @@ Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
                       :ref:`integer number <amdgpu_synid_integer_number>`
                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 
-                      Each of 4 bits in the mask controls one row
+                      Each of the 4 bits in the mask controls one row
                       (0 - disabled, 1 - enabled).
 
-                      In *wave32* mode the values should be limited to 0..7.
+                      In *wave32* mode, the values shall be limited to {0..7}.
     ================= ====================================================================
 
 Examples:
@@ -1362,7 +1324,7 @@ Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
                        :ref:`integer number <amdgpu_synid_integer_number>`
                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 
-                       Each of 4 bits in the mask controls one bank
+                       Each of the 4 bits in the mask controls one bank
                        (0 - disabled, 1 - enabled).
     ================== ====================================================================
 
@@ -1391,6 +1353,8 @@ invalid lanes is disabled.
                                              return zero.
     ======================================== ================================================
 
+.. WARNING:: For historical reasons, *bound_ctrl:0* has the same meaning as *bound_ctrl:1*.
+
 .. _amdgpu_synid_fi16:
 
 fi
@@ -1400,25 +1364,22 @@ Controls interaction with *inactive* lanes for *dpp16* instructions. The default
 
 Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
 
-GFX10 only.
-
     ======================================== ==================================================
     Syntax                                   Description
     ======================================== ==================================================
     fi:0                                     Interaction with inactive lanes is controlled by
                                              :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
 
-    fi:1                                     Fetch pre-exist values from inactive lanes.
+    fi:1                                     Fetch pre-existing values from inactive lanes.
     ======================================== ==================================================
 
-Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
 
 SDWA Modifiers
 --------------
 
-GFX8, GFX9 and GFX10 only.
-
 clamp
 ~~~~~
 
@@ -1429,8 +1390,6 @@ omod
 
 See a description :ref:`here<amdgpu_synid_omod>`.
 
-GFX9 and GFX10 only.
-
 .. _amdgpu_synid_dst_sel:
 
 dst_sel
@@ -1512,8 +1471,6 @@ SDWA Operand Modifiers
 
 Operand modifiers are not used separately. They are applied to source operands.
 
-GFX8, GFX9 and GFX10 only.
-
 abs
 ~~~
 
@@ -1529,8 +1486,7 @@ See a description :ref:`here<amdgpu_synid_neg>`.
 sext
 ~~~~
 
-Sign-extends value of a (sub-dword) operand to fill all 32 bits.
-Has no effect for 32-bit operands.
+Sign-extends the value of a (sub-dword) integer operand to fill all 32 bits.
 
 Valid for integer operands only.
 
@@ -1559,15 +1515,13 @@ Selects the low [15:0] or high [31:16] operand bits for source and destination o
 By default, low bits are used for all operands.
 
 The number of values specified with the op_sel modifier must match the number of instruction
-operands (both source and destination). First value controls src0, second value controls src1
+operands (both source and destination). The first value controls src0, the second value controls src1
 and so on, except that the last value controls destination.
 The value 0 selects the low bits, while 1 selects the high bits.
 
-Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
+Note: op_sel modifier affects 16-bit operands only. For 32-bit operands, the value specified
 by op_sel must be 0.
 
-GFX9 and GFX10 only.
-
     ======================================== ============================================================
     Syntax                                   Description
     ======================================== ============================================================
@@ -1592,18 +1546,16 @@ Examples:
 dpp_op_sel
 ~~~~~~~~~~
 
-Special version of *op_sel* used for *permlane* opcodes to specify
+This is a special version of *op_sel* used for *permlane* opcodes to specify
 dpp-like mode bits - :ref:`fi<amdgpu_synid_fi16>` and
 :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
 
-GFX10 only.
-
-    ======================================== ============================================================
+    ======================================== =================================================================
     Syntax                                   Description
-    ======================================== ============================================================
-    op_sel:[{0..1},{0..1}]                   First bit specifies :ref:`fi<amdgpu_synid_fi16>`, second
+    ======================================== =================================================================
+    op_sel:[{0..1},{0..1}]                   The first bit specifies :ref:`fi<amdgpu_synid_fi16>`, the second
                                              bit specifies :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
-    ======================================== ============================================================
+    ======================================== =================================================================
 
 Note: numeric values may be specified as either
 :ref:`integer numbers<amdgpu_synid_integer_number>` or
@@ -1623,14 +1575,12 @@ clamp
 Clamp meaning depends on instruction.
 
 For *v_cmp* instructions, clamp modifier indicates that the compare signals
-if a floating point exception occurs. By default, signaling is disabled.
-Not supported by GFX7.
+if a floating-point exception occurs. By default, signaling is disabled.
 
 For integer operations, clamp modifier indicates that the result must be clamped
 to the largest and smallest representable value. By default, there is no clamping.
-Integer clamping is not supported by GFX7.
 
-For floating point operations, clamp modifier indicates that the result must be clamped
+For floating-point operations, clamp modifier indicates that the result must be clamped
 to the range [0.0, 1.0]. By default, there is no clamping.
 
 Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
@@ -1647,16 +1597,12 @@ omod
 ~~~~
 
 Specifies if an output modifier must be applied to the result.
+It is assumed that the result is a floating-point number.
+
 By default, no output modifiers are applied.
 
 Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
 
-Output modifiers are valid for f32 and f64 floating point results only.
-They must not be used with f16.
-
-Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result
-but accepts output modifiers.
-
     ======================================== ================================================
     Syntax                                   Description
     ======================================== ================================================
@@ -1665,7 +1611,8 @@ but accepts output modifiers.
     div:2                                    Multiply the result by 0.5.
     ======================================== ================================================
 
-Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
 
 Examples:
@@ -1688,7 +1635,7 @@ abs
 ~~~
 
 Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
-(if any). Valid for floating point operands only.
+(if any). Valid for floating-point operands only.
 
     ======================================== ====================================================
     Syntax                                   Description
@@ -1698,7 +1645,7 @@ Computes the absolute value of its operand. Must be applied before :ref:`neg<amd
     ======================================== ====================================================
 
 Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
-may be misinterpreted. Such operands should be enclosed into additional parentheses as shown
+may be misinterpreted. Such operands should be enclosed into additional parentheses, as shown
 in examples below.
 
 Examples:
@@ -1716,25 +1663,25 @@ neg
 ~~~
 
 Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
-(if any). Valid for floating point operands only.
+(if any). Valid for floating-point operands only.
 
     ================== ====================================================
     Syntax             Description
     ================== ====================================================
     neg(<operand>)     Get the negative value of a floating-point operand.
-                       The operand may include an optional
-                       :ref:`abs<amdgpu_synid_abs>` modifier.
+                       An optional :ref:`abs<amdgpu_synid_abs>` modifier
+                       may be applied to the operand before negation.
     -<operand>         The same as above (an SP3 syntax).
     ================== ====================================================
 
 Note: SP3 syntax is supported with limitations because of a potential ambiguity.
-Currently it is allowed in the following cases:
+Currently, it is allowed in the following cases:
 
 * Before a register.
 * Before an :ref:`abs<amdgpu_synid_abs>` modifier.
 * Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
 
-In all other cases "-" is handled as a part of an expression that follows the sign.
+In all other cases, "-" is handled as a part of an expression that follows the sign.
 
 Examples:
 
@@ -1748,7 +1695,7 @@ Examples:
   -abs(v5)
   -\|v5|
 
-  // Operands without negate modifiers
+  // Expressions where "-" has a different meaning
   -1
   -x+y
 
@@ -1760,19 +1707,17 @@ This section describes modifiers of *regular* VOP3P instructions.
 *v_mad_mix\** and *v_fma_mix\**
 instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
 
-GFX9 and GFX10 only.
-
 .. _amdgpu_synid_op_sel:
 
 op_sel
 ~~~~~~
 
-Selects the low [15:0] or high [31:16] operand bits as input to the operation
+Selects the low [15:0] or high [31:16] operand bits as input to the operation,
 which results in the lower-half of the destination.
-By default, low bits are used for all operands.
+By default, low 16 bits are used for all operands.
 
 The number of values specified by the *op_sel* modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 selects the low bits, while 1 selects the high bits.
 
@@ -1800,12 +1745,12 @@ Examples:
 op_sel_hi
 ~~~~~~~~~
 
-Selects the low [15:0] or high [31:16] operand bits as input to the operation
+Selects the low [15:0] or high [31:16] operand bits as input to the operation,
 which results in the upper-half of the destination.
-By default, high bits are used for all operands.
+By default, high 16 bits are used for all operands.
 
 The number of values specified by the *op_sel_hi* modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 selects the low bits, while 1 selects the high bits.
 
@@ -1833,19 +1778,19 @@ Examples:
 neg_lo
 ~~~~~~
 
-Specifies whether to change sign of operand values selected by
+Specifies whether to change the sign of operand values selected by
 :ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
-as input to the operation which results in the upper-half of the destination.
+as input to the operation, which results in the upper-half of the destination.
 
 The number of values specified by this modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 indicates that the corresponding operand value is used unmodified,
-the value 1 indicates that negative value of the operand must be used.
+the value 1 indicates that the negative value of the operand must be used.
 
 By default, operand values are used unmodified.
 
-This modifier is valid for floating point operands only.
+This modifier is valid for floating-point operands only.
 
     ================================ ==================================================================
     Syntax                           Description
@@ -1873,17 +1818,17 @@ neg_hi
 
 Specifies whether to change sign of operand values selected by
 :ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
-as input to the operation which results in the upper-half of the destination.
+as input to the operation, which results in the upper-half of the destination.
 
 The number of values specified by this modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 indicates that the corresponding operand value is used unmodified,
-the value 1 indicates that negative value of the operand must be used.
+the value 1 indicates that the negative value of the operand must be used.
 
 By default, operand values are used unmodified.
 
-This modifier is valid for floating point operands only.
+This modifier is valid for floating-point operands only.
 
     =============================== ==================================================================
     Syntax                          Description
@@ -1920,30 +1865,28 @@ in a manner different from *regular* VOP3P instructions.
 
 See a description below.
 
-GFX9 and GFX10 only.
-
 .. _amdgpu_synid_mad_mix_op_sel:
 
 m_op_sel
 ~~~~~~~~
 
-This operand has meaning only for 16-bit source operands as indicated by
+This operand has meaning only for 16-bit source operands, as indicated by
 :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
 It specifies to select either the low [15:0] or high [31:16] operand bits
 as input to the operation.
 
 The number of values specified by the *op_sel* modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 indicates the low bits, the value 1 indicates the high 16 bits.
 
 By default, low bits are used for all operands.
 
-    =============================== ================================================
+    =============================== ===================================================
     Syntax                          Description
-    =============================== ================================================
-    op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
-    =============================== ================================================
+    =============================== ===================================================
+    op_sel:[{0..1},{0..1},{0..1}]   Select the location of each 16-bit source operand.
+    =============================== ===================================================
 
 Note: numeric values may be specified as either
 :ref:`integer numbers<amdgpu_synid_integer_number>` or
@@ -1964,18 +1907,18 @@ Selects the size of source operands: either 32 bits or 16 bits.
 By default, 32 bits are used for all source operands.
 
 The number of values specified by the *op_sel_hi* modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 indicates 32 bits, the value 1 indicates 16 bits.
 
 The location of 16 bits in the operand may be specified by
 :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
 
-    ======================================== ====================================
+    ======================================== ========================================
     Syntax                                   Description
-    ======================================== ====================================
-    op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
-    ======================================== ====================================
+    ======================================== ========================================
+    op_sel_hi:[{0..1},{0..1},{0..1}]         Select the size of each source operand.
+    ======================================== ========================================
 
 Note: numeric values may be specified as either
 :ref:`integer numbers<amdgpu_synid_integer_number>` or
@@ -2005,8 +1948,6 @@ See a description :ref:`here<amdgpu_synid_clamp>`.
 VOP3P MFMA Modifiers
 --------------------
 
-These modifiers may only be used with GFX908 and GFX90A.
-
 .. _amdgpu_synid_cbsz:
 
 cbsz
@@ -2065,15 +2006,13 @@ neg
 
 Indicates operands that must be negated before the operation.
 The number of values specified by this modifier must match the number of source
-operands. First value controls src0, second value controls src1 and so on.
+operands. The first value controls src0, the second value controls src1 and so on.
 
 The value 0 indicates that the corresponding operand value is used unmodified,
 the value 1 indicates that the operand value must be negated before the operation.
 
 By default, operand values are used unmodified.
 
-This modifier is valid for floating point operands only.
-
     =============================== ==================================================================
     Syntax                          Description
     =============================== ==================================================================

diff --git a/llvm/docs/AMDGPUOperandSyntax.rst b/llvm/docs/AMDGPUOperandSyntax.rst
index acfd1b60ff3af..7aa957870f97e 100644
--- a/llvm/docs/AMDGPUOperandSyntax.rst
+++ b/llvm/docs/AMDGPUOperandSyntax.rst
@@ -14,7 +14,7 @@ The following notation is used throughout this document:
     Notation            Description
     =================== =============================================================================
     {0..N}              Any integer value in the range from 0 to N (inclusive).
-    <x>                 Syntax and meaning of *x* is explained elsewhere.
+    <x>                 Syntax and meaning of *x* are explained elsewhere.
     =================== =============================================================================
 
 .. _amdgpu_syn_operands:
@@ -31,7 +31,7 @@ Vector registers. There are 256 32-bit vector registers.
 
 A sequence of *vector* registers may be used to operate with more than 32 bits of data.
 
-Assembler currently supports sequences of 1, 2, 3, 4, 5, 6, 7, 8, 16 and 32 *vector* registers.
+Assembler currently supports tuples with 1 to 12, 16 and 32 *vector* registers.
 
     =================================================== ====================================================================
     Syntax                                              Description
@@ -61,9 +61,10 @@ Note: *N* and *K* must satisfy the following conditions:
 * *N* <= *K*.
 * 0 <= *N* <= 255.
 * 0 <= *K* <= 255.
-* *K-N+1* must be equal to 1, 2, 3, 4, 5, 6, 7, 8, 16 or 32.
+* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
-GFX90A has an additional alignment requirement: pairs of *vector* registers must be even-aligned
+GFX90A and GFX940 have an additional alignment requirement:
+pairs of *vector* registers must be even-aligned
 (first register must be even).
 
 Examples:
@@ -82,19 +83,20 @@ Examples:
 
 .. _amdgpu_synid_nsa:
 
-GFX10 *Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*:
+GFX10+ *image* instructions may use special *NSA* (Non-Sequential Address)
+syntax for *image addresses*:
 
     ===================================== =================================================
     Syntax                                Description
     ===================================== =================================================
     **[Vm**, \ **Vn**, ... **Vk**\ **]**  A sequence of 32-bit *vector* registers.
-                                          Each register may be specified using syntax
+                                          Each register may be specified using the syntax
                                           defined :ref:`above<amdgpu_synid_v>`.
 
-                                          In contrast with standard syntax, registers
+                                          In contrast with the standard syntax, registers
                                           in *NSA* sequence are not required to have
                                           consecutive indices. Moreover, the same register
-                                          may appear in the list more than once.
+                                          may appear in the sequence more than once.
     ===================================== =================================================
 
 Examples:
@@ -114,10 +116,10 @@ Accumulator registers. There are 256 32-bit accumulator registers.
 
 A sequence of *accumulator* registers may be used to operate with more than 32 bits of data.
 
-Assembler currently supports sequences of 1, 2, 3, 4, 5, 6, 7, 8, 16 and 32 *accumulator* registers.
+Assembler currently supports tuples with 1 to 12, 16 and 32 *accumulator* registers.
 
     =================================================== ========================================================= ====================================================================
-    Syntax                                              An Alternative Syntax (SP3)                               Description
+    Syntax                                              Alternative Syntax (SP3)                                  Description
     =================================================== ========================================================= ====================================================================
     **a**\<N>                                           **acc**\<N>                                               A single 32-bit *accumulator* register.
 
@@ -144,9 +146,10 @@ Note: *N* and *K* must satisfy the following conditions:
 * *N* <= *K*.
 * 0 <= *N* <= 255.
 * 0 <= *K* <= 255.
-* *K-N+1* must be equal to 1, 2, 3, 4, 5, 6, 7, 8, 16 or 32.
+* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
-GFX90A has an additional alignment requirement: pairs of *accumulator* registers must be even-aligned
+GFX90A and GFX940 have an additional alignment requirement:
+pairs of *accumulator* registers must be even-aligned
 (first register must be even).
 
 Examples:
@@ -173,7 +176,7 @@ Examples:
 s
 -
 
-Scalar 32-bit registers. The number of available *scalar* registers depends on GPU:
+Scalar 32-bit registers. The number of available *scalar* registers depends on the GPU:
 
     ======= ============================
     GPU     Number of *scalar* registers
@@ -181,11 +184,11 @@ Scalar 32-bit registers. The number of available *scalar* registers depends on G
     GFX7    104
     GFX8    102
     GFX9    102
-    GFX10   106
+    GFX10+  106
     ======= ============================
 
 A sequence of *scalar* registers may be used to operate with more than 32 bits of data.
-Assembler currently supports sequences of 1, 2, 3, 4, 5, 6, 7, 8, 16 and 32 *scalar* registers.
+Assembler currently supports tuples with 1 to 12, 16 and 32 *scalar* registers.
 
 Pairs of *scalar* registers must be even-aligned (first register must be even).
 Sequences of 4 and more *scalar* registers must be quad-aligned.
@@ -217,11 +220,11 @@ Sequences of 4 and more *scalar* registers must be quad-aligned.
 
 Note: *N* and *K* must satisfy the following conditions:
 
-* *N* must be properly aligned based on sequence size.
+* *N* must be properly aligned based on the sequence size.
 * *N* <= *K*.
 * 0 <= *N* < *SMAX*\ , where *SMAX* is the number of available *scalar* registers.
 * 0 <= *K* < *SMAX*\ , where *SMAX* is the number of available *scalar* registers.
-* *K-N+1* must be equal to 1, 2, 3, 4, 5, 6, 7, 8, 16 or 32.
+* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
 Examples:
 
@@ -261,7 +264,7 @@ ttmp
 ----
 
 Trap handler temporary scalar registers, 32-bits wide.
-The number of available *ttmp* registers depends on GPU:
+The number of available *ttmp* registers depends on the GPU:
 
     ======= ===========================
     GPU     Number of *ttmp* registers
@@ -269,11 +272,11 @@ The number of available *ttmp* registers depends on GPU:
     GFX7    12
     GFX8    12
     GFX9    16
-    GFX10   16
+    GFX10+  16
     ======= ===========================
 
 A sequence of *ttmp* registers may be used to operate with more than 32 bits of data.
-Assembler currently supports sequences of 1, 2, 3, 4, 5, 6, 7, 8 and 16 *ttmp* registers.
+Assembler currently supports tuples with 1 to 12 and 16 *ttmp* registers.
 
 Pairs of *ttmp* registers must be even-aligned (first register must be even).
 Sequences of 4 and more *ttmp* registers must be quad-aligned.
@@ -303,11 +306,11 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned.
 
 Note: *N* and *K* must satisfy the following conditions:
 
-* *N* must be properly aligned based on sequence size.
+* *N* must be properly aligned based on the sequence size.
 * *N* <= *K*.
 * 0 <= *N* < *TMAX*, where *TMAX* is the number of available *ttmp* registers.
 * 0 <= *K* < *TMAX*, where *TMAX* is the number of available *ttmp* registers.
-* *K-N+1* must be equal to 1, 2, 3, 4, 5, 6, 7, 8 or 16.
+* *K-N+1* must be in the range from 1 to 12 or equal to 16.
 
 Examples:
 
@@ -335,7 +338,8 @@ Examples of *ttmp* registers with an invalid alignment:
 tba
 ---
 
-Trap base address, 64-bits wide. Holds the pointer to the current trap handler program.
+Trap base address, 64-bits wide. Holds the pointer to the current
+trap handler program.
 
     ================== ======================================================================= =============
     Syntax             Description                                                             Availability
@@ -356,9 +360,6 @@ High and low 32 bits of *trap base address* may be accessed as separate register
     [tba_hi]           High 32 bits of *trap base address* register (an SP3 syntax).           GFX7, GFX8
     ================== ======================================================================= =============
 
-Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10,
-but *tba* is readable/writable with the help of *s_get_reg* and *s_set_reg* instructions.
-
 .. _amdgpu_synid_tma:
 
 tma
@@ -385,9 +386,6 @@ High and low 32 bits of *trap memory address* may be accessed as separate regist
     [tma_hi]          High 32 bits of *trap memory address* register (an SP3 syntax).         GFX7, GFX8
     ================= ======================================================================= ==================
 
-Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10,
-but *tma* is readable/writable with the help of *s_get_reg* and *s_set_reg* instructions.
-
 .. _amdgpu_synid_flat_scratch:
 
 flat_scratch
@@ -414,10 +412,6 @@ High and low 32 bits of *flat scratch* address may be accessed as separate regis
     [flat_scratch_hi]         High 32 bits of *flat scratch* address register (an SP3 syntax).
     ========================= =========================================================================
 
-Note that *flat_scratch*, *flat_scratch_lo* and *flat_scratch_hi* are not accessible as assembler
-registers in GFX10, but *flat_scratch* is readable/writable with the help of
-*s_get_reg* and *s_set_reg* instructions.
-
 .. _amdgpu_synid_xnack:
 .. _amdgpu_synid_xnack_mask:
 
@@ -427,9 +421,7 @@ xnack_mask
 Xnack mask, 64-bits wide. Holds a 64-bit mask of which threads
 received an *XNACK* due to a vector memory operation.
 
-.. WARNING:: GFX7 does not support *xnack* feature. For availability of this feature in other GPUs, refer :ref:`this table<amdgpu-processors>`.
-
-\
+For availability of *xnack* feature, refer to :ref:`this table<amdgpu-processors>`.
 
     ============================== =====================================================
     Syntax                         Description
@@ -450,10 +442,6 @@ High and low 32 bits of *xnack mask* may be accessed as separate registers:
     [xnack_mask_hi]       High 32 bits of *xnack mask* register (an SP3 syntax).
     ===================== ==============================================================
 
-Note that *xnack_mask*, *xnack_mask_lo* and *xnack_mask_hi* are not accessible as assembler
-registers in GFX10, but *xnack_mask* is readable/writable with the help of
-*s_get_reg* and *s_set_reg* instructions.
-
 .. _amdgpu_synid_vcc:
 .. _amdgpu_synid_vcc_lo:
 
@@ -463,7 +451,7 @@ vcc
 Vector condition code, 64-bits wide. A bit mask with one bit per thread;
 it holds the result of a vector compare operation.
 
-Note that GFX10 H/W does not use high 32 bits of *vcc* in *wave32* mode.
+Note that GFX10+ H/W does not use high 32 bits of *vcc* in *wave32* mode.
 
     ================ =========================================================================
     Syntax           Description
@@ -508,7 +496,7 @@ Execute mask, 64-bits wide. A bit mask with one bit per thread,
 which is applied to vector instructions and controls which threads execute
 and which ignore the instruction.
 
-Note that GFX10 H/W does not use high 32 bits of *exec* in *wave32* mode.
+Note that GFX10+ H/W does not use high 32 bits of *exec* in *wave32* mode.
 
     ===================== =================================================================
     Syntax                Description
@@ -534,18 +522,22 @@ High and low 32 bits of *execute mask* may be accessed as separate registers:
 vccz
 ----
 
-A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>` is all zeros.
+A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>`
+is all zeros.
 
-Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`.
+Note: when GFX10+ operates in *wave32* mode, this register reflects
+the state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`.
 
 .. _amdgpu_synid_execz:
 
 execz
 -----
 
-A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>` is all zeros.
+A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>`
+is all zeros.
 
-Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo<amdgpu_synid_exec>`.
+Note: when GFX10+ operates in *wave32* mode, this register reflects
+the state of :ref:`exec_lo<amdgpu_synid_exec>`.
 
 .. _amdgpu_synid_scc:
 
@@ -567,34 +559,31 @@ fetched from *LDS* memory using :ref:`m0<amdgpu_synid_m0>` as an address.
 null
 ----
 
-This is a special operand which may be used as a source or a destination.
+This is a special operand that may be used as a source or a destination.
 
 When used as a destination, the result of the operation is discarded.
 
 When used as a source, it supplies zero value.
 
-GFX10 only.
-
-.. WARNING:: Due to a H/W bug, this operand cannot be used with VALU instructions in first generation of GFX10.
-
 .. _amdgpu_synid_constant:
 
 inline constant
 ---------------
 
-An *inline constant* is an integer or a floating-point value encoded as a part of an instruction.
-Compare *inline constants* with :ref:`literals<amdgpu_synid_literal>`.
+An *inline constant* is an integer or a floating-point value
+encoded as a part of an instruction. Compare *inline constants*
+with :ref:`literals<amdgpu_synid_literal>`.
 
 Inline constants include:
 
-* :ref:`iconst<amdgpu_synid_iconst>`
-* :ref:`fconst<amdgpu_synid_fconst>`
-* :ref:`ival<amdgpu_synid_ival>`
+* :ref:`Integer inline constants<amdgpu_synid_iconst>`;
+* :ref:`Floating-point inline constants<amdgpu_synid_fconst>`;
+* :ref:`Inline values<amdgpu_synid_ival>`.
 
 If a number may be encoded as either
 a :ref:`literal<amdgpu_synid_literal>` or
 a :ref:`constant<amdgpu_synid_constant>`,
-assembler selects the latter encoding as more efficient.
+the assembler selects the latter encoding as more efficient.
 
 .. _amdgpu_synid_iconst:
 
@@ -607,7 +596,7 @@ encoded as an *inline constant*.
 
 Only a small fraction of integer numbers may be encoded as *inline constants*.
 They are enumerated in the table below.
-Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
+Other integer numbers are encoded as :ref:`literals<amdgpu_synid_literal>`.
 
     ================================== ====================================
     Value                              Note
@@ -616,8 +605,6 @@ Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>
     {-16..-1}                          Negative integer inline constants.
     ================================== ====================================
 
-.. WARNING:: GFX7 does not support inline constants for *f16* operands.
-
 .. _amdgpu_synid_fconst:
 
 fconst
@@ -626,9 +613,10 @@ fconst
 A :ref:`floating-point number<amdgpu_synid_floating-point_number>`
 encoded as an *inline constant*.
 
-Only a small fraction of floating-point numbers may be encoded as *inline constants*.
-They are enumerated in the table below.
-Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
+Only a small fraction of floating-point numbers may be encoded
+as *inline constants*. They are enumerated in the table below.
+Other floating-point numbers are encoded as
+:ref:`literals<amdgpu_synid_literal>`.
 
     ===================== ===================================================== ==================
     Value                 Note                                                  Availability
@@ -642,15 +630,13 @@ Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_l
     -1.0                  Floating-point constant -1.0                          All GPUs
     -2.0                  Floating-point constant -2.0                          All GPUs
     -4.0                  Floating-point constant -4.0                          All GPUs
-    0.1592                1.0/(2.0*pi). Use only for 16-bit operands.           GFX8, GFX9, GFX10
-    0.15915494            1.0/(2.0*pi). Use only for 16- and 32-bit operands.   GFX8, GFX9, GFX10
-    0.15915494309189532   1.0/(2.0*pi).                                         GFX8, GFX9, GFX10
+    0.1592                1.0/(2.0*pi). Use only for 16-bit operands.           GFX8+
+    0.15915494            1.0/(2.0*pi). Use only for 16- and 32-bit operands.   GFX8+
+    0.15915494309189532   1.0/(2.0*pi).                                         GFX8+
     ===================== ===================================================== ==================
 
 .. WARNING:: Floating-point inline constants cannot be used with *16-bit integer* operands. \
-             Assembler will attempt to encode these values as literals.
-
-.. WARNING:: GFX7 does not support inline constants for *f16* operands.
+             Assembler encodes these values as literals.
 
 .. _amdgpu_synid_ival:
 
@@ -660,42 +646,45 @@ ival
 A symbolic operand encoded as an *inline constant*.
 These operands provide read-only access to H/W registers.
 
-    ======================== ================================================ =============
-    Syntax                   Note                                             Availability
-    ======================== ================================================ =============
-    shared_base              Base address of shared memory region.            GFX9, GFX10
-    shared_limit             Address of the end of shared memory region.      GFX9, GFX10
-    private_base             Base address of private memory region.           GFX9, GFX10
-    private_limit            Address of the end of private memory region.     GFX9, GFX10
-    pops_exiting_wave_id     A dedicated counter for POPS.                    GFX9, GFX10
-    ======================== ================================================ =============
+    ===================== ========================= ================================================ =============
+    Syntax                Alternative Syntax (SP3)  Note                                             Availability
+    ===================== ========================= ================================================ =============
+    shared_base           src_shared_base           Base address of shared memory region.            GFX9+
+    shared_limit          src_shared_limit          Address of the end of shared memory region.      GFX9+
+    private_base          src_private_base          Base address of private memory region.           GFX9+
+    private_limit         src_private_limit         Address of the end of private memory region.     GFX9+
+    pops_exiting_wave_id  src_pops_exiting_wave_id  A dedicated counter for POPS.                    GFX9, GFX10
+    ===================== ========================= ================================================ =============
 
 .. _amdgpu_synid_literal:
 
 literal
 -------
 
-A *literal* is a 64-bit value encoded as a separate 32-bit dword in the instruction stream.
-Compare *literals* with :ref:`inline constants<amdgpu_synid_constant>`.
+A *literal* is a 64-bit value encoded as a separate
+32-bit dword in the instruction stream. Compare *literals*
+with :ref:`inline constants<amdgpu_synid_constant>`.
 
 If a number may be encoded as either
 a :ref:`literal<amdgpu_synid_literal>` or
 an :ref:`inline constant<amdgpu_synid_constant>`,
 assembler selects the latter encoding as more efficient.
 
-Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`,
+Literals may be specified as
+:ref:`integer numbers<amdgpu_synid_integer_number>`,
 :ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
 :ref:`absolute expressions<amdgpu_synid_absolute_expression>` or
 :ref:`relocatable expressions<amdgpu_synid_relocatable_expression>`.
 
-An instruction may use only one literal but several operands may refer the same literal.
+An instruction may use only one literal,
+but several operands may refer to the same literal.
 
 .. _amdgpu_synid_uimm8:
 
 uimm8
 -----
 
-A 8-bit :ref:`integer number<amdgpu_synid_integer_number>`
+An 8-bit :ref:`integer number<amdgpu_synid_integer_number>`
 or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 The value must be in the range 0..0xFF.
 
@@ -756,7 +745,8 @@ Integer numbers are 64 bits wide.
 They are converted to :ref:`expected operand type<amdgpu_syn_instruction_type>`
 as described :ref:`here<amdgpu_synid_int_conv>`.
 
-Integer numbers may be specified in binary, octal, hexadecimal and decimal formats:
+Integer numbers may be specified in binary, octal,
+hexadecimal and decimal formats:
 
     ============ =============================== ========
     Format       Syntax                          Example
@@ -829,22 +819,23 @@ Relocatable Expressions
 
 The value of a relocatable expression depends on program relocation.
 
-Note that use of relocatable expressions is limited with branch targets
+Note that use of relocatable expressions is limited to branch targets
 and 32-bit integer operands.
 
-A relocatable expression is evaluated to a 64-bit integer value
-which depends on operand kind and :ref:`relocation type<amdgpu-relocation-records>`
-of symbol(s) used in the expression. For example, if an instruction refers a label,
-this reference is evaluated to an offset from the address after the instruction
-to the label address:
+A relocatable expression is evaluated to a 64-bit integer value,
+which depends on operand kind and
+:ref:`relocation type<amdgpu-relocation-records>` of symbol(s)
+used in the expression. For example, if an instruction refers to a label,
+this reference is evaluated to an offset from the address after
+the instruction to the label address:
 
 .. parsed-literal::
 
     label:
     v_add_co_u32_e32 v0, vcc, label, v1  // 'label' operand is evaluated to -4
 
-Note that values of relocatable expressions are usually unknown at assembly time;
-they are resolved later by a linker and converted to
+Note that values of relocatable expressions are usually unknown
+at assembly time; they are resolved later by a linker and converted to
 :ref:`expected operand type<amdgpu_syn_instruction_type>`
 as described :ref:`here<amdgpu_synid_rl_conv>`.
 
@@ -855,9 +846,11 @@ Expressions are composed of 64-bit integer operands and operations.
 Operands include :ref:`integer numbers<amdgpu_synid_integer_number>`
 and :ref:`symbols<amdgpu_synid_symbol>`.
 
-Expressions may also use "." which is a reference to the current PC (program counter).
+Expressions may also use "." which is a reference
+to the current PC (program counter).
 
-:ref:`Unary<amdgpu_synid_expression_un_op>` and :ref:`binary<amdgpu_synid_expression_bin_op>`
+:ref:`Unary<amdgpu_synid_expression_un_op>` and
+:ref:`binary<amdgpu_synid_expression_bin_op>`
 operations produce 64-bit integer results.
 
 Syntax of Expressions
@@ -988,20 +981,25 @@ is used for an operand which has a different type or size.
 Conversion of Integer Values
 ----------------------------
 
-Instruction operands may be specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>` or
-:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. These values are converted to
-the :ref:`expected operand type<amdgpu_syn_instruction_type>` using the following steps:
+Instruction operands may be specified as 64-bit
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+These values are converted to the
+:ref:`expected operand type<amdgpu_syn_instruction_type>`
+using the following steps:
 
-1. *Validation*. Assembler checks if the input value may be truncated without loss to the required *truncation width*
-(see the table below). There are two cases when this operation is enabled:
+1. *Validation*. Assembler checks if the input value may be truncated
+without loss to the required *truncation width* (see the table below).
+There are two cases when this operation is enabled:
 
     * The truncated bits are all 0.
     * The truncated bits are all 1 and the value after truncation has its MSB bit set.
 
-In all other cases assembler triggers an error.
+In all other cases, the assembler triggers an error.
 
-2. *Conversion*. The input value is converted to the expected type as described in the table below.
-Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W (or both).
+2. *Conversion*. The input value is converted to the expected type
+as described in the table below. Depending on operand kind, this conversion
+is performed by either assembler or AMDGPU H/W (or both).
 
     ============== ================= =============== ====================================================================
     Expected type  Truncation Width  Conversion      Description
@@ -1055,21 +1053,26 @@ Examples of disabled conversions:
 Conversion of Floating-Point Values
 -----------------------------------
 
-Instruction operands may be specified as 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.
-These values are converted to the :ref:`expected operand type<amdgpu_syn_instruction_type>` using the following steps:
+Instruction operands may be specified as 64-bit
+:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.
+These values are converted to the
+:ref:`expected operand type<amdgpu_syn_instruction_type>`
+using the following steps:
 
 1. *Validation*. Assembler checks if the input f64 number can be converted
-to the *required floating-point type* (see the table below) without overflow or underflow.
-Precision lost is allowed. If this conversion is not possible, assembler triggers an error.
+to the *required floating-point type* (see the table below) without overflow
+or underflow. Precision lost is allowed. If this conversion is not possible,
+the assembler triggers an error.
 
-2. *Conversion*. The input value is converted to the expected type as described in the table below.
-Depending on operand kind, this is performed by either assembler or AMDGPU H/W (or both).
+2. *Conversion*. The input value is converted to the expected type
+as described in the table below. Depending on operand kind, this is
+performed by either assembler or AMDGPU H/W (or both).
 
     ============== ================ ================= =================================================================
     Expected type  Required FP Type Conversion        Description
     ============== ================ ================= =================================================================
     i16, u16, b16  f16              f16(num)          Convert to f16 and use bits of the result as an integer value.
-                                                      The value has to be encoded as a literal or an error occurs.
+                                                      The value has to be encoded as a literal, or an error occurs.
                                                       Note that the value cannot be encoded as an inline constant.
     i32, u32, b32  f32              f32(num)          Convert to f32 and use bits of the result as an integer value.
     i64, u64, b64  \-               \-                Conversion disabled.
@@ -1122,8 +1125,9 @@ When the value of a relocatable expression is resolved by a linker, it is
 converted as needed and truncated to the operand size. The conversion depends
 on :ref:`relocation type<amdgpu-relocation-records>` and operand kind.
 
-For example, when a 32-bit operand of an instruction refers a relocatable expression *expr*,
-this reference is evaluated to a 64-bit offset from the address after the
+For example, when a 32-bit operand of an instruction refers
+to a relocatable expression *expr*, this reference is evaluated
+to a 64-bit offset from the address after the
 instruction to the address being referenced, *counted in bytes*.
 Then the value is truncated to 32 bits and encoded as a literal:
 
@@ -1133,7 +1137,7 @@ Then the value is truncated to 32 bits and encoded as a literal:
     v_add_co_u32_e32 v0, vcc, expr, v1  // 'expr' operand is evaluated to -4
                                         // and then truncated to 0xFFFFFFFC
 
-As another example, when a branch instruction refers a label,
+As another example, when a branch instruction refers to a label,
 this reference is evaluated to an offset from the address after the
 instruction to the label address, *counted in dwords*.
 Then the value is truncated to 16 bits:
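
Reader's note (not part of the commit): two rules touched by this diff can be
sketched in a few lines of Python. This is only an illustration under stated
assumptions, not the assembler's implementation. The helper names
is_valid_vgpr_tuple, truncates_without_loss and truncate are hypothetical, the
input is assumed to be a 64-bit value as the documentation states, and the
GFX90A/GFX940 alignment requirement is deliberately left out.

```python
# Illustrative sketch only; these helpers are hypothetical, not LLVM APIs.

MASK64 = (1 << 64) - 1


def is_valid_vgpr_tuple(n: int, k: int) -> bool:
    """Check v[N:K] against the updated size rule:
    N <= K, both in 0..255, and K-N+1 in 1..12 or equal to 16 or 32."""
    if not (0 <= n <= 255 and 0 <= k <= 255 and n <= k):
        return False
    size = k - n + 1
    return 1 <= size <= 12 or size in (16, 32)


def truncates_without_loss(value: int, width: int) -> bool:
    """Validation step for integer operands (width assumed in 1..63).

    Truncation is allowed when the dropped bits are all 0, or when they
    are all 1 and the MSB of the truncated value is set."""
    bits = value & MASK64                 # view the input as 64 bits
    high = bits >> width                  # the bits that would be dropped
    low = bits & ((1 << width) - 1)       # the bits that would be kept
    all_ones = (1 << (64 - width)) - 1
    msb_set = (low >> (width - 1)) & 1 == 1
    return high == 0 or (high == all_ones and msb_set)


def truncate(value: int, width: int) -> int:
    """Keep the low `width` bits of the value."""
    return value & ((1 << width) - 1)
```

With these definitions, v[0:8] (9 registers) passes the size check while
v[0:12] (13 registers) does not, and truncate(-4, 32) gives 0xFFFFFFFC, which
matches the 'expr' example in the hunk above.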


        

