[llvm] e806370 - [LangRef] Update the semantic of `experimental.get.vector.length` (#104475)

Tue Aug 27 09:38:11 PDT 2024

Author: Min-Yih Hsu
Date: 2024-08-27T09:38:07-07:00
New Revision: e8063702cfbbf39f0b92283d0588dee264b5eb2b

URL: https://github.com/llvm/llvm-project/commit/e8063702cfbbf39f0b92283d0588dee264b5eb2b
DIFF: https://github.com/llvm/llvm-project/commit/e8063702cfbbf39f0b92283d0588dee264b5eb2b.diff

LOG: [LangRef] Update the semantic of `experimental.get.vector.length` (#104475)

The previous semantics of `llvm.experimental.get.vector.length` was too
permissive such that it gave optimizers a hard time on anything related
to the number of iterations of VP-vectorized loops.

This patch tries to address this by assigning it a set of stricter
semantics similar to that of RVV's VSETVLI instructions, while being not
too RISC-V specific and leaving room for other (future) targets.

---------

Co-authored-by: Craig Topper <craig.topper at sifive.com>

Added: 
    

Modified: 
    llvm/docs/LangRef.rst

Removed: 
    


################################################################################
diff  --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 4887ae6ee99668..3581d1421febfd 100644

--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -19641,7 +19641,7 @@ vectorization factor should be multiplied by vscale.
 Semantics:
 """"""""""
 
-Returns a positive i32 value (explicit vector length) that is unknown at compile
+Returns a non-negative i32 value (explicit vector length) that is unknown at compile
 time and depends on the hardware specification.
 If the result value does not fit in the result type, then the result is
 a :ref:`poison value <poisonvalues>`.
@@ -19651,13 +19651,23 @@ in order to get the number of elements to process on each loop iteration. The
 result should be used to decrease the count for the next iteration until the
 count reaches zero.
 
-If the count is larger than the number of lanes in the type described by the
-last 2 arguments, this intrinsic may return a value less than the number of
-lanes implied by the type. The result will be at least as large as the result
-will be on any later loop iteration.
+Let ``%max_lanes`` be the number of lanes in the type described by ``%vf`` and
+``%scalable``, here are the constraints on the returned value:
 
-This intrinsic will only return 0 if the input count is also 0. A non-zero input
-count will produce a non-zero result.
+-  If ``%cnt`` equals to 0, returns 0.
+-  The returned value is always less than or equal to ``%max_lanes``.
+-  The returned value is always greater than or equal to ``ceil(%cnt / ceil(%cnt / %max_lanes))``,
+   if ``%cnt`` is non-zero.
+-  The returned values are monotonically non-increasing in each loop iteration. That is,
+   the returned value of an iteration is at least as large as that of any later
+   iteration.
+
+Note that it has the following implications:
+
+-  For a loop that uses this intrinsic, the number of iterations is equal to
+   ``ceil(%C / %max_lanes)`` where ``%C`` is the initial ``%cnt`` value.
+-  If ``%cnt`` is non-zero, the return value is non-zero as well.
+-  If ``%cnt`` is less than or equal to ``%max_lanes``, the return value is equal to ``%cnt``.
 
 '``llvm.experimental.vector.partial.reduce.add.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^