[llvm] [LLVM] Add `llvm.masked.compress` intrinsic (PR #92289)
Eli Friedman via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 27 16:59:29 PDT 2024
================
@@ -19234,6 +19234,78 @@ the follow sequence of operations:
The ``mask`` operand will apply to at least the gather and scatter operations.
+
+.. _int_masked_compress:
+
+'``llvm.masked.compress.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+LLVM provides an intrinsic for compressing data within a vector based on a selection mask.
+Semantically, this is similar to :ref:`llvm.masked.compressstore <int_compressstore>` but with weaker assumptions
+and without storing the results to memory, i.e., the data remains in the vector.
+
+Syntax:
+"""""""
+This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected
+from an input vector and placed adjacently within the result vector. A mask defines which elements to collect from the vector.
+The remaining lanes are filled with values from ``passthru``.
+
+.. code-block:: llvm
+
+ declare <8 x i32> @llvm.masked.compress.v8i32(<8 x i32> <value>, <8 x i1> <mask>, <8 x i32> <passthru>)
+ declare <16 x float> @llvm.masked.compress.v16f32(<16 x float> <value>, <16 x i1> <mask>, <16 x float> <passthru>)
+
+Overview:
+"""""""""
+
+Selects elements from input vector '``value``' according to the '``mask``'.
+All selected elements are written into adjacent lanes in the result vector, from lower to higher.
+The mask holds an entry for each vector lane, and is used to select elements to be kept.
+If ``passthru`` is undefined, the number of valid lanes is equal to the number of ``true`` entries in the mask, i.e., all lanes >= number-of-selected-values are undefined.
+If a ``passthru`` vector is given, all remaining lanes are filled with the corresponding lane's value from ``passthru``.
+The main difference to :ref:`llvm.masked.compressstore <int_compressstore>` is that we do not need to guard against memory access for unselected lanes.
----------------
efriedma-quic wrote:
If a mask element is poison, I think it's fine if the whole returned vector is poison. But preferably not undefined behavior; undefined behavior means we can't hoist, which makes everything more complicated, so I'd rather not deal with that. (I guess you could try to more narrowly state that elements after the poisoned mask bit are poison, but I don't think that's actually useful.)
If a mask element is undef... I guess we could similarly say the whole returned vector is undef? Not sure how much it matters exactly what rule we pick here.
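For illustration only, here is a minimal scalar sketch of the semantics described in the quoted hunk, combined with the poison rule suggested above (any poison mask element makes the whole result poison). It is a reading of the text, not the PR's implementation: the function name `masked_compress` and the `POISON`/`UNDEF` sentinels are hypothetical, and the undef-mask case is deliberately left out because no rule has been settled for it.

```python
# Hypothetical sentinels standing in for LLVM's poison and undef values.
POISON = object()
UNDEF = object()


def masked_compress(value, mask, passthru=None):
    """Scalar model of llvm.masked.compress as described in the quoted text.

    value, mask and passthru are equal-length lists; passthru=None models
    an undefined passthru operand.
    """
    if len(value) != len(mask):
        raise ValueError("value and mask must have the same number of lanes")
    if passthru is not None and len(passthru) != len(value):
        raise ValueError("passthru must have the same number of lanes")

    # Assumed rule from the review comment: a poison mask element poisons
    # the entire result rather than triggering undefined behavior.
    if any(m is POISON for m in mask):
        return [POISON] * len(value)

    # Selected elements are packed into adjacent lanes, lowest lane first.
    selected = [v for v, m in zip(value, mask) if m]

    # Remaining lanes take the corresponding lane of passthru, or are
    # undefined when no passthru is given.
    if passthru is None:
        tail = [UNDEF] * (len(value) - len(selected))
    else:
        tail = list(passthru[len(selected):])
    return selected + tail
```

With `value = [1, 2, 3, 4, 5, 6, 7, 8]`, `mask = [1, 0, 1, 1, 0, 0, 1, 0]` and `passthru = [10, 20, 30, 40, 50, 60, 70, 80]`, this model yields `[1, 3, 4, 7, 50, 60, 70, 80]`: four selected elements followed by lanes 4 through 7 of `passthru`.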
https://github.com/llvm/llvm-project/pull/92289