[PATCH] D26743: Expandload and Compressing store - documentation update
Ayal Zaks via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 16 23:26:23 PST 2016
Ayal added inline comments.
================
Comment at: docs/LangRef.rst:11857
+
+LLVM provides intrinsics for expanding load and compressing store operations. Compressing store is designated to select single elements from data vector and store them in a dense form. The selection is done using mask operand, which holds one bit per vector element. The number of stored elements is equal to the number of '1' bits in the mask. Expanding load performs the opposite operation - reads a number of sequential scalar elements from memory and spreads them in a vector according to the mask. The total number and position of active bits inside the mask vector state the number of loaded elements and their disposition in the result vector.
+
----------------
Data selected from a vector according to a mask is stored in consecutive memory addresses (compressing store), and vice versa (expanding load). These operations effectively map to the "if (cond) a[i++] = v.i" and "if (cond) v.i = a[i++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to llvm.masked.store and llvm.masked.load [link to them].
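To make the two patterns concrete, here is a minimal scalar sketch in C. It is not the intrinsic's implementation; the function names compress_store and expand_load, the fixed double element type, and the bool-array mask are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of a compressing store:
 * "if (cond) a[i++] = v.i" applied to every lane.
 * Returns the number of elements written to a. */
size_t compress_store(double *a, const double *v, const bool *mask,
                      size_t lanes)
{
    size_t i = 0;
    for (size_t lane = 0; lane < lanes; ++lane)
        if (mask[lane])
            a[i++] = v[lane];
    return i;
}

/* Hypothetical scalar model of an expanding load:
 * "if (cond) v.i = a[i++]" applied to every lane.
 * Masked-off lanes take the corresponding passthru element.
 * Returns the number of elements read from a. */
size_t expand_load(double *v, const double *a, const bool *mask,
                   const double *passthru, size_t lanes)
{
    size_t i = 0;
    for (size_t lane = 0; lane < lanes; ++lane)
        v[lane] = mask[lane] ? a[i++] : passthru[lane];
    return i;
}
```

In this model, a mask of leading '1' bits followed by '0' bits makes the memory index equal to the lane index for every active lane, which is why that case coincides with an ordinary masked load or store.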
================
Comment at: docs/LangRef.rst:11866
+"""""""
+This is an overloaded intrinsic. The loaded data is a number of scalar values of any integer, floating point or pointer data type loaded together and spread according to the mask into one vector.
+
----------------
"The loaded data is a number of scalar values of any integer," >> "Several scalar values of integer,"
"loaded together" >> "are loaded from consecutive memory addresses"
"spread according to the mask into one vector" >> "stored into the elements of a vector according to the mask"
================
Comment at: docs/LangRef.rst:11876
+
+Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spread them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. If the mask vector is '10010001', the "explandload" reads 3 values from memory and position them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
+
----------------
spread >> spreads
If the mask >> E.g., if the mask
the "explandload" >> "expandload"
from memory >> from memory addresses ptr, ptr+1, ptr+2
"position" >> "positions" (or "places")
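The lane-to-address mapping in the '10010001' example can be checked with a small C sketch. The helper name expand_offsets and the use of -1 to mark a masked-off lane are assumptions for illustration, not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: for each lane, record the element offset from ptr
 * that an expanding load would read, or -1 if the lane is masked off
 * (and would therefore come from passthru).
 * Returns the total number of elements read from memory. */
size_t expand_offsets(const bool *mask, size_t lanes, int *offset)
{
    size_t next = 0;
    for (size_t lane = 0; lane < lanes; ++lane)
        offset[lane] = mask[lane] ? (int)next++ : -1;
    return next;
}
```

Reading the mask left to right as lanes 0 through 7, the three active lanes 0, 3 and 7 receive offsets 0, 1 and 2, i.e. addresses ptr, ptr+1 and ptr+2.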
================
Comment at: docs/LangRef.rst:11902
+ ; Load N elements from array B and expand them in a vector.
+ ; N is equal to the number of 'true' elements in the mask.
+ %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %mask, <8 x double> undef)
----------------
N?
================
Comment at: docs/LangRef.rst:11905
+ ; Store the result in A
+ call <void> @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double> %Aptr, i32 8, <8 x i1> %mask)
+
----------------
%Aptr should have pointer type, i.e. "<8 x double>* %Aptr"
================
Comment at: docs/LangRef.rst:11908
+
+Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operation and shuffles.
+If all mask elements are 'true', the intrinsic behavior is equivalent to the regular vector load.
----------------
load operation >> load operations
================
Comment at: docs/LangRef.rst:11909
+Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operation and shuffles.
+If all mask elements are 'true', the intrinsic behavior is equivalent to the regular vector load.
+
----------------
regular >> regular unmasked
Repository:
rL LLVM
https://reviews.llvm.org/D26743