[PATCH] D26743: Expandload and Compressing store - documentation update
Michael Kuperstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 7 10:50:37 PST 2016
mkuper added reviewers: hfinkel, anemet.
mkuper added a comment.
The way I understand it, the mailing list discussion ended with "let's discuss this at the BoF", and the decision post-BoF was to have a working group to decide on idiom representation, etc.
Having said that, I'm ok with this going in, since I don't see a sane way to represent this in IR that's selectable in DAG.
But for the signatures, I don't have enough non-X86 context.
Hal, Adam, does this seem sensible to you too?
================
Comment at: ../docs/LangRef.rst:11861
+
+'``llvm.masked.expandload.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
----------------
Bikeshedding: would "llvm.masked.load.expand.*" and "llvm.masked.store.compress.*" make more or less sense than the current names?
================
Comment at: ../docs/LangRef.rst:11882
+
+The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand are the same vector types.
+
----------------
"are the same vector types" -> "have the same vector type"?
================
Comment at: ../docs/LangRef.rst:11887
+
+The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes in a single IR operation. It is useful for targets that support vector expanding loads and allows vectorizing loop with cross-iteration dependency like in the following example:
+
----------------
"in a single IR operation" is redundant.
================
Comment at: ../docs/LangRef.rst:11901
+
+ ; Load several elements from array B and expand them in a vector.
+ ; The number of loaded elements is equal to the number of '1' elements in the mask.
----------------
This example is slightly confusing to me, because it's not clear "what happens to Bptr" - we need to advance it to the next iteration by adding the popcount of %mask, right?
Do we have a good way to represent that right now? It seems like "llvm.ctpop" for a <k x i1> type does the wrong thing (it's basically a nop).
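To make the question concrete, here is a scalar reference model of what the example seems to imply. It is only a sketch: the function name expand_load_model and its signature are hypothetical and not part of the patch, and the element type is assumed to be double. The model returns the number of elements it consumed, which equals the popcount of the mask and is the amount by which Bptr would have to advance before the next iteration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of an expanding load (not LLVM API).
// Reads consecutive elements from `src` into the lanes of `out` whose mask
// bit is set; masked-off lanes take the corresponding `passthru` value.
// Returns how many elements were read, i.e. the popcount of the mask.
template <std::size_t N>
std::size_t expand_load_model(const double* src,
                              const std::array<bool, N>& mask,
                              const std::array<double, N>& passthru,
                              std::array<double, N>& out) {
  assert(src != nullptr);
  std::size_t consumed = 0;
  for (std::size_t lane = 0; lane < N; ++lane) {
    if (mask[lane]) {
      out[lane] = src[consumed];  // next adjacent element in memory
      ++consumed;
    } else {
      out[lane] = passthru[lane]; // lane is masked off
    }
  }
  return consumed;
}
```

In a loop, the caller would do something like "Bptr += expand_load_model(Bptr, mask, passthru, out);" on each iteration, which is the step the IR example leaves implicit.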
Repository:
rL LLVM
https://reviews.llvm.org/D26743