[PATCH] D45339: [NVPTX] Fixed vectorized LDG for f16.

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 5 16:52:22 PDT 2018


tra added inline comments.


================
Comment at: llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp:1244
+    if (EltVT == MVT::f16 && N->getValueType(0) == MVT::v2f16 &&
+        NumElts % 2 == 0) {
+      EltVT = MVT::v2f16;
----------------
bixia wrote:
> I played with this and found that the only f16 vector that can reach here is v2f16, due to the way that the legalizer is set up for the target. If we can somehow make v4f16 reach here, I believe that we will still see an error in routine GetConvertOpcode, similar to what we see for v2f16 before this fix. For this reason, I prefer an assertion, something like this:
> 
> if (EltVT == MVT::v2f16) {
>      assert(EltVT.getVectorNumElements() == 2 && "missed legalizing vector-of-f16");
> } else {
>    NumElts = EltVT.getVectorNumElements();
>     EltVT = EltVT.getVectorElementType();
> }
Please take a look at the test cases in ldg-invariants.ll in this patch. v8f16 variant also gets here. 

```
Legalizing node: t6: v8f16,ch = load<(invariant load 16 from %ir.ptr, addrspace 1)> t0, t23, undef:i64
...
Legalizing node: t24: v2f16,v2f16,v2f16,v2f16,ch = NVPTXISD::LoadV4<(invariant load 16 from %ir.ptr, addrspace 1)> t0, t23, undef:i64, Constant:i64<0>

```
Potentially v4f16 could also get here as a load of a pair of v2f16 elements. If you insist on assertion, it should accept all three cases.


https://reviews.llvm.org/D45339





More information about the llvm-commits mailing list