[PATCH] D41863: [AArch64] Fix incorrect LD1 of 16-bit FP vectors in big endian

Mon Jan 15 10:57:59 PST 2018

pbarrio added a comment.

Hi Sjoerd, thanks for the review. I have attached some thoughts on your comments and I will upload a new patch soon.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:736
+    AddPromotedToType(ISD::STORE, VT, MVT::v4i16);
   } else if (VT == MVT::v2f64 || VT == MVT::v4f32 || VT == MVT::v8f16) {
     setOperationAction(ISD::LOAD, VT, Promote);
----------------
SjoerdMeijer wrote:
> How about MVT::v8f16? Does this one needs to get a similar treatment?
I could do the same for MVT::v8f16, but this would be an optimization rather than a bugfix, since LLVM generates correct code in this case:

```
        ld1     { v0.2d }, [x0]
        rev64   v0.8h, v0.8h
```

which does the correct byte reversal after the load. For the MVT::v4f16 case, LLVM currently generates the following incorrect code:

```
        ld1     { v0.2s }, [x0]
        rev64   v0.4h, v0.4h
```

I will give the optimization a try and upload an updated patch soon.

================
Comment at: lib/Target/AArch64/AArch64InstrInfo.td:5852
 def : Pat<(v4f16 (bitconvert (v4i16 FPR64:$src))),
-                             (v4f16 (REV64v4i16 FPR64:$src))>;
+                             (v4f16 FPR64:$src)>;
 def : Pat<(v4f16 (bitconvert (v8i8  FPR64:$src))),
----------------
SjoerdMeijer wrote:
> Don't think I understand why this is now a special case. Why is this one different from the other patterns here?
Earlier in the file there is an explanation about why we need to insert REVs when we do bitcasts in big endian (~line 5600). At the end, the following comment suggests that identity conversions (e.g. same-size float-to-int conversions) do not need it:

```

 // Most bitconverts require some sort of conversion. The only exceptions are:
 //   a) Identity conversions -  vNfX <-> vNiX
```

It makes sense to me that a type conversion in big endian between vectors with elements of the same size does not need a byte reversal. This reversal should have been done right before, otherwise the original vector (in this case, with type v4f16) would have also been incorrect.

https://reviews.llvm.org/D41863