[PATCH] D106549: [AArch64][SVE] Combine bitcasts to predicate types with vector inserts of loads

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 29 05:49:16 PDT 2021


paulwalker-arm added inline comments.
Herald added a subscriber: ctetreau.


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:283
+// scalable stores as late as possible
+bool SVEIntrinsicOpts::optimizePredicateVectorExtract(
+    SmallSetVector<Function *, 4> &Functions) {
----------------
Is this name representative?  I think `optimizePredicateStore` is more in keeping with what is going on.


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:292
+
+    unsigned MinSVEVectorSize, MaxSVEVectorSize;
+    std::tie(MinSVEVectorSize, MaxSVEVectorSize) = Attr.getVScaleRangeArgs();
----------------
These are really MinVScale/MaxVScale.


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:294
+    std::tie(MinSVEVectorSize, MaxSVEVectorSize) = Attr.getVScaleRangeArgs();
+    if (MinSVEVectorSize != MaxSVEVectorSize)
+      continue;
----------------
I think this needs `|| MinSVEVectorSize == 0` to cover the vscale_range(0,0) case.  Perhaps worth adding a comment along the lines of "The transform needs to know the exact runtime length of scalable vectors".

To bring this point home perhaps it's worth introducing `unsigned SVEVectorSize = MinVScale*128;`, but I'll leave you to decide if it's worth it.


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:312-313
+            dyn_cast<FixedVectorType>(Store->getOperand(0)->getType());
+        if (!FixedStoreType ||
+            FixedStoreType->getPrimitiveSizeInBits() != 16 * MinSVEVectorSize)
+          continue;
----------------
I've got a feeling this needs to be more specific, otherwise we'll introduce endianness issues. Specifically, we know the predicate load/store instructions are byte based, so we should only allow the transform when the fixed-length load/store instructions are also byte based.

Given this is all very specific I imagine you'll end up with something like `if (Store->getOperand(0)->getType() == PrecalculatedTy)`


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:320
+            IntrI->getIntrinsicID() != Intrinsic::experimental_vector_extract ||
+            !IntrI->hasOneUse())
+          continue;
----------------
Is this restriction strictly necessary? I'm thinking we might have multiple stores of the same value, so unless there is a real effect on code quality I'd rather not artificially restrict the transform.


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:324-325
+        // ..that is extracting from index 0..
+        auto Idx = cast<ConstantInt>(IntrI->getOperand(1))->getZExtValue();
+        if (Idx != 0)
+          continue;
----------------
`cast<ConstantInt>(IntrI->getOperand(1))->isZero()`?


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:356
+        IntrI->eraseFromParent();
+        BitCast->eraseFromParent();
+      }
----------------
I don't see any test ensuring the old bitcast is dead. When combined with my previous comment you'll likely want:
```
if (IntrI->getNumUses() == 0)
  IntrI->eraseFromParent();
if (BitCast->getNumUses() == 0)
  BitCast->eraseFromParent();
```


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:366
+// scalable loads as late as possible
+bool SVEIntrinsicOpts::optimizePredicateVectorInsert(
+    SmallSetVector<Function *, 4> &Functions) {
----------------
I haven't looked, but I imagine this'll have similar issues to those raised for optimizePredicateVectorExtract.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106549/new/

https://reviews.llvm.org/D106549



More information about the llvm-commits mailing list