[PATCH] D71074: [Aarch64][SVE] Add intrinsics for scatter stores
Andrzej Warzynski via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 12 06:42:37 PST 2019
andwar marked 6 inline comments as done.
andwar added inline comments.
================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:990
+ : Intrinsic<
+ [],
+ [
----------------
sdesmalen wrote:
> nit: The formatting for these intrinsics is odd.
Fixed - I try to use clang-format (which IMHO does a good job here), but sadly the result is inconsistent with the rest of the file.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:11851
+ unsigned Opcode) {
+ EVT SrcVT = N->getOperand(2)->getValueType(0);
+ assert(SrcVT.isScalableVector() &&
----------------
sdesmalen wrote:
> It is unclear what `N->getOperand(2)` is:
> ```EVT SrcVT = N->getOperand(2)->getValueType(0);
> const SDValue Dst = N->getOperand(2);```
That's a copy & paste leftover, sorry! I've renamed `Dst` as well as few other variables for consistency.
================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-scatter-stores-32bit-scaled-offsets.ll:157
+ double* %base,
+ <vscale x 2 x i64> %indices)
+ ret void
----------------
sdesmalen wrote:
> efriedma wrote:
> > andwar wrote:
> > > efriedma wrote:
> > > > Why are the offsets here using `<vscale x 2 x i64>`? `<vscale x 2 x i32>` seems more natural.
> > > * Although these _are_ 32-bit wide offsets, they are stored in 64-bit registers (and are implicitly sign or zero-extended)
> > > * By using `<vscale x 2 x i64>` we can rely on `LLVMScalarOrSameVectorWidth` to capture error (i.e. `incorrect argument type`)
> > > * This is consistent with the implementation for gather loads
> > >
> > > Do you see any disadvantages of using `<vscale x 2 x i64>` instead of `<vscale x 2 x i32>`?
> > The biggest advantage of using `<vscale x 2 x i32>` is that it's obvious the high bits are unused, so you don't end up with extra mask/64-bit multiply/etc. That said, we can recover this in other ways.
> >
> > I'm mostly concerned that we should try to be consistent here. We use `<vscale x 2 x i32>` for the stored value in `llvm.aarch64.sve.st1.scatter.index.nxv2i32`.
> We envisioned the type for the indices generated by Clang to be `<vscale x 2 x i64>`, since the ACLE does not have any knowledge of unpacked types (like `<vscale x 2 x i32>`). The uxtw in the intrinsic would make it clear that the indices will need to be zero-extended from from word to double-word.
>
> However, I agree this isn't very clear when reading/writing the intrinsics in IR and it should be simple to change the patch to take `<vscale x 2 x i32>` indices instead and generating a `truncate` in Clang. If the `scatterSt1Combine` then ANY_EXTENDs the data, the truncate will likely fall away.
>
> If we change this, we should be consistent and update the loads as well.
Thank you both, I'll update accordingly. I suggest that gather loads are updated in a separate patch.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71074/new/
https://reviews.llvm.org/D71074
More information about the llvm-commits
mailing list