[PATCH] D74620: [ARM,MVE] Add vector-scalar intrinsics
Dave Green via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Feb 17 00:22:23 PST 2020
dmgreen added a comment.
I like how this uses a splat for all the register arguments. That sounds like a good idea.
The ones that worry me are the floating-point instructions. Last time we tried those, it actually caused performance regressions because of extra sp->gpr movs left in the loop.
If this is just the backend patterns, though, and not the sinking of splats into loops too, then I think it should be OK. On its own I don't think it will usually cause problems, and some quick tests seem to confirm that.
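For anyone following along, here is a rough portable-C model of what "vector-scalar as a splat" means. It is only a sketch: `vec4f`, `splat`, `add_vv` and `add_vs` are hypothetical names made up for illustration, not MVE intrinsics and not anything from the patch. The vector-scalar form is simply the vector-vector form applied to a broadcast of the scalar.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* hypothetical: models a 128-bit vector of four f32 lanes */

/* Hypothetical stand-in for a four-lane float vector type. */
typedef struct { float lane[LANES]; } vec4f;

/* Broadcast one scalar into every lane (the "splat"). */
static vec4f splat(float s) {
    vec4f v;
    for (size_t i = 0; i < LANES; ++i)
        v.lane[i] = s;
    return v;
}

/* Plain lane-wise vector-vector add. */
static vec4f add_vv(vec4f a, vec4f b) {
    vec4f r;
    for (size_t i = 0; i < LANES; ++i)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Vector-scalar add written as a vector-vector add of a splat. */
static vec4f add_vs(vec4f a, float s) {
    return add_vv(a, splat(s));
}

/* Loop form: the splat is built once, outside the loop body. */
static void add_scalar_to_vectors(vec4f *data, size_t n, float s) {
    assert(data != NULL);
    vec4f sv = splat(s);
    for (size_t i = 0; i < n; ++i)
        data[i] = add_vv(data[i], sv);
}
```

In this model, `add_vs(a, 0.5f)` adds 0.5 to every lane of `a`. The floating-point concern above is about where the scalar lives once a loop like `add_scalar_to_vectors` is lowered, which this C model does not capture.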
================
Comment at: clang/test/CodeGen/arm-mve-intrinsics/vaddq.c:2
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
-// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
-// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -DPOLYMORPHIC -S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
+// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -S -emit-llvm -o - %s | opt -S -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -DPOLYMORPHIC -S -emit-llvm -o - %s | opt -S -O1 | FileCheck %s
----------------
Why is this running the entire -O1 pass pipeline? These tests deliberately use a limited subset of passes so that they don't need adjusting with every mid-end LLVM change (while still not being littered with clang's verbose IR output).
I'm guessing the half args are being a pain again. Is it something to do with halfs?
================
Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:4496
+ UnpredSign)),
+ (VTI.Vec (inst (VTI.Vec MQPR:$Qm), (i32 GPR:$val)))>;
+ // Predicated version
----------------
These GPRs can use the same regclass as the instruction. rGPR in this case, I think?
================
Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:4566
+ 0b0, VTI.Unsigned>;
+ defvar unpred_op = !if(VTI.Unsigned, unpred_op_u, unpred_op_s);
+ defm : MVE_vec_scalar_int_pat_m<!cast<Instruction>(NAME), VTI,
----------------
I find all these ifs at different levels a little hard to follow. It looks OK, but is it possible to rearrange things so that they aren't needed here?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74620/new/
https://reviews.llvm.org/D74620