[PATCH] D56118: [ARM]: Add optimized NEON uint64x2_t multiply routine.
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Dec 29 13:27:50 PST 2018
efriedma added a comment.
Please upload patches with full context (see https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface).
================
Comment at: lib/Target/ARM/ARMISelLowering.cpp:7465
+ KnownBits topKnown = DAG.computeKnownBits(top);
+ KnownBits botKnown = DAG.computeKnownBits(bot);
+
----------------
Alternatively, we could try to handle simplifications by adding a DAGCombine for VSHRu and VMULLu nodes with constant operands, instead of trying to avoid emitting them. The end result is roughly the same (it's probably unlikely to construct a trivial VSHRu or VMULLu any other way), but it might be easier to understand. But I guess it would get tricky if we tried to emit the form using "vrev64". Probably okay as-is.
Needs more test coverage for various cases where the upper/lower half an operand is zero.
================
Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:528
+ { ISD::UMULO, MVT::v2i64, 11},
+ { ISD::SMULO, MVT::v2i64, 11},
};
----------------
Did you really mean UMULO (llvm.umul.with.overflow)? I mean, I guess it's theoretically possible to implement, but I think we'll just scalarize it at the moment.
(The cost model changes need test coverage in test/Analysis/CostModel/ARM/ .)
================
Comment at: test/CodeGen/ARM/vmul.ll:69
+;CHECK: vld1.64 {d20, d21}, [r0]
+;CHECK; vand q8, q10, q8
+;CHECK: vmovn.i64 d18, q9
----------------
DAGCombine should be able to catch the redundant AND... but it looks like DAGCombiner::visitTRUNCATE doesn't try to handle demanded bits for vectors. (I guess it didn't get updated when other operations got support for vector operands?)
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56118/new/
https://reviews.llvm.org/D56118
More information about the llvm-commits
mailing list