[PATCH] ARM: do not emit lshr/ashr for NEON shifts

Tim Northover t.p.northover at gmail.com
Thu Oct 3 09:16:42 PDT 2013


Hi Amaury,

> That's something I considered, but then how can we deal with VSRA?
> With this solution, we can match `add (vshr ...)` for VSRA. It seems awkward to add patterns to match the icmp/0 form

At matching time, it appears to reduce to a single "VCLTZ" node, so
we'd be matching "(add (NEONvcltz ...), ...)" which is structurally
pretty much identical to the "(add (vshr ...), ...)", though I agree
it's rather less obvious why we'd want to do that. Which is where
comments would come in.

On the other hand, it probably is the best instruction to use even
when we get a real icmp/sext/add triplet (for whatever bizarre
reason).
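
To make the equivalence concrete, here is a rough one-lane sketch in
Python. It assumes the case under discussion is an arithmetic shift
right by the element width minus one (31 for a 32-bit lane), which is
the form that collapses to a compare against zero. The helper names
are made up for illustration and are not LLVM or NEON APIs.

```python
# Illustrative model of a single signed 32-bit NEON lane.
# All function names here are hypothetical, not LLVM or NEON APIs.
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def ashr(x, n):
    """Arithmetic shift right of one lane (models IR 'ashr')."""
    return to_signed(x) >> n


def sext_icmp_slt_zero(x):
    """Models 'sext (icmp slt x, 0)': -1 (all ones) if x < 0, else 0."""
    return -1 if to_signed(x) < 0 else 0


def vsra(acc, x, n):
    """Shift right and accumulate: acc + (x >> n), wrapped to 32 bits."""
    return to_signed(acc + ashr(x, n))


def add_cltz(acc, x):
    """The '(add (NEONvcltz ...), ...)' shape on one lane."""
    return to_signed(acc + sext_icmp_slt_zero(x))


if __name__ == "__main__":
    for x in (0, 1, -1, 12345, -12345, 0x7FFFFFFF, -0x80000000):
        # Shifting by 31 leaves only copies of the sign bit.
        assert ashr(x, BITS - 1) == sext_icmp_slt_zero(x)
        assert vsra(100, x, BITS - 1) == add_cltz(100, x)
```

For any other shift amount the two forms differ, so a pattern on the
compare form would only ever stand in for the shift-by-31 case.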

The problem with the vshifts/vshiftu solution is that we're trying to move
LLVM in the opposite direction where possible. Those intrinsics are
completely ignored by the mid-end optimisation passes because they
have no idea of the intended semantics.

> and I suppose we still want a VSRA instruction to be emitted instead of a comparison and a vadd, don't we?

Probably. Though that's a bit of an orthogonal question since the
answer depends on which is faster.

> Of course we can always add an intrinsic for VSRA/VRSRA (which was already emitting vshift instead of {a,l}shr btw).

The rounding ones really are difficult to model in normal LLVM IR. The
one I noticed was vshrn_n, which really could/should be a shift and a
truncate.
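
As a sketch of what "a shift and a truncate" means for the 32-bit to
16-bit case, the following Python models one lane. The function names
are hypothetical, and the 1..16 range for the shift amount is my
understanding of the narrowing shift's immediate for this element size.

```python
# One-lane model of a narrowing shift right (32-bit in, 16-bit out).
# Function names are hypothetical and exist only for this sketch.

def to_signed(x, bits):
    """Interpret the low 'bits' bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def shift_then_trunc(x, n, arithmetic=True):
    """Shift a 32-bit lane right by n, then keep the low 16 bits."""
    if not 1 <= n <= 16:
        raise ValueError("shift amount assumed to be in 1..16")
    if arithmetic:
        shifted = to_signed(x, 32) >> n      # models 'ashr'
    else:
        shifted = (x & 0xFFFFFFFF) >> n      # models 'lshr'
    return to_signed(shifted, 16)            # models 'trunc'


if __name__ == "__main__":
    print(hex(shift_then_trunc(0x12345678, 16)))
    # With n <= 16 the kept bits never include shifted-in bits, so the
    # arithmetic and logical forms produce the same narrowed result.
    for x in (0x12345678, -1, -0x80000000, 0x0000FFFF):
        for n in range(1, 17):
            assert shift_then_trunc(x, n, True) == shift_then_trunc(x, n, False)
```

Since the truncate discards every bit the shift could have filled in,
either shift kind would do in the IR for this particular operation.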

Cheers.

Tim.


