[PATCH] D30364: AArch64 : Add PreferCSEL feature for Exynos-M3.

Fri Mar 3 07:25:47 PST 2017

On 3 March 2017 at 08:42, James Molloy <James.Molloy at arm.com> wrote:
> Pat<>s are never used in the assembler, unless something fundamental has changed recently! What the assembler sees is controlled by Predicate<>s on the Instruction defs. The Pat<> classes only apply to the instruction selector.

Hi James,

This is solely my fault. I got confused on the constructs used for
code generation and pattern selection.

> So it is completely feasible to disable instruction selection patterns but keep the instruction selectable and disassemblable.

Point taken.

> Junmo’s first patch does this:
>
> def : Pat<(AArch64csel (i32 0), (i32 1), (i32 imm:$cc), NZCV),
>           (CSINCWr WZR, WZR, (i32 imm:$cc))>, Requires<[DontUseCSEL]>;
>
> With the Requires<> being added by Junmo. All this does is disable selection of CSINCWr from a pattern involving a CSEL. CSEL is still a valid instruction, and so is CSINC, so the assembler will parse and emit both without warning or error.

I have to say, my confusion that the fact that the description is
"Don't use CSEL" made me even more confused.

> What I would say however is that when I discussed this with Junmo earlier, he suggested it was the CSINC/CSINV instructions that were slow on Exynos-M3, *not* CSEL. That is backed up by his code change, but *not* by the naming of the feature, “DontUseCSEL”. In fact, CSINC has been disabled with the feature “DontUseCSEL”, which is confusing at best. CSEL’s patterns aren’t even disabled by this option, so it’s not well named.

Right, Junmo's suggestion to stick to HasFastCSEL / HasSlowCSEL is
much better in that respect.

We have used similar target features for target lowering, and there's
where my confusion started. Evandro has had multiple discussions with
us about that and we have found many ways to work around modelling
costs, rather than disabling code gen (which is not this case, but you
get the point).

My idea is to discuss an integrated cost model in Global ISel for that
particular kind of problem, and one that could essentially use the
scheduler's information alone with some cost computation pre/post
transformation. This is likely going to increase compile time so we
really only want to do that on the parts that can have sub-arch
differences, not for everything, but having the infrastructure would
be a good thing overall, I think. I'll put this topic on discussion at
EuroLLVM, but for now, we're stuck with SelDAG, so let's just get on
with it.

cheers,
--renato