[LLVMdev] [PATCH] Add a Scalarize pass

Fri Nov 15 09:13:38 PST 2013

Hi Richard, 

The discussion on llvmpipe is irrelevant.  llvmpipe has its own pass manager and optimization pipeline; it is not a C compiler.

Nadav 

On Nov 15, 2013, at 3:26 AM, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:

> Nadav Rotem <nrotem at apple.com> writes:
>> On Nov 14, 2013, at 2:32 PM, Richard Sandiford
>> <rsandifo at linux.vnet.ibm.com> wrote:
>>> Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
>>>> Are you worried that adding it to PMB will increase compile time?
>>>> The pass exits very early for any target that doesn't opt-in to doing
>>>> scalarisation at the IR level, without even looking at the function.
>>> 
>>> As an alternative, adding Scalarizer and InstCombine passes to
>>> SystemZPassConfig::addIRPasses() would probably give me most of the
>>> benefit without affecting the PMB.  Scalarizer itself would then not
>>> test TargetTransformInfo at all, at least in the initial version,
>>> and the scalarisation would still logically be done by codegen.
>>> Would that be OK?
>> 
>> I actually prefer that the Scalarizer would not touch TTI at all because
>> I view scalarization as a canonicalization phase for DSLs, much like SROA
>> breaks structs.
> 
> That's what Pekka is thinking of using it for, but it wasn't the reason
> I wrote it.  The original motivation was llvmpipe, which is a rasteriser
> rather than a DSL compiler.  The motivation wasn't to canonicalise,
> it was to do the same thing that codegen currently does, but in a better
> place from an optimisation perspective.
> 
> You said in an earlier message:
> 
>  Other users of LLVM (such as OpenCL JITs) do scalarize early in the
>  optimization pipeline because the problem-domain presents lots of
>  vectors that need to be legalized.
> 
> But:
> 
> (a) Scalarising and revectorising only makes sense if the vectorisation
>    is done with the target in mind.  If going from scalar code to vector
>    code can depend on the target, why shouldn't the same be true in the
>    other direction, for targets without vector support?
> 
> (b) The situation you describe isn't the one that applies to llvmpipe.
>    In llvmpipe the vectors are nice, known widths that are under the
>    driver's own control.  We certainly don't want to scalarise and
>    revectorise llvmpipe IR on x86_64, or on powerpc with Altivec/VSX.
>    The original code is already well vectorised for those targets.
>    (And also for ARM NEON I expect.)
> 
>    In the llvmpipe case, codegen's type legaliser already makes a good
>    decision about what to scalarise and what not to scalarise, without
>    any help from llvmpipe.  The problem I'm trying to solve is that
>    codegen is too late to get the benefit of other IR optimisations.
> 
>    So in my case I do not want to _change_ the decision about which
>    vectors get scalarised and how.  I just want to do it earlier.
>    It would be a shame if that meant that llvmpipe had to duplicate
>    exactly the decisions that codegen makes wrt scalarisation,
>    since codegen can easily make those decisions available through
>    TargetTransformInfo.
> 
> That's why I thought using TTI in the Scalarizer was a good thing
> in principle, at least as an option.
> 
> SystemZ is a simple case because there is no vector support.  But take MIPS
> (which is often a good example when it comes to complicated possibilities :-)).
> It has at least four separate vector extensions:
> 
>  - <2 x float> support from the MIPS V floating-point extensions,
>    carried over to MIPS 32/64.
> 
>  - <8 x i8> and <4 x i16> support from the optional MDMX extension,
>    now deprecated but used on older chips like the SB-1 and (in a
>    modified form) the VR5400.
> 
>  - Processor-specific vector extensions for the Loongson range.
> 
>  - The new MSA ASE.
> 
> That's a lot of possibilities.  Maybe the LLVM port will never support
> Loongson and MDMX (almost certain for the latter), but the point is that
> even if it did support them, the current codegen interface would make the
> right decisions about which of the llvmpipe vectors should be scalarised
> and how.
> 
> If Scalarizer is an all-or-nothing pass then it cannot make as good a
> decision for llvmpipe IR, where we don't expect to revectorise the result.
> Obviously the current pass is all-or-nothing anyway, but I tried to
> structure it so that it would be easy to make per-type decisions in
> the future, based on the TargetTransformInfo.
> 
> I realise I'm not going to convince you, and I'm going to make the
> change anyway.  I still think it's the wrong direction though.
> 
> Thanks,
> Richard
>
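
For readers following the thread, the difference between an all-or-nothing
Scalarizer and one that makes per-type decisions from a target query can be
sketched roughly as below.  This is a toy model only: VecType, ToyTargetInfo,
isVectorTypeLegal, shouldScalarizeAlways and shouldScalarizePerType are
hypothetical names invented for illustration, not the real
TargetTransformInfo interface or the code in the patch.  It also reduces the
decision to "legal or not", whereas codegen's type legaliser may split or
widen a vector instead of scalarising it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of a fixed-width vector type such as <4 x i16>.
struct VecType {
  unsigned NumElts;  // number of elements
  unsigned EltBits;  // width of each element in bits
  bool IsFloat;      // floating-point elements rather than integer
};

// Hypothetical stand-in for the target query discussed in the thread.
// This is NOT the real TargetTransformInfo API.
struct ToyTargetInfo {
  std::vector<VecType> LegalTypes;  // vector types the target handles natively

  bool isVectorTypeLegal(const VecType &VT) const {
    for (const VecType &L : LegalTypes)
      if (L.NumElts == VT.NumElts && L.EltBits == VT.EltBits &&
          L.IsFloat == VT.IsFloat)
        return true;
    return false;
  }
};

// All-or-nothing policy: every vector is scalarised, whatever the target.
inline bool shouldScalarizeAlways(const VecType &VT) {
  assert(VT.NumElts > 0 && "vector must have at least one element");
  return true;
}

// Per-type policy: scalarise only the vectors the target cannot handle.
inline bool shouldScalarizePerType(const ToyTargetInfo &TI,
                                   const VecType &VT) {
  assert(VT.NumElts > 0 && "vector must have at least one element");
  return !TI.isVectorTypeLegal(VT);
}
```

With a target that lists no legal vector types (the SystemZ situation
described above) the two policies agree.  With a target that lists only
<2 x float>, the per-type policy keeps <2 x float> as a vector and scalarises
everything else, while the all-or-nothing policy scalarises both.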