[LLVMdev] [PATCH] Add a Scalarize pass
Richard Sandiford
rsandifo at linux.vnet.ibm.com
Fri Nov 15 09:18:14 PST 2013
Nadav Rotem <nrotem at apple.com> writes:
> The discussion on llvmpipe is irrelevant. llvmpipe has its own pass
> manager and optimization pipe, it is not a C compiler.
Note that this reply was about whether TargetTransformInfo should be
used in Scalarizer, not whether Scalarizer should be in PMB. I was
trying to explain why I thought that not testing TargetTransformInfo in
Scalarizer would make the pass less useful for llvmpipe's optimisation pipe.
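
As a rough illustration of the difference between an all-or-nothing pass
and a per-type decision driven by target information, here is a minimal,
self-contained sketch. Everything in it is hypothetical: VecType,
MockTarget, Mode and shouldScalarize are made-up names for this example,
not LLVM APIs, and MockTarget only stands in for the kind of query that
TargetTransformInfo could answer. It also ignores the splitting and
widening that the real type legaliser can do.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>

// Hypothetical stand-in for a vector type such as <4 x float>.
struct VecType {
  std::string elem;  // element type name, e.g. "float" or "i8"
  unsigned count;    // number of elements

  // Ordering so that VecType can be stored in a std::set.
  bool operator<(const VecType &other) const {
    return std::tie(elem, count) < std::tie(other.elem, other.count);
  }
};

// Hypothetical stand-in for what a target would expose about its
// vector support. An empty set models a target with no vector support.
struct MockTarget {
  std::set<VecType> legalVectors;

  bool isLegalVector(const VecType &type) const {
    return legalVectors.count(type) != 0;
  }
};

// AllOrNothing: scalarise every vector, regardless of the target.
// PerType: scalarise only the vectors the target cannot handle.
enum class Mode { AllOrNothing, PerType };

bool shouldScalarize(const VecType &type, const MockTarget &target,
                     Mode mode) {
  assert(type.count > 0 && "vector types have at least one element");
  if (mode == Mode::AllOrNothing)
    return true;
  return !target.isLegalVector(type);
}
```

With a target whose only legal vector type is <2 x float>, PerType mode
keeps <2 x float> as a vector and scalarises <8 x i8>, whereas
AllOrNothing mode scalarises both.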
Thanks,
Richard
> On Nov 15, 2013, at 3:26 AM, Richard Sandiford
> <rsandifo at linux.vnet.ibm.com> wrote:
>
>> Nadav Rotem <nrotem at apple.com> writes:
>>> On Nov 14, 2013, at 2:32 PM, Richard Sandiford
>>> <rsandifo at linux.vnet.ibm.com> wrote:
>>>> Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
>>>>> Are you worried that adding it to PMB will increase compile time?
>>>>> The pass exits very early for any target that doesn't opt-in to doing
>>>>> scalarisation at the IR level, without even looking at the function.
>>>>
>>>> As an alternative, adding Scalarizer and InstCombine passes to
>>>> SystemZPassConfig::addIRPasses() would probably give me most of the
>>>> benefit without affecting the PMB. Scalarizer itself would then not
>>>> test TargetTransformInfo at all, at least in the initial version,
>>>> and the scalarisation would still logically be done by codegen.
>>>> Would that be OK?
>>>
>>> I actually prefer that the Scalarizer would not touch TTI at all because
>>> I view scalarization as a canonicalization phase for DSLs, much like SROA
>>> breaks structs.
>>
>> That's what Pekka is thinking of using it for, but it wasn't the reason
>> I wrote it. The original motivation was llvmpipe, which is a rasteriser
>> rather than a DSL compiler. The motivation wasn't to canonicalise,
>> it was to do the same thing that codegen currently does, but in a better
>> place from an optimisation perspective.
>>
>> You said in an earlier message:
>>
>>    Other users of LLVM (such as OpenCL JITs) do scalarize early in the
>>    optimization pipeline because the problem-domain presents lots of
>>    vectors that needs to be legalized.
>>
>> But:
>>
>> (a) Scalarising and revectorising only makes sense if the vectorisation
>> is done with the target in mind. If going from scalar code to vector
>> code can depend on the target, why shouldn't the same be true in the
>> other direction, for targets without vector support?
>>
>> (b) The situation you describe isn't the one that applies to llvmpipe.
>> In llvmpipe the vectors are nice, known widths that are under the
>> driver's own control. We certainly don't want to scalarise and
>> revectorise llvmpipe IR on x86_64, or on powerpc with Altivec/VSX.
>> The original code is already well vectorised for those targets.
>> (And also for ARM NEON I expect.)
>>
>> In the llvmpipe case, codegen's type legaliser already makes a good
>> decision about what to scalarise and what not to scalarise, without
>> any help from llvmpipe. The problem I'm trying to solve is that
>> codegen is too late to get the benefit of other IR optimisations.
>>
>> So in my case I do not want to _change_ the decision about which
>> vectors get scalarised and how. I just want to do it earlier.
>> It would be a shame if that meant that llvmpipe had to duplicate
>> exactly the decisions that codegen makes wrt scalarisation,
>> since codegen can easily make those decisions available through
>> TargetTransformInfo.
>>
>> That's why I thought using TTI in the Scalarizer was a good thing
>> in principle, at least as an option.
>>
>> SystemZ is a simple case because there is no vector support. But take MIPS
>> (which is often a good example when it comes to complicated possibilities :-)).
>> It has at least four separate vector extensions:
>>
>> - <2 x float> support from the MIPS V floating-point extensions,
>>   carried over to MIPS 32/64.
>>
>> - <8 x i8> and <4 x i16> support from the optional MDMX extension,
>>   now deprecated but used on older chips like the SB-1 and (in a
>>   modified form) the VR5400.
>>
>> - Processor-specific vector extensions for the Loongson range.
>>
>> - The new MSA ASE.
>>
>> That's a lot of possibilities. Maybe the LLVM port will never support
>> Loongson and MDMX (almost certain for the latter), but the point is that
>> even if it did support them, the current codegen interface would make the
>> right decisions about which of the llvmpipe vectors should be scalarised
>> and how.
>>
>> If Scalarizer is an all-or-nothing pass then it cannot make as good a
>> decision for llvmpipe IR, where we don't expect to revectorise the result.
>> Obviously the current pass is all-or-nothing anyway, but I tried to
>> structure it so that it would be easy to make per-type decisions in
>> the future, based on the TargetTransformInfo.
>>
>> I realise I'm not going to convince you, and I'm going to make the
>> change anyway. I still think it's the wrong direction though.
>>
>> Thanks,
>> Richard
>>