[LLVMdev] Matching addsub
gohman at apple.com
Tue Oct 18 10:51:44 PDT 2011
On Oct 17, 2011, at 6:40 PM, Hal Finkel wrote:
> On Mon, 2011-10-17 at 17:33 -0700, Dan Gohman wrote:
>> On Oct 17, 2011, at 3:40 PM, Hal Finkel wrote:
>>> How should I go about matching floating-point addsub-like vector
>>> instructions? My first inclination is to write something which matches
>>> build_vector 1.0, -1.0, and then use that in combination with a match on
>>> fadd, but that does not seem to work. I think this is because
>>> BUILD_VECTOR cannot ever be "Legal", and so it is always turned into a
>>> constant load before instruction selection.
>> Trying to keep a vector producer this naive about the target instruction
>> set is awkward. If the vector producer doesn't know whether the target
>> has an addsub or subadd instruction, how is it to know the best way to
>> expand complex multiplication? It's likely to get suboptimal code in many cases.
>> On the other hand, a producer that knows that the target has certain
>> instructions could pretty easily just use appropriate intrinsics that can
>> be mapped directly to the desired instructions. This way, there's no need
>> for it to play pictionary with the backend, drawing out its desired
>> semantics in terms of primitive operations and expecting codegen to
>> rediscover what was meant by pattern-matching.
> In general, I agree with you. It is always questionable how far to
> attempt to go with idiom recognition. However, some kind of combination
> add/subtract is a common vector operation, and so I seem to be left with
> three choices:
> 1. Decompose the operation and then attempt to recognize the
> 2. Add an additional LLVM instruction.
> 3. Add a number of target-specific special cases into the higher-level
> I am not sure which is better, but I'd prefer to stay away from choice
> (3) as much as practical in favor of one of the first two options. Would
> you support adding some kind of vector_faddsub LLVM instruction?
I assume you mean an operator that subtracts even elements and adds odd
elements; correct me if I'm wrong. I agree that this operation is fairly
common; I think a general intrinsic to do this would be useful for people
working on such targets.
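To pin down the semantics assumed here, a minimal reference sketch of the operation on plain Python lists follows. The name `addsub` and its argument order are hypothetical and chosen only for illustration; this is not an existing LLVM intrinsic.

```python
def addsub(a, b):
    """Lane-wise reference: a[i] - b[i] in even lanes, a[i] + b[i] in odd lanes.

    Hypothetical helper for illustration only; not an LLVM intrinsic.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x - y if i % 2 == 0 else x + y
            for i, (x, y) in enumerate(zip(a, b))]


if __name__ == "__main__":
    print(addsub([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
```

A target with the opposite convention (add in even lanes, subtract in odd lanes) would just flip the parity test.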
> Also, there is a precedent, in some sense, for choices (1) and (2) in
> how fneg %a is serialized (as fsub -0.0, %a). Perhaps we could recognize
> fadd (fmul <-1.0, 1.0>, %a), %b and turn it into something else for
> instruction selection in a similar way.
The problem with "fadd(fmul <-1.0,1.0>, %a), %b" is that it requires a
separate fmul instruction at the LLVM IR level, and you never want to
actually execute an fmul. If optimization obscures the pattern, you end
up with an unwanted fmul.
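As a rough illustration of what that pattern computes, the sketch below models the two IR operations as lane-wise list functions. The helper names `fmul`, `fadd`, and `addsub_via_fmul` are stand-ins invented for this example, not LLVM APIs. The sign vector alternates -1.0 and 1.0, so the result is b - a in even lanes and b + a in odd lanes, at the cost of a multiply per lane unless the backend recognizes the whole pattern.

```python
def fmul(a, b):
    """Lane-wise multiply (stand-in for a vector fmul)."""
    return [x * y for x, y in zip(a, b)]


def fadd(a, b):
    """Lane-wise add (stand-in for a vector fadd)."""
    return [x + y for x, y in zip(a, b)]


def addsub_via_fmul(a, b):
    """Models fadd(fmul(<-1.0, 1.0, ...>, a), b)."""
    # Alternating sign vector: -1.0 in even lanes, 1.0 in odd lanes.
    signs = [-1.0 if i % 2 == 0 else 1.0 for i in range(len(a))]
    return fadd(fmul(signs, a), b)


if __name__ == "__main__":
    print(addsub_via_fmul([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
```

If the multiply and the add end up separated, for example because the multiply gets hoisted or combined with something else, there is nothing left to match and the multiply is executed for real.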
The fneg case is simpler because fsub with a constant is just one instruction
in LLVM IR. FWIW, the lack of a proper way to do fneg also happens to
be a bug.
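For reference, the constant in that idiom is -0.0 rather than 0.0 because of signed zeros under IEEE 754 round-to-nearest arithmetic. The small sketch below shows the difference using Python floats; the function names are hypothetical and only model the arithmetic, not LLVM itself.

```python
import math


def fneg_via_fsub(x):
    """Models 'fsub -0.0, x'."""
    return -0.0 - x


def fneg_naive(x):
    """Models 'fsub 0.0, x', which gets the sign of zero wrong."""
    return 0.0 - x


def is_negative_zero(x):
    """True only for -0.0 (copysign exposes the sign bit of a zero)."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


if __name__ == "__main__":
    # Negating +0.0 should give -0.0; only the -0.0 form does.
    print(is_negative_zero(fneg_via_fsub(0.0)))
    print(is_negative_zero(fneg_naive(0.0)))
```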