[PATCH] D10964: [Codegen] Add intrinsics 'hadd*' and corresponding SDNodes for horizontal sum operation.
Bruno Cardoso Lopes
bruno.cardoso at gmail.com
Thu Jul 23 07:07:18 PDT 2015
bruno added a comment.
> IMO, the scalar version is more natural w.r.t the HADD operation itself and also more canonical. Also, as you mentioned vector version will need an extractelement which may have some performance impact also.
> In fact on X86, we need to do a DAGCombine of *ABSDIFF* and *HADD* to generate PSAD instruction which is our main objective for adding the two intrinsics. I feel vector version of HADD will complicate this DAGCombine.
I don't see how making it return a vector type would affect performance. You will need a DAG combine anyway here and you don't need to deal with the extractelement at all, besides on the case where you really want to bring it back to a scalar result, this should be up to the front-end to generate. Also, we have precedence in using inserts/extracts with other vector nodes as the canonical way to represent lower level vector instructions.
My point is that if we make the result available in the vector we have two advantages: (1) if there are additional vector operations on the result, we don't need to re-insert it into another vector to continue vector work and (2) if you want the result on a scalar, you might just use extractelement.
In x86, PSAD returns a vector and if you want to keep the result on a scalar you need to generate the appropriate movs to do the work, I believe the same applies to other archs with similar instructions. IMO, it doesn't seem natural to always return the value on a scalar and then having to insert it back to a vector to proceed with vector work.
Regarding the tests, please split and rename it to vector-hadd-128.ll and vector-hadd-256.ll, no need to split files based on the fact that they are testing the expansion. Once you have custom versions, just drive them by subtarget features (+ssse3, sse4*, etc). Take a look at vector-blend.ll and others for examples.
More information about the llvm-commits