[PATCH] D14588: [X86][SSE] Transform truncation from v8i32/v16i32 to v8i8/v16i8 into bitand and X86ISD::PACKUS operations during DAG combine.

Mon Nov 16 11:09:20 PST 2015

congh added a comment.

In http://reviews.llvm.org/D14588#289556, @RKSimon wrote:

> I've added the current codegen to the vector-trunc.ll tests for comparison so please can you rebase against that?

I have rebased my repo, and all tests pass with this patch.

> I wonder if it would be better to combine to bitcast/shuffle pairs instead of specific X86ISD nodes? And then focus on improving the existing shuffle lowering with PACKUS (e.g. I don't think we're making use of PACKUSDW at all yet).

Without this patch, the truncation from v16i32 to v16i8 is first converted, after type legalization, into many extract-element operations, scalar truncations, and a build_vector, which is very difficult to lower into PACKUS (at least, the lowering would be far from elegant). So what you are suggesting is to first combine it into a shuffle on a v64i8 that is bitcast from v16i32, right? I need to try this approach and see whether the instructions are easy to lower after type legalization.
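
To illustrate why the bitand is needed before PACKUS, here is a minimal scalar sketch in C++. It is not code from the patch: the names packus_lane, trunc_lane, and_then_packus_lane and truncate_v16i32 are hypothetical, and the model collapses the two hardware pack steps (i32 to i16, then i16 to i8) into a single clamp to [0, 255]. That simplification is safe here only because the masked values already fit in 8 bits.

```cpp
#include <array>
#include <cassert>   // assert, used in the usage example below
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of one PACKUS lane: a signed input is
// saturated into the unsigned 8-bit range [0, 255].
inline std::uint8_t packus_lane(std::int32_t x) {
    if (x < 0) return 0;
    if (x > 255) return 255;
    return static_cast<std::uint8_t>(x);
}

// Plain truncation: keep only the low 8 bits of the lane.
inline std::uint8_t trunc_lane(std::int32_t x) {
    return static_cast<std::uint8_t>(static_cast<std::uint32_t>(x) & 0xFFu);
}

// The combine discussed in this review: mask first, then pack.
// After the mask the value is in [0, 255], so saturation never fires
// and the pack behaves exactly like a truncation.
inline std::uint8_t and_then_packus_lane(std::int32_t x) {
    return packus_lane(x & 0xFF);
}

using V16i32 = std::array<std::int32_t, 16>;
using V16i8 = std::array<std::uint8_t, 16>;

// Lane-wise model of the v16i32 -> v16i8 truncation via bitand + pack.
inline V16i8 truncate_v16i32(const V16i32& v) {
    V16i8 out{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = and_then_packus_lane(v[i]);
    }
    return out;
}
```

For example, assert(packus_lane(256) != trunc_lane(256)) holds because the pack alone saturates, while assert(and_then_packus_lane(256) == trunc_lane(256)) holds because the mask removes the high bits first.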

http://reviews.llvm.org/D14588