[PATCH] [x86] Improve build_vector v8i16 codegen
hfinkel at anl.gov
Mon Jan 26 10:06:33 PST 2015
----- Original Message -----
> From: "Bruno Cardoso Lopes" <bruno.cardoso at gmail.com>
> To: "bruno cardoso" <bruno.cardoso at gmail.com>, nrotem at apple.com, spatel at rotateright.com, anemet at apple.com,
> chandlerc at gmail.com
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Monday, January 26, 2015 11:56:27 AM
> Subject: Re: [PATCH] [x86] Improve build_vector v8i16 codegen
> Thanks for the detailed measurements Quentin.
> I've received a report about a performance degradation regarding this
> code against hand-written pinsrw assembly and trusted the source =(
> I'll try it out myself and get back if it really shows up some
> performance improvements.
I'm not particularly familiar with the problem or the solution space, so take this with a grain of salt...
I've seen this kind of thing vary in utility depending on whether the instruction sequence in question is on the critical path or not. If it is on the critical path, then having a longer sequence with more ILP is normally a win. Otherwise, the shorter sequence is normally a win.
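To make the critical-path point concrete, here is a toy sketch, not a model of any real x86 core. It compares a serial chain of eight dependent inserts with a hypothetical split form that builds two four-insert halves independently and merges them with one extra instruction. The uniform 2-cycle latency, the instruction names, and the split shape itself are assumptions for illustration only. They are not measurements and not what the patch actually emits.

```python
def critical_path(instrs):
    """Return the longest latency-weighted dependency path.

    instrs maps an instruction name to (latency, [names it depends on]).
    """
    finish = {}

    def done(name):
        # Finish time = own latency + latest finish time among dependencies.
        if name not in finish:
            latency, deps = instrs[name]
            finish[name] = latency + max((done(d) for d in deps), default=0)
        return finish[name]

    return max(done(n) for n in instrs)


LAT = 2  # assumed latency in cycles for every instruction (hypothetical)

# Serial form: each insert depends on the previous one.
serial = {f"ins{i}": (LAT, [f"ins{i-1}"] if i else []) for i in range(8)}

# Split form (hypothetical): two independent 4-insert chains plus one merge.
split = {}
for half in ("lo", "hi"):
    for i in range(4):
        split[f"{half}{i}"] = (LAT, [f"{half}{i-1}"] if i else [])
split["merge"] = (LAT, ["lo3", "hi3"])

for name, seq in (("serial", serial), ("split", split)):
    print(name, "instructions:", len(seq), "critical path:", critical_path(seq))
```

Under these assumed numbers the split form has one more instruction but a shorter dependency chain, which is the trade-off described above: it only pays off if something is actually waiting on the result.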
I'll also point out that we have the lib/CodeGen/MachineCombiner.cpp pass, which is specifically designed to do this kind of sequence substitution only in cases where we don't increase the critical path length. If my hypothesis is right, this would be a good potential use of that pass (which is theoretically target independent, although currently used only by AArch64).
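The kind of decision I have in mind can be sketched as follows. This is only an illustration of the idea, not the MachineCombiner implementation or its API: the function name, the candidate tuples, and the cycle counts are all hypothetical. "Slack" here means how many cycles the result can be delayed without lengthening the surrounding critical path.

```python
def pick_sequence(candidates, slack):
    """Pick the shortest sequence that does not lengthen the critical path.

    candidates: list of (name, instruction_count, latency_in_cycles).
    slack: cycles the result may be delayed without hurting the critical path.
    """
    # The fastest candidate defines the latency we must not exceed by more
    # than the available slack.
    best_latency = min(lat for _, _, lat in candidates)
    allowed = [c for c in candidates if c[2] <= best_latency + slack]
    # Among acceptable candidates, prefer fewer instructions, then lower latency.
    return min(allowed, key=lambda c: (c[1], c[2]))[0]


# Hypothetical numbers: a short high-latency form and a longer low-latency form.
CANDIDATES = [("serial", 8, 16), ("split", 9, 10)]

print(pick_sequence(CANDIDATES, slack=0))  # on the critical path
print(pick_sequence(CANDIDATES, slack=6))  # off the critical path
```

With no slack the lower-latency form is chosen even though it is longer. Once there is enough slack to absorb the extra latency, the shorter form is chosen instead.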
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory