[PATCH] Add a Scalarize pass
milseman at apple.com
Wed Nov 13 12:01:42 PST 2013
This comment pertains purely to implementation details of the pass.
I notice that you seem to lazily expand values (which may exist very far away from the current visit point) and just iterate over the blocks in any given order. Would you perhaps be better off if you could instead visit (almost) all the defs before the uses (caveat: backedges)? You should be able to use the ReversePostOrderTraversal class to visit (non-backedge) defs before their uses. For the backedge case, you could just defer those Phis until the end. Then you don't ever need any lazy expansion, and can just map vector values to scalar values.
I notice inside Scatterer's operator , you might return null at the bottom of the function. Is this safe? It seems like users pass on the result directly to the builder, which sounds dangerous. There's no test in this patch with dynamic InsertValues, so I'm not sure if it's a case you're planning on handling or not. Either way, a comment/assert would be nice :)
On Nov 7, 2013, at 6:34 AM, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:
> Thanks for the reviews. New in this version:
> - The pass is called Scalarizer rather than DecomposeVectors.
> - Alignment is taken into account.
> - Only known metadata is kept.
> - Loads and stores are kept as-is if the elements are not a whole number of bytes in size. It's still useful to decompose vectors of <N x i1> comparison results though -- in fact that was one of the original motivations.
> - Vector GEPs are scalarised too. (I hadn't realised they were allowed, thanks.)
> - There are more tests. Let me know if there's something else I should be testing for though.
> I tried adapting the llvm-stress wrapper in:
> to run opt at -O3 with -enable-scalarizer -scalarize-load-store. The number of runs and instruction sizes were both 1000. No problems were reported.
> The pass doesn't yet try to scalarise function calls. At the moment we just gather/reassemble vectors whenever we have a use that isn't understood. That's enough for my target but it could be extended in future. (In the SystemZ llvmpipe code that I'm working with, the only function call is to sqrt.v4f32, but these testcases wouldn't benefit by scalarising that early.)
> CHANGE SINCE LAST DIFF
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
More information about the llvm-commits