[PATCH] D30086: Add generic IR vector reductions

Tue Feb 21 04:42:36 PST 2017

rengolin added a comment.

In https://reviews.llvm.org/D30086#681823, @mkuper wrote:

> What I'd really like to avoid is having a nominally "target-independent" intrinsic, which can, in practice, only be handled by one or two targets.

Agreed.

> a) If we know this is not the right generic representation, this should not go in as a target-independent intrinsic. I haven't seen anyone raise any "deal-breaker" issues so far, though.

This seems to tick all the boxes, and have been reviewed for both ARM back-ends (including the future SVE) and x86 (including the future AVX1024).

We should really have people from other targets to make sure this makes sense, but at this point, I think the changes will be minimal, since any architecture specific change can be done at the lowering stage on their own back-ends.

> b) If we think this is the right generic representation, but we're not sure, we can introduce these intrinsics as llvm.experimental. For an experimental intrinsic, I think it makes more sense to have targets opt-in like this patch does (but we probably still want default lowering for targets that don't opt-in).

I think, for now, this is the best thing to do.

The log2 shuffle pattern is canonical and it works well with the "lower how you can" premise already. It may be cumbersome for AVX512 (long pyramid), but it's not unbearable. It will become unbearable for AVX-1024 and impossible for SVE to use the same pattern, so those targets can begin using it on the side, and then we can extend to the remaining targets on their pace.

The alternative is to have an **SVE/AVX specific intrinsic**, which means the same thing as the pyramid, and doesn't need to be handled by other targets. Clang (via intrinsics) would generate them as well as the vectorisers, and that would be completely opaque to any other pass (as external function calls). But this would make it harder to extend support for the other smaller vector sizes (renames, tests, etc) if we want to move it to the canonical form.

I personally think that this (concept) is a good representation of a horizontal reduction, so could easily be a canonical form one day. The particular shape may need to change, but that's orthogonal.

> c) If we are sure about this, then we should do just it for all targets, and have late lowering by default. I'm not really the right person to determine that, though. I guess that's a question for the backend owners.

If we make it experimental, we don't need to push any target to support it ASAP. Furthermore, both SVE and AVX 1024 are, themselves, "experimental", and this change is mostly for their benefit, so I don't see why this **shouldn't** be experimental at this stage. It'll also give us time to change and other targets to adapt it if needed.

Repository:
  rL LLVM

https://reviews.llvm.org/D30086