[PATCH] D64460: AMDGPU: Add 24-bit mul intrinsics

Wed Jul 10 15:56:44 PDT 2019

arsenm added a comment.

In D64460#1579467 <https://reviews.llvm.org/D64460#1579467>, @rampitec wrote:

> Replacing it so early means we could miss other optimizations, especially on the DAG. Was any performance evaluation performed?

It's about as late as possible for the IR. This isn't really moving it that far from where this already happens, which is typically combine 1. The library function call is still using the standard IR operations, and I would expect the useful optimizations to have happened by this point. We already understand the known bits for these in the DAG, so I don't think they should hurt too much (although they may still need ComputeNumSignBitsForTargetNode). It's possible combine opportunities will appear after lowering, and they will need to be implemented for the mul24 nodes.
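
For reference, here is a rough sketch of the semantics I'd expect from a 24-bit multiply that returns the low 32 bits of the product, and why known bits matter for it. This is a standalone reference model, not code from the patch: the names mul_u24, mul_i24, sext24 and fitsU24 are hypothetical, and the behavior shown (only the low 24 bits of each operand are used) is an assumption about the operation rather than a statement of what the intrinsics in this review do.

```cpp
#include <cstdint>

// Hypothetical reference model; not the LLVM or hardware implementation.

// Sign-extend the low 24 bits of x to a 32-bit signed value.
static int32_t sext24(uint32_t x) {
  uint32_t low = x & 0xFFFFFFu;
  // Flip bit 23, then subtract it back out; this fills the top 8 bits
  // with copies of bit 23.
  return static_cast<int32_t>((low ^ 0x800000u) - 0x800000u);
}

// Unsigned 24-bit multiply: ignore the top 8 bits of each operand and
// return the low 32 bits of the product.
static uint32_t mul_u24(uint32_t a, uint32_t b) {
  uint64_t p = static_cast<uint64_t>(a & 0xFFFFFFu) * (b & 0xFFFFFFu);
  return static_cast<uint32_t>(p);
}

// Signed 24-bit multiply: sign-extend from bit 23, multiply, and return
// the low 32 bits of the product.
static int32_t mul_i24(int32_t a, int32_t b) {
  int64_t p = static_cast<int64_t>(sext24(static_cast<uint32_t>(a))) *
              sext24(static_cast<uint32_t>(b));
  return static_cast<int32_t>(static_cast<uint32_t>(p));
}

// The known-bits condition for the unsigned case: the top 8 bits are zero.
static bool fitsU24(uint32_t x) { return (x >> 24) == 0; }
```

When both operands satisfy fitsU24, mul_u24(a, b) equals the ordinary 32-bit a * b, which is the property a known-bits query has to establish before the replacement is safe. The signed case needs the analogous sign-bit count instead.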

I don't know what I would try this on, besides Jeff's benchmark, which this is intended to fix. The only lit tests that changed were improvements.

https://reviews.llvm.org/D64460