[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

Tue Aug 13 02:07:01 PDT 2024

SamTebbs33 wrote:

> > It would seem like a "udot" can be represented already as `vecreduce.add(mul(zext, zext))`, and fdot is simpler still. Is there any particular reason to add a new intrinsic for it if it is already representable as a vecreduce? And it would feel like a shame if it couldn't be used with the actual AArch64 instructions.
> 
> There was a whole discussion on dot in https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294/13; check out `kparzysz`'s posts. Essentially, yes, we could represent dot this way, but then we would not be able to benefit from the ubiquity of the hardware-specific dot lowerings that are showing up across GPU and convolution use cases.
> 

Why would using the partial reduction intrinsic stop you from using hardware-specific dot product lowerings for GPUs? The lowering is quite trivial; see [here](https://github.com/llvm/llvm-project/pull/101010). I think it would be best not to introduce another way of doing the same thing.
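
As a rough illustration of the equivalence being discussed, here is a scalar C++ sketch of what `vecreduce.add(mul(zext, zext))` computes for unsigned 8-bit inputs, next to a partial reduction that folds the products into four accumulator lanes. This is not LLVM code: the names `udot_full`, `udot_partial`, and `horizontal_sum` are hypothetical, and the lane grouping (four adjacent elements per lane) is an assumption chosen for this example only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of vecreduce.add(mul(zext(a), zext(b))):
// widen each u8 element to u32, multiply pairwise, and sum everything.
std::uint32_t udot_full(const std::vector<std::uint8_t>& a,
                        const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += static_cast<std::uint32_t>(a[i]) * static_cast<std::uint32_t>(b[i]);
    }
    return acc;
}

// Scalar model of a partial reduction: the widened products are folded
// into 4 accumulator lanes instead of a single scalar. The grouping used
// here (4 adjacent elements per lane, wrapping around) is an assumption
// of this sketch.
std::array<std::uint32_t, 4> udot_partial(std::array<std::uint32_t, 4> acc,
                                          const std::vector<std::uint8_t>& a,
                                          const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc[(i / 4) % 4] +=
            static_cast<std::uint32_t>(a[i]) * static_cast<std::uint32_t>(b[i]);
    }
    return acc;
}

// Final horizontal add that turns the partial result into the full dot product.
std::uint32_t horizontal_sum(const std::array<std::uint32_t, 4>& lanes) {
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Summing the four lanes of `udot_partial` gives the same value as `udot_full`, which is the sense in which a dot product is expressible as a (partial) reduction followed by a final add.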

https://github.com/llvm/llvm-project/pull/102872