[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)
David Green via cfe-commits
cfe-commits at lists.llvm.org
Mon Aug 12 09:44:37 PDT 2024
davemgreen wrote:
AArch64 has udot and sdot instructions (and a usdot instruction). They perform a "partial" reduction though, producing a v4i32 from two v16i8 inputs. We would like to use those from the vectorizer and have recently added a partial-reduction intrinsic, but doing it with a higher-level intrinsic might be a little nicer.
It seems a "udot" can already be represented as `vecreduce.add(mul(zext, zext))`, and fdot is simpler still. Is there any particular reason to add a new intrinsic for it if it is already representable as a vecreduce? It would also be a shame if the new intrinsic couldn't be used with the actual AArch64 instructions.
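To make the distinction concrete, here is a rough scalar model in C++ of the two shapes being compared. It is only a sketch: the names `full_udot`, `partial_udot` and `reduce_add` are hypothetical, and the accumulator operand on the partial form reflects how the AArch64 udot instruction accumulates into its destination rather than anything specified in this PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<std::uint8_t, 16>;
using V4i32 = std::array<std::uint32_t, 4>;

// Full reduction, i.e. vecreduce.add(mul(zext(a), zext(b))):
// every byte product is widened to 32 bits and summed into one scalar.
// (Hypothetical helper name, for illustration only.)
std::uint32_t full_udot(const V16i8 &a, const V16i8 &b) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::uint32_t{a[i]} * std::uint32_t{b[i]};
    return sum;
}

// Partial reduction in the style of AArch64 udot: each 32-bit lane
// accumulates the dot product of its own group of four adjacent bytes,
// so two v16i8 inputs yield a v4i32 rather than a scalar.
// (Assumption: modelled with an explicit accumulator argument.)
V4i32 partial_udot(V4i32 acc, const V16i8 &a, const V16i8 &b) {
    for (std::size_t lane = 0; lane < acc.size(); ++lane)
        for (std::size_t k = 0; k < 4; ++k)
            acc[lane] += std::uint32_t{a[4 * lane + k]} *
                         std::uint32_t{b[4 * lane + k]};
    return acc;
}

// Final horizontal add over the four partial sums.
std::uint32_t reduce_add(const V4i32 &v) {
    return v[0] + v[1] + v[2] + v[3];
}
```

With a zero accumulator, `reduce_add(partial_udot(...))` gives the same value as `full_udot(...)`, which is why a vectorizer can keep the partial form inside a loop and do the horizontal add once at the end.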
@SamTebbs33 @NickGuy-Arm FYI.
https://github.com/llvm/llvm-project/pull/102872