[clang] [llvm] Clang: convert `__m64` intrinsics to unconditionally use SSE2 instead of MMX. (PR #96540)

Wed Jun 26 08:12:18 PDT 2024

jyknight wrote:

> Really, the question is whether we plan to completely drop support for the x86_mmx type (including inline asm operands/results)

Yes, I do think it would be good to eliminate the type.

For inline-asm, we could switch to using a standard IR vector type for "y" constraint operands/results, and teach IR lowering to copy to/from MMX registers at the border. This is basically what Clang already does at the IR level; we'd be pushing that down into IR lowering. It would have some minor performance impact on anything passing MMX values directly between two inline-asm statements (redundant movdq2q/movq2dq), but, so far as I can tell, almost nobody ever uses the "y" constraint anyway -- MMX registers are more often loaded from and stored to memory inside of an inline-asm instead. Also, clearly nothing still using MMX can be _that_ performance sensitive, or it would've been migrated to SSE/AVX sometime in the last 20 years.

One more option, which is made trivial by eliminating the x86_mmx IR type, would be to insert an "emms" after the return-value extraction for all inline-asm statements which are marked with either MMX clobbers or a "y" constraint. There'd no longer be any need for special logic to track where to insert it, since we can be sure there is no live MMX state to worry about. That comes with more potential for performance impact, of course (but, again, maybe that doesn't actually matter).
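
For concreteness, here is a rough sketch of the pattern described above as the more common one: the MMX register is loaded from and stored to memory inside the inline-asm, and the compiler only learns about it through a clobber. The helper name `add_bytes` and the choice of `paddb` are made up for illustration and do not come from the PR. The `emms` is written by hand here, which is the step the option above would have the compiler insert automatically for statements with MMX clobbers. The non-MMX branch is only a portable fallback so the snippet builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): adds two 64-bit values as eight
 * independent bytes, each wrapping modulo 256. */
static uint64_t add_bytes(uint64_t a, uint64_t b) {
#if defined(__MMX__) && defined(__GNUC__)
    uint64_t out;
    /* No "y" constraint: every operand is a memory operand, and %mm0 is
     * named only in the clobber list.
     * Assumption: no x87 values are live around this statement. emms also
     * resets the x87 tag word, so a complete clobber list would name the
     * x87 stack registers as well. */
    __asm__ volatile(
        "movq  %1, %%mm0\n\t"   /* load a from memory into mm0 */
        "paddb %2, %%mm0\n\t"   /* bytewise wrapping add of b */
        "movq  %%mm0, %0\n\t"   /* store the result back to memory */
        "emms"                  /* leave MMX state before returning */
        : "=m"(out)
        : "m"(a), "m"(b)
        : "mm0");
    return out;
#else
    /* Portable fallback with the same bytewise semantics. */
    uint64_t out = 0;
    for (int i = 0; i < 8; i++) {
        uint64_t sum = ((a >> (8 * i)) + (b >> (8 * i))) & 0xFF;
        out |= sum << (8 * i);
    }
    return out;
#endif
}
```

Since no value crosses the asm boundary in an MMX register here, replacing the x86_mmx type with a plain vector type for "y" operands should not affect code written this way.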

https://github.com/llvm/llvm-project/pull/96540