[LLVMdev] [x86 codegen] 3DNow! intrinsics not behaving as expected.

Michael Spencer bigcheesegs at gmail.com
Thu Apr 14 12:16:45 PDT 2011

I finally got all of the 3DNow! instruction intrinsics and builtins
into LLVM and Clang. However, while testing them, I noticed that
they produce incorrect results.

For example:

#include <stdio.h>

typedef float V2f __attribute__((vector_size(8)));

int main() {
  V2f dest, a = {1.0, 3.0}, b = {10.0, 3.5};
  dest = __builtin_ia32_pfadd(a, b);
  printf("(%f, %f)\n", dest[0], dest[1]);
  return 0;
}

This should print (11.000000, 6.500000). Instead, the output varies
with the optimization level: generally one of the two values is
correct and the other is -nan.
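
For comparison, the expected result can be computed without any 3DNow!
support. The following is only a scalar sketch of what pfadd does
(a lane-wise single-precision add); v2f_ref and pfadd_ref are
hypothetical names used for illustration and are not part of LLVM,
Clang, or the patches.

```c
#include <stddef.h>

/* Hypothetical scalar stand-in for a two-lane packed float vector. */
typedef struct {
    float lane[2];
} v2f_ref;

/* Lane-wise single-precision add: the operation pfadd performs
 * on the two floats packed into a 64-bit MMX register. */
v2f_ref pfadd_ref(v2f_ref a, v2f_ref b) {
    v2f_ref r;
    for (size_t i = 0; i < 2; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}
```

Calling pfadd_ref with a = {1.0, 3.0} and b = {10.0, 3.5} gives
{11.0, 6.5}, which is the value the builtin should produce.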

I stepped through the program in a debugger: the pfadd instruction
executes correctly and the MMX register contains the correct values.
The code that sets up the stack for the printf call seems to be what
corrupts them.
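
One hypothesis worth checking, which is not confirmed anywhere in this
report: the MMX registers alias the x87 register stack, and any MMX or
3DNow! instruction other than emms/femms marks all eight x87 slots as
valid. A later fld then sees a full stack and, with the invalid-operation
exception masked, loads a NaN instead of the value. The toy model below
sketches that behaviour under those assumptions. It is not real hardware
code, and every name in it (x87_model, mmx_op, x87_fld, x87_fstp) is
hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* Toy model of the x87 register stack: 8 slots, each empty or valid. */
typedef struct {
    double reg[8];
    bool   valid[8];  /* true = slot is occupied */
    int    top;       /* index of ST(0) */
} x87_model;

/* State after finit or emms/femms: every slot empty. */
void x87_reset(x87_model *s) {
    for (int i = 0; i < 8; ++i) {
        s->reg[i] = 0.0;
        s->valid[i] = false;
    }
    s->top = 0;
}

/* Assumed effect of any MMX/3DNow! instruction except emms/femms:
 * every slot is tagged valid and TOP is set to 0. */
void mmx_op(x87_model *s) {
    for (int i = 0; i < 8; ++i) {
        s->valid[i] = true;
    }
    s->top = 0;
}

/* fld: push a value. Pushing onto an occupied slot is a stack
 * overflow; the masked response is modelled as loading a NaN. */
void x87_fld(x87_model *s, double v) {
    s->top = (s->top + 7) & 7;  /* decrement TOP modulo 8 */
    s->reg[s->top] = s->valid[s->top] ? NAN : v;
    s->valid[s->top] = true;
}

/* fstp: pop ST(0), marking its slot empty again. */
double x87_fstp(x87_model *s) {
    double v = s->reg[s->top];
    s->valid[s->top] = false;
    s->top = (s->top + 1) & 7;  /* increment TOP modulo 8 */
    return v;
}
```

In this model, mmx_op followed by two fld/fstp pairs returns a NaN
for the first value and the correct number for the second, which would
match the symptom of one correct value and one -nan. Calling x87_reset
(standing in for emms/femms) between mmx_op and the first x87_fld makes
both values come out correctly.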

Here's the assembly generated at -O3 for the above:

	.file	"intrin.c"
	.globl	main
	.align	16, 0x90
	.type	main,@function
main:                                   # @main
# BB#0:                                 # %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$56, %esp
	movl	$1077936128, -12(%ebp)  # imm = 0x40400000
	movl	$1065353216, -16(%ebp)  # imm = 0x3F800000
	movl	$1080033280, -4(%ebp)   # imm = 0x40600000
	movl	$1092616192, -8(%ebp)   # imm = 0x41200000
	movq	-16(%ebp), %mm0
	pfadd	-8(%ebp), %mm0
	movq	%mm0, -24(%ebp)
	flds	-20(%ebp)
	fstpl	12(%esp)
	flds	-24(%ebp)
	fstpl	4(%esp)
	movl	$.L.str, (%esp)
	calll	printf
	xorl	%eax, %eax
	addl	$56, %esp
	popl	%ebp
	.size	main, .Ltmp0-main

	.type	.L.str,@object          # @.str
	.section	.rodata.str1.1,"aMS",@progbits,1
	.asciz	 "%f, %f\n"
	.size	.L.str, 8

	.section	".note.GNU-stack","",@progbits
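
As a sanity check on the listing above, the four immediates stored by
the movl instructions can be decoded as IEEE-754 single-precision bit
patterns. This is a small sketch; bits_to_float is a hypothetical
helper name.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
 * memcpy avoids the aliasing problems of a pointer cast. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Decoded this way, 0x3F800000 is 1.0 and 0x40400000 is 3.0 (stored at
-16(%ebp) and -12(%ebp), i.e. a), while 0x41200000 is 10.0 and
0x40600000 is 3.5 (stored at -8(%ebp) and -4(%ebp), i.e. b). So the
operands are materialized correctly before the pfadd.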

Attached are my patches to enable support for this. I'd like to be
done with this, because 3DNow! isn't even supported anymore. I was
just adding these to learn tblgen and fill in some of the MSVC
intrinsic headers.

- Michael Spencer
