[LLVMdev] Add/sub with carry; widening multiply

Chris Lattner sabre at nondot.org
Wed Nov 21 17:59:53 PST 2007

On Wed, 21 Nov 2007, Jonathan Brandmeyer wrote:
> I've been playing around with llvm lately and I was wondering something 
> about the bitcode instructions for basic arithmetic.  Is there any plan 
> to provide instructions that perform widening multiply, or add with 
> carry?

Nope, there is no need; see below.

> Alternatively, would something like following get reduced to a single 
> multiply and two stores on arch's that support wide multiplies, like 
> x86-32 and ARM?

Yes, it already does.  Here's a compilable example:

define void @mulw(i32* %hidest, i32* %lodest, i32 %lhs, i32 %rhs) {
   sext i32 %lhs to i64
   sext i32 %rhs to i64
   mul i64 %0, %1
   trunc i64 %2 to i32
   lshr i64 %2, 32
   trunc i64 %4 to i32
   store i32 %3, i32* %lodest
   store i32 %5, i32* %hidest
   ret void
}

On x86-32, I get one multiply: llvm-as < t.ll | llc -march=x86

 	movl	12(%esp), %eax
 	imull	16(%esp)
 	movl	8(%esp), %ecx
 	movl	%eax, (%ecx)
 	movl	4(%esp), %eax
 	movl	%edx, (%eax)

ppc32 requires two, because the ISA has no single instruction that produces 
both the high and low halves of the product: 
llvm-as < t.ll | llc -march=ppc32

 	mullw r2, r5, r6
 	mulhw r5, r5, r6
 	stw r2, 0(r4)
 	stw r5, 0(r3)

ARM only requires one:  llvm-as < t.ll | llc -march=arm
 	smull r3, r2, r2, r3
 	str r3, [r1]
 	str r2, [r0]
 	bx lr
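
For readers coming from C, the IR pattern above (sign-extend, one wide multiply, then split the product) is roughly what a frontend would be expected to emit for source like the following. This is only a sketch and not part of the original thread: the C function name mulw and the choice of uint32_t for the two destinations are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the @mulw IR function: widen both operands
   to 64 bits, multiply once, then split the product into two 32-bit halves.
   The name and the uint32_t destination types are illustrative assumptions. */
void mulw(uint32_t *hidest, uint32_t *lodest, int32_t lhs, int32_t rhs) {
    /* Corresponds to the two sext instructions followed by mul i64. */
    int64_t wide = (int64_t)lhs * (int64_t)rhs;

    /* Work on the unsigned bit pattern so the right shift is a logical
       shift, matching lshr in the IR. */
    uint64_t bits = (uint64_t)wide;

    *lodest = (uint32_t)bits;          /* trunc i64 to i32 (low half) */
    *hidest = (uint32_t)(bits >> 32);  /* lshr by 32, then trunc (high half) */
}
```

Calling mulw(&hi, &lo, 65536, 65536) should leave hi == 1 and lo == 0, since the full product is 2^32.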


We currently support integers up to i64, and we plan to support arbitrary 
widths beyond i64 in the future.  Support for i128 is only partial for now.  
For example:

define void @mulw(i128* %lhs, i128* %rhs, i128* %dst) {
   %LHS = load i128* %lhs
   %RHS = load i128* %rhs
   %Res = add i128 %LHS, %RHS
   store i128 %Res, i128* %dst
   ret void
}

compiles to: llvm-as < t.ll | llc -march=ppc32

 	lwz r2, 0(r4)
 	lwz r6, 4(r4)
 	lwz r7, 8(r4)
 	lwz r4, 12(r4)
 	lwz r8, 0(r3)
 	lwz r9, 4(r3)
 	lwz r10, 8(r3)
 	lwz r3, 12(r3)
 	addc r3, r3, r4
 	adde r4, r10, r7
 	adde r6, r9, r6
 	adde r2, r8, r2
 	stw r2, 0(r5)
 	stw r6, 4(r5)
 	stw r4, 8(r5)
 	stw r3, 12(r5)

which is "perfect".
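
To spell out what the addc/adde chain is doing, here is a rough C sketch of the same 128-bit add written by hand over four 32-bit limbs. It is an illustration only: the u128_limbs type, the add128 name, and the limb order (w[0] most significant, matching the big-endian offsets 0..12 in the ppc32 output) are assumptions, not anything LLVM defines.

```c
#include <stdint.h>

/* Hypothetical limb layout: w[0] is the most significant 32-bit word and
   w[3] the least significant, mirroring offsets 0, 4, 8, 12 on ppc32. */
typedef struct { uint32_t w[4]; } u128_limbs;

/* Add two 128-bit values limb by limb, propagating the carry from the
   least significant limb upward. The carry out of the top limb is
   dropped, so the result wraps modulo 2^128. */
void add128(const u128_limbs *lhs, const u128_limbs *rhs, u128_limbs *dst) {
    uint32_t carry = 0;
    for (int i = 3; i >= 0; --i) {
        /* Two 32-bit limbs plus a carry of 0 or 1 always fit in 64 bits. */
        uint64_t sum = (uint64_t)lhs->w[i] + rhs->w[i] + carry;
        dst->w[i] = (uint32_t)sum;      /* low 32 bits are the result limb */
        carry = (uint32_t)(sum >> 32);  /* bit 32 is the carry into the next limb */
    }
}
```

The first loop iteration plays the role of addc (no incoming carry) and the remaining three play the role of adde. Writing a single add on a wide integer type and letting the code generator pick those instructions avoids having to express the carry by hand like this.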

I'd much rather finish up support for arbitrarily wide integers than add 
special purpose code for add/sub with carry, etc.  If you notice any cases 
where this technique does not produce optimal code, please let us know!


