[LLVMdev] Re: idea 10

Thu Jan 8 08:20:03 PST 2004

> I see more precisely what you mean, but I don't think it is that 
> straightforward to generalise the benefits multiple CPU on single host 
> programming to multiple CPU at multiple hosts. I don't think that both 
> cases involve the same techniques.

You are right, just think of shared memory.

> For example, in "single host" configuration you get a very low cost for 
> communicating data because you use memory instead of network. Memory is 
> [...]

You are right, that's exactly the point. 
I would never claim that the difference is tiny.

> 
> What would you consider as the core primitives of single host, multi-CPU 
> programming ?

If LLVM were really "low-level", then 
I would consider a generalized CALL to be such a primitive, 
i.e. one where not only the address of the subroutine is supplied, 
but the address of the host as well.

something like 

call host:subr_address
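
To make the idea concrete, here is a minimal C sketch of such a generalized call. Everything in it is an assumption made for illustration: the names call_target, generalized_call, LOCAL_HOST and increment are hypothetical, and the remote case is deliberately left unimplemented, because a real system would have to marshal the argument, send it to the host and wait for the reply.

```c
#include <stddef.h>

/* Hypothetical signature shared by every callable subroutine. */
typedef int (*subr_t)(int);

/* Hypothetical call target: the "host:subr_address" pair. */
typedef struct {
    unsigned host;  /* assumed host id; LOCAL_HOST means "this machine" */
    subr_t   subr;  /* address of the subroutine on that host */
} call_target;

enum { LOCAL_HOST = 0 };

/* Returns 0 and stores the result on success.
 * Returns -1 for a null subroutine or for any remote host,
 * since remote dispatch is not implemented in this sketch. */
int generalized_call(call_target t, int arg, int *result)
{
    if (t.subr == NULL || result == NULL)
        return -1;
    if (t.host == LOCAL_HOST) {
        *result = t.subr(arg);  /* degenerates to an ordinary call */
        return 0;
    }
    /* Remote case: marshal arg, send to t.host, wait for the reply. */
    return -1;
}

/* Example subroutine used to exercise the sketch. */
int increment(int x)
{
    return x + 1;
}
```

With host equal to LOCAL_HOST this is just an ordinary indirect call; any other host id is where the network cost discussed above would come in.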

However, LLVM is very high-level!

It is so high-level that I'd propose not including 
any such primitives in the LLVM language at all! 
(So, Sébastien, sorry for answering "rather yes" about UCP.)

Indeed, let's consider the Fib example:

//------------
int f(int n) {
  if(n<2) return 1;
  return f(n-1) + f(n-2);
}
//------------

Here is the LLVM bytecode output for it:

---------------------------------------------------
target endian = little
target pointersize = 32

implementation   ; Functions:

int %f(int %n) {
entry:
	br label %tailrecurse

tailrecurse:		; preds = %entry, %endif
	%cann-indvar = phi uint [ 0, %entry ], [ %next-indvar, %endif ]		; <uint> [#uses=2]
	%accumulator.tr = phi int [ 1, %entry ], [ %tmp.9, %endif ]		; <int> [#uses=2]
	%cann-indvar = cast uint %cann-indvar to int		; <int> [#uses=1]
	%n.tr-scale = mul int %cann-indvar, -2		; <int> [#uses=1]
	%n.tr = add int %n.tr-scale, %n		; <int> [#uses=2]
	%next-indvar = add uint %cann-indvar, 1		; <uint> [#uses=1]
	%tmp.1 = setle int %n.tr, 1		; <bool> [#uses=1]
	br bool %tmp.1, label %then, label %endif

then:		; preds = %tailrecurse
	ret int %accumulator.tr

endif:		; preds = %tailrecurse
	%tmp.5 = add int %n.tr, -1		; <int> [#uses=1]
	%tmp.3 = call int %f( int %tmp.5 )		; <int> [#uses=1]
	%tmp.9 = add int %tmp.3, %accumulator.tr		; <int> [#uses=1]
	br label %tailrecurse
}
---------------------------------------------------

BTW, I was quite impressed to find 
only one call in the LLVM output... :)
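
For those who prefer C to LLVM assembly, the output above can be read back roughly as the loop below. This is a hand translation, not compiler output, and the names f_loop, acc and n_tr are made up to mirror %accumulator.tr and %n.tr. The second recursive call, f(n-2), has turned into the next loop iteration with the partial sum kept in an accumulator, which is why only one call remains.

```c
/* Original recursive version, as in the source above. */
int f(int n)
{
    if (n < 2) return 1;
    return f(n - 1) + f(n - 2);
}

/* Hand-written C reading of the LLVM output (a sketch). */
int f_loop(int n)
{
    int acc = 1;                   /* %accumulator.tr starts at 1 */
    for (unsigned i = 0; ; ++i) {  /* i plays the role of the induction variable */
        int n_tr = n - 2 * (int)i; /* %n.tr = %n + i * -2 */
        if (n_tr <= 1)             /* setle int %n.tr, 1 */
            return acc;
        acc += f_loop(n_tr - 1);   /* the single remaining call */
    }
}
```

Both functions should agree for small non-negative n, for example f(4) and f_loop(4) both give 5.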

Anyway, above we have:

  call int %f( int %tmp.5 )

and it is quite a high-level statement, I'd say!

Indeed, LLVM will allocate registers and regular memory for us.
This output says nothing about the allocation strategy.
We do *not* define the allocation strategy, and we are even quite happy with that!

Therefore I would expect that allocating CPUs could also be an internal back-end problem of LLVM, just like allocating registers.

In other words, indirectly, I propose to treat CPUs as the 
same resource class as memory :-)

Chris, LLVMers, could you allocate CPU resources the way you do for registers and memory? :P

Actually, maybe the problem is that the science around 
phi-functions, SSA and so on is not ready yet when we speak about a 
computing machine with several CPUs spread across a network... What would 
the LLVM gurus say here?

--
Valery.