[LLVMdev] avoid live range overlap of "vector" registers

Chris Lattner sabre at nondot.org
Tue May 10 21:02:53 PDT 2005


On Wed, 11 May 2005, Tzu-Chien Chiu wrote:

> On  Tue May 10 2005, Chris Lattner wrote:
>> On Tue, 10 May 2005, Morten Ofstad wrote:
>>> Actually, I think it would be better to define the registers as a machine
>>> value type for packed float x4, and providing some 'extract' and 'inject'
>>> instructions to access individual components... There should also be a
>>> 'shuffle' instruction (corresponding to the SSE PSHUF instruction) to change
>>> the individual components around.
>>
>> You're right, that would be a better way to go.  To start, I would suggest
>> adding extract/inject intrinsics (not instructions) because it is easier.
>> If you're interested in doing this, there is documentation for this here:
>
> quote <http://llvm.cs.uiuc.edu/docs/LangRef.html#intrinsics>:
> "To do this, extend the default implementation of the
> IntrinsicLowering class to handle the intrinsic. Code generators use
> this class to lower intrinsics they do not understand to raw LLVM
> instructions that they do."
>
> but to which LLVM instructions should the extract/inject (or
> shuffle/pack) intrinsics be lowered? LLVM instructions do not allow
> access to the individual scalar values in a packed value.

None: that documentation is out of date and doesn't make much sense 
for your application.  I would suggest implementing this in the 
context of the SelectionDAG framework, which all of the code generators 
either currently use or are moving to.  I updated the documentation here: 
http://llvm.cs.uiuc.edu/ChrisLLVM/docs/ExtendingLLVM.html#intrinsic

This will allow you to do something like this:

%i32v4 = type <4 x uint>

%f32v4 = type <4 x float>

declare %f32v4 %swizzle(%f32v4 %In, %i32v4 %Form)

%G = external global %f32v4

void %test() {
         %A = load %f32v4* %G
         %B = call %f32v4 %swizzle(%f32v4 %A, %i32v4 <uint 1, uint 1, uint 1, uint 1>)   ;; splat XYZW -> YYYY
         store %f32v4 %B, %f32v4* %G
         ret void
}

... except that the intrinsic would be named llvm.swizzle instead of 'swizzle'.
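
As a rough illustration of what such a swizzle would compute, here is a minimal sketch in plain C++ rather than LLVM code. The semantics are an assumption inferred from the "XYZW -> YYYY" comment in the example: lane i of the result is taken from lane Form[i] of the input. The names F32v4, U32v4 and swizzle are hypothetical stand-ins for the types and function above, not an existing LLVM API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the %f32v4 and %i32v4 types in the example.
using F32v4 = std::array<float, 4>;
using U32v4 = std::array<unsigned, 4>;

// Assumed semantics (not confirmed by the original post):
// lane i of the result is lane form[i] of the input.
F32v4 swizzle(const F32v4 &in, const U32v4 &form) {
  F32v4 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    // Each selector must name one of the four input lanes.
    assert(form[i] < in.size());
    out[i] = in[form[i]];
  }
  return out;
}
```

Under this reading, a selector of <1, 1, 1, 1> copies the Y lane into every lane of the result, which matches the splat in the example, and <3, 2, 1, 0> would reverse the lanes.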

Unfortunately the code generator currently does not support packed types, 
so this will require some work.  However, this certainly is the closest 
match for your model.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.cs.uiuc.edu/



