[LLVMdev] LowerPacked pass
Morten Ofstad
morten at hue.no
Fri Nov 19 03:11:25 PST 2004
Chris Lattner wrote:
> Note that packed support in LLVM is not complete yet. In
> particular, here are some of the big missing pieces:
>
> 1. No code generators can generate vector instructions yet (SSE or
>    altivec, for example). This should be fairly easy to add though.
> 2. The lowerpacked pass, which currently converts packed ops into their
>    scalar counterparts, has a few limitations:
>    A. It does not handle packed arguments to functions
>    B. It always lowers all of the way to scalar ops, even if the target
>       supports SOME packed types. For example, it would be nice for it
>       to eventually lower <16 x float> into 4 <4 x float>'s if the
>       target supports them.
>    C. It has never been thoroughly tested, primarily because we don't
>       have a producer of packed operations yet. I believe it should
>       work reasonably well though.
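To make point B concrete, here is a rough sketch in plain C++ (not LLVM code) of the difference between lowering all the way to scalar ops and splitting a wide packed add into 4-wide pieces. The function names are made up for illustration, and addNative4 only stands in for whatever a target with <4 x float> support would do in one instruction.

```cpp
#include <cassert>
#include <cstddef>

// Full scalarization: one scalar add per element, which is what
// lowering "all of the way to scalar ops" amounts to.
void addScalarized(const float *a, const float *b, float *out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

// Hypothetical stand-in for a single add on a natively supported
// <4 x float>; written as a loop here so the sketch runs anywhere.
void addNative4(const float *a, const float *b, float *out) {
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
}

// Partial lowering: split an n-wide add (n a multiple of 4) into
// n/4 adds on 4-wide chunks, e.g. <16 x float> into 4 <4 x float>'s.
void addSplitBy4(const float *a, const float *b, float *out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4)
        addNative4(a + i, b + i, out + i);
}
```

Both forms compute the same result; the split form just leaves 4-wide units intact for a backend that can handle them.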
It works reasonably well, which is quite impressive considering it has
not been tested ;-) B is not much of a problem for my use, but A is a bit
annoying, even though I mostly pass pointers to packed types anyway. Can
you elaborate on what the problem is there? I have calls going back into
our code by adding mappings to the JIT, but I'm not sure if I can get it
to call functions with R32x4 (<4 x float>) args without making a wrapper
that takes a pointer.
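For reference, the pointer-taking wrapper I have in mind looks roughly like the sketch below. The R32x4 layout (four consecutive floats) and the names hostScale and hostScaleByPtr are assumptions for illustration only; registering the wrapper with the JIT is left out.

```cpp
#include <cassert>

// Assumed layout for R32x4: four packed 32-bit floats.
struct R32x4 {
    float v[4];
};

// Hypothetical host-side function that takes the packed value directly.
R32x4 hostScale(R32x4 x, float s) {
    for (int i = 0; i < 4; ++i)
        x.v[i] *= s;
    return x;
}

// Wrapper exposed to generated code: the packed argument and the result
// both travel through pointers, so no packed value crosses the call
// boundary by value.
extern "C" void hostScaleByPtr(const R32x4 *in, float s, R32x4 *out) {
    assert(in != nullptr && out != nullptr);
    *out = hostScale(*in, s);
}
```

The generated code would store the packed value to memory, pass its address, and load the result back afterwards, which costs a round trip through memory on every call.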
> For your work, it might be most expedient to just ignore the lower packed
> pass and add SSE support to the X86 backend: that will get you up and
> running quickly and get you the performance you are obviously after. If
> backwards compatibility with old hardware is an issue, revisiting the
> lower packed pass would make sense.
Is it easy to add intrinsics to do things like dot product of packed
types using SSE instructions? That's probably all I need...
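As an illustration of what such a dot product would compute, here is a small C++ sketch using the compiler's SSE intrinsics from <xmmintrin.h>, with a scalar fallback when SSE is not available. This is not an LLVM intrinsic, only the operation I would want one to map to; the name dot4 is made up.

```cpp
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Dot product of two 4-float vectors. Unaligned loads are used so any
// float[4] can be passed in.
float dot4(const float *a, const float *b) {
#if defined(__SSE__)
    // Element-wise multiply: [p0, p1, p2, p3].
    __m128 prod = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    // Horizontal sum, step 1: add the high pair onto the low pair.
    __m128 hi = _mm_movehl_ps(prod, prod);   // [p2, p3, p2, p3]
    __m128 sum2 = _mm_add_ps(prod, hi);      // [p0+p2, p1+p3, ...]
    // Step 2: add lane 1 onto lane 0.
    __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    __m128 total = _mm_add_ss(sum2, lane1);
    return _mm_cvtss_f32(total);
#else
    // Scalar fallback, equivalent to what full lowering would produce.
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
#endif
}
```

The SSE path is one multiply followed by two shuffle-and-add steps; the summation order differs from the scalar fallback, so results can differ in the last bits for inputs that do not add exactly.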
> Let me know what you think. In the very short term, the hook exposed to
> create the lower packed pass can be plunked into the X86TargetMachine and
> get intra function packed types working for you.
The patch you did was missing the actual implementation of
createLowerPackedPass, so I'm including my own diffs. I guess you don't
want to apply the changes to X86TargetMachine, as I'm the only one
actually generating packed types, but I include them for completeness.
m.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lowerpacked.patch.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20041119/5fab9aff/attachment.txt>